       DISCRETE
         AND
    COMBINATORIAL
     MATHEMATICS
        An Applied Introduction
                    FIFTH EDITION

RALPH P. GRIMALDI
       Rose-Hulman Institute of Technology

                    PEARSON

Boston San Francisco New York
      London Toronto Sydney Tokyo Singapore Madrid
Mexico City Munich Paris Cape Town Hong Kong Montreal
Publisher:                                                 Greg Tobin
Senior Acquisitions Editor:                                William Hoffman
Assistant Editor:                                          RoseAnne Johnson
Executive Marketing Manager:                               Yolanda Cossio
Senior Marketing Manager:                                   Pamela Laskey
Marketing Assistant:                                       Heather Peck
Managing Editor:                                           Karen Guardino
Senior Production Supervisor:                              Peggy McMahon
Senior Manufacturing Buyer:                                Hugh Crawford
Composition and Technical Art Rendering:                   Techsetters, Inc.
Production Services:                                       Barbara Pendergast
Design Supervisor:                                         Barbara T. Atkinson
Cover Designer:                                            Dennis Schaefer
Photo Research and Design Specifications:                  Beth Anderson
Cover Illustration:                                        George V. Kelvin

Photographs of Blaise Pascal, Aristotle, Lord Bertrand Arthur William Russell, Euclid, Au-
gusta Ada Byron (Countess of Lovelace), Gottfried Wilhelm Leibniz, Carl Friedrich Gauss,
Leonhard Euler, Arthur Cayley, Pierre de Fermat, Niels Henrik Abel, and Evariste Galois
are reproduced courtesy of the Bettman Archive (Corbis). Photographs of George Boole,
Peter Gustav Lejeune Dirichlet, David Hilbert, Giuseppe Peano, James Joseph Sylvester,
Sophie Germain, and Emmy Noether are reproduced courtesy of Historical Pictures/Stock
Montage. The photograph of Claude Elwood Shannon is reproduced courtesy of the MIT
Museum. The photograph of Edsger W. Dijkstra is reproduced courtesy of the University of
Texas at Austin. The photographs of Andrew John Wiles and Rear Admiral Grace Murray
Hopper are reproduced courtesy of AP/Wide World. The photographs of Georg Cantor,
Alan Mathison Turing, William Rowan Hamilton, and Leonardo Fibonacci are reproduced
courtesy of The Granger Collection. The photograph of Paul Erdős is reproduced courtesy
of Christopher Barker. The photographs of Andrei Nikolayevich Kolmogorov, Thomas
Bayes, and Al-Khowarizmi are reproduced courtesy of the St. Andrews University Mac-
Tutor Archive. The photograph of David A. Huffman is reproduced courtesy of Manuel
Enrique Bermudez of the Department of Computer and Information Science and Engi-
neering at the University of Florida. The photograph of Joseph P. Kruskal is reproduced
courtesy of Leiden University.

Library of Congress Cataloging-in-Publication Data

Grimaldi, Ralph P.
  A review of discrete and combinatorial mathematics / by Ralph P. Grimaldi.--5th ed.
     p. cm.
  Includes index.
  Rev. ed. of: Discrete and combinatorial mathematics, c1999.
  ISBN 0-201-72634-3
   1. Mathematics. 2. Computer science--Mathematics. 3. Combinatorial analysis. I.
  Grimaldi, Ralph P. Discrete and combinatorial mathematics. II. Title.

QA39.2.G748 2003
  510--dc21
                                                                               2002038383

ISBN 0-201-72634-3

permission of the publisher. Printed in the United States of America.

NOTATION
      LOGIC   $p, q$                    statements (or propositions)
              $\neg p$                  the negation of (statement) p: not p
              $p \wedge q$              the conjunction of p, q: p and q
              $p \vee q$                the disjunction of p, q: p or q
              $p \rightarrow q$         the implication of q by p: p implies q
              $p \leftrightarrow q$     the biconditional of p and q: p if and only if q
              iff                       if and only if
              $p \Rightarrow q$         logical implication: p logically implies q
              $p \Leftrightarrow q$     logical equivalence: p is logically equivalent to q
              $T_0$                     tautology
              $F_0$                     contradiction
              $\forall x$               For all x (the universal quantifier)
              $\exists x$               For some x (the existential quantifier)

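SET THEORY    $x \in A$                          element x is a member of set A
              $x \notin A$                       element x is not a member of set A
              $\mathcal{U}$                      the universal set
              $A \subseteq B$, $B \supseteq A$   A is a subset of B
              $A \subset B$, $B \supset A$       A is a proper subset of B
              $A \not\subseteq B$                A is not a subset of B
              $A \not\subset B$                  A is not a proper subset of B
              $|A|$                              the cardinality, or size, of set A (that is, the number of elements in A)
              $\emptyset = \{\,\}$               the empty, or null, set
              $\mathcal{P}(A)$                   the power set of A (that is, the collection of all subsets of A)
              $A \cap B$                         the intersection of sets A, B: $\{x \mid x \in A \text{ and } x \in B\}$
              $A \cup B$                         the union of sets A, B: $\{x \mid x \in A \text{ or } x \in B\}$
              $A \triangle B$                    the symmetric difference of sets A, B:
                                                    $\{x \mid x \in A \text{ or } x \in B, \text{ but } x \notin A \cap B\}$
              $\overline{A}$                     the complement of set A: $\{x \mid x \in \mathcal{U} \text{ and } x \notin A\}$
              $A - B$                            the (relative) complement of set B in set A: $\{x \mid x \in A \text{ and } x \notin B\}$
              $\bigcup_{i \in I} A_i$            $\{x \mid x \in A_i, \text{ for at least one } i \in I\}$, where I is an index set
              $\bigcap_{i \in I} A_i$            $\{x \mid x \in A_i, \text{ for every } i \in I\}$, where I is an index set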

PROBABILITY   $\mathcal{S}$                      the sample space for an experiment $\mathcal{E}$
              $A \subseteq \mathcal{S}$          A is an event
              $Pr(A)$                            the probability of event A
              $Pr(A \mid B)$                     the probability of A given B; conditional probability
              $X$                                random variable
              $E(X)$                             the expected value of X, a random variable
              $\mathrm{Var}(X)$                  the variance of X, a random variable
              $\sigma_X$                         the standard deviation of X, a random variable

NUMBERS       $a \mid b$                         a divides b, for $a, b \in \mathbf{Z}$, $a \neq 0$
              $a \nmid b$                        a does not divide b, for $a, b \in \mathbf{Z}$, $a \neq 0$
              $\gcd(a, b)$                       the greatest common divisor of the integers a, b
              $\mathrm{lcm}(a, b)$               the least common multiple of the integers a, b
              $\phi(n)$                          Euler's phi function for $n \in \mathbf{Z}^+$
              $\lfloor x \rfloor$                the greatest integer less than or equal to the real number x:
                                                    the greatest integer in x: the floor of x
              $\lceil x \rceil$                  the smallest integer greater than or equal to the real number x:
                                                    the ceiling of x
              $a \equiv b \pmod{n}$              a is congruent to b modulo n

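RELATIONS     $A \times B$                       the Cartesian, or cross, product of sets A, B:
                                                    $\{(a, b) \mid a \in A, b \in B\}$
              $\mathcal{R} \subseteq A \times B$ $\mathcal{R}$ is a relation from A to B
              $a \mathcal{R} b$; $(a, b) \in \mathcal{R}$          a is related to b
              $a \not\mathcal{R} b$; $(a, b) \notin \mathcal{R}$   a is not related to b
              $\mathcal{R}^c$                    the converse of relation $\mathcal{R}$: $(a, b) \in \mathcal{R}$ iff $(b, a) \in \mathcal{R}^c$
              $\mathcal{R} \circ \mathcal{S}$    the composite relation for $\mathcal{R} \subseteq A \times B$, $\mathcal{S} \subseteq B \times C$:
                                                    $(a, c) \in \mathcal{R} \circ \mathcal{S}$ if $(a, b) \in \mathcal{R}$, $(b, c) \in \mathcal{S}$ for some $b \in B$
              lub{a, b}                          the least upper bound of a and b
              glb{a, b}                          the greatest lower bound of a and b
              $[a]$                              the equivalence class of element a (relative to an
                                                    equivalence relation $\mathcal{R}$ on a set A): $\{x \in A \mid x \mathcal{R} a\}$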

FUNCTIONS     $f: A \to B$                       f is a function from A to B
              $f(A_1)$                           for $f: A \to B$ and $A_1 \subseteq A$, $f(A_1)$ is the image of $A_1$
                                                    under f (that is, $\{f(a) \mid a \in A_1\}$)
              $f(A)$                             for $f: A \to B$, $f(A)$ is the range of f
              $f: A \times A \to B$              f is a binary operation on A
              $f: A \times A \to B\ (\subseteq A)$   f is a closed binary operation on A
              $1_A: A \to A$                     the identity function on A: $1_A(a) = a$ for each $a \in A$
              $f|_{A_1}$                         the restriction of $f: A \to B$ to $A_1 \subseteq A$
              $g \circ f$                        the composite function for $f: A \to B$, $g: B \to C$:
                                                    $(g \circ f)(a) = g(f(a))$, for $a \in A$
              $f^{-1}$                           the inverse of function f
              $f^{-1}(B_1)$                      the preimage of $B_1 \subseteq B$ for $f: A \to B$
              $f \in O(g)$                       f is "big Oh" of g; f is of order g

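THE ALGEBRA   $\Sigma$                           a finite set of symbols called an alphabet
  OF STRINGS  $\lambda$                          the empty string
              $\|x\|$                            the length of string x
              $\Sigma^n$                         $\{x_1 x_2 \cdots x_n \mid x_i \in \Sigma\}$, $n \in \mathbf{Z}^+$
              $\Sigma^0$                         $\{\lambda\}$
              $\Sigma^+$                         $\bigcup_{n \in \mathbf{Z}^+} \Sigma^n$: the set of all strings of positive length
              $\Sigma^*$                         $\bigcup_{n \geq 0} \Sigma^n$: the set of all finite strings
              $A \subseteq \Sigma^*$             A is a language
              $AB$                               the concatenation of languages $A, B \subseteq \Sigma^*$:
                                                    $\{ab \mid a \in A, b \in B\}$
              $A^n$                              $\{a_1 a_2 \cdots a_n \mid a_i \in A \subseteq \Sigma^*\}$, $n \in \mathbf{Z}^+$
              $A^0$                              $\{\lambda\}$
              $A^+$                              $\bigcup_{n \in \mathbf{Z}^+} A^n$
              $A^*$                              $\bigcup_{n \geq 0} A^n$: the Kleene closure of language A
              $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$   a finite state machine M with internal states S, input
                                                    alphabet $\mathcal{I}$, output alphabet $\mathcal{O}$, next state function
                                                    $\nu: S \times \mathcal{I} \to S$ and output function $\omega: S \times \mathcal{I} \to \mathcal{O}$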
Preface

It has been more than twenty years since September 2, 1982, when I signed the contract
      to develop what turned into the first edition of this present textbook. At that time the
    idea of further editions never crossed my mind. Consequently, I continue to find myself
    simultaneously very humbled and very pleased with the way this textbook has been received
    by so many instructors and especially students. The first four editions of this textbook have
    found their way into many colleges and universities here in the United States. They have
    also been used in other nations such as Australia, Canada, England, Ireland, Japan, Mexico,
    the Netherlands, Scotland, Singapore, South Africa, and Sweden. I can only hope that this
    fifth edition will continue to enlighten and challenge all those who wish to learn about some
    of the many facets of the fascinating area of mathematics called discrete mathematics.
        The technological advances of the last four decades have resulted in many changes
    in the undergraduate curriculum. These changes have fostered the development of many
    single-semester and multiple-semester courses where some of the following are introduced:

1. Discrete methods that stress the finite nature inherent in many problems and structures;
       2. Combinatorics — the algebra of enumeration, or counting, with its fascinating inter-
       relations with so many finite structures;
       3. Graph theory with its applications and interrelations with areas such as data structures
       and methods of optimization; and
       4. Finite algebraic structures that arise in conjunction with disciplines such as coding
       theory, methods of enumeration, gating networks, and combinatorial designs.

A primary reason for studying the material in any or all of these four major topics is the
    abundance of applications one finds in the study of computer science, especially in the
    areas of data structures, the theory of computer languages, and the analysis of algorithms.
    In addition, there are also applications in engineering and the physical and life sciences, as
    well as in statistics and the social sciences. Consequently, the subject matter of discrete and
    combinatorial mathematics provides valuable material for students in many majors, not
    just for those majoring in mathematics or computer science.
       The major purpose of this new edition is to continue to provide an introductory survey
   in both discrete and combinatorial mathematics. The coverage is intended for the beginning
    student, so there are a great number of examples with detailed explanations. (The examples
    are numbered separately and a thick line is used to denote the end of each example.) In
    addition, wherever proofs are given, they too are presented with sufficient detail (with the
    novice in mind).

The text strives to accomplish the following objectives:
            1. To introduce the student at the sophomore-junior level, if not earlier, to the topics and
            techniques of discrete methods and combinatorial reasoning. Problems in counting, or
            enumeration, require a careful analysis of structure (for example, whether or not order
            and repetition are relevant) and logical possibilities. There may even be a question of
            existence for some situations. Following such a careful analysis, we often find that the
            solution of a problem requires simple techniques for counting the possible outcomes that
            evolve from the breakdown of the given problem into smaller subproblems.
            2. To introduce a wide variety of applications. In this regard, whenever data structures
            (from computer science) or structures from abstract algebra are required, only the basic
            theory needed for the application is developed. Furthermore, the solutions of some ap-
            plications lend themselves to iterative procedures that lead to specific algorithms. The
            algorithmic approach to the solution of problems is fundamental in discrete mathemat-
            ics, and this approach reinforces the close ties between this discipline and the area of
            computer science.
            3. To develop the mathematical maturity of the student through the study of an area that
            is so different from the traditional coverage in calculus and differential equations. Here,
            for example, there is the opportunity to establish results by counting a certain collection
             of objects in more than one way. This provides what are called combinatorial identities;
             it also introduces a novel proof technique. In this edition the nature of proof, along with
             what constitutes a valid argument, is developed in Chapter 2, in conjunction with the
             laws of logic and rules of inference. The coverage is extensive, keeping the student
             (with minimal background) in mind. [For the reader with a logic course (or something
             comparable) in his or her background, this material can be skipped over with little or
             no difficulty.] Proofs by mathematical induction (along with recursive definitions) are
             introduced in Chapter 4 and then used throughout the subsequent chapters.
                 With regard to theorems and their proofs, in many instances an attempt has been made
             to motivate theorems from observations on specific examples. In addition, whenever a
             finite situation provides a result that is not true for the infinite case, this situation is
             singled out for attention. Proofs that are extremely long and/or rather special in nature
             are omitted. However, for the very small number of proofs that are omitted, references are
             supplied for the reader interested in seeing the validation of these results. (The amount
             of emphasis placed on proofs will depend on the goals of the individual instructor and
             on those of his or her student audience.)
             4. To present an adequate survey of topics for the computer science student who will be
             taking more advanced courses in areas such as data structures, the theory of computer
             languages, and the analysis of algorithms. The coverage here on groups, rings, fields,
             and Boolean algebras will also provide an applied introduction for mathematics majors
             who wish to continue their study of abstract algebra.
              The prerequisites for using this book are primarily a sound background in high school
          mathematics and an interest in attacking and solving a variety of problems. No particular
          programming ability is assumed. Program segments and procedures are given in pseudo-
          code, and these are designed and explained in order to reinforce particular examples. With
          regard to calculus, we shall mention later in this preface its extent in Chapters 9 and 10.
              My primary motivation for writing the first four editions of this book has been the en-
          couragement I had received over the years from my students and colleagues, as well as from
          the students and instructors who used the first four editions of the textbook at many different
          colleges and universities. Those four editions reflected both my interests and concerns and
those of my students, as well as the recommendations of the Committee on the Undergraduate
    Program in Mathematics and of the Association for Computing Machinery. This fifth
    edition continues along the same lines, reflecting the suggestions and recommendations
    made by the instructors and especially the students who have used or are using the fourth
    edition.

Features

Following are brief descriptions of some of the major features of this newest edition. These
    are designed to assist the reader (student or otherwise) in learning the fundamentals of
    discrete and combinatorial mathematics.

Emphasis on algorithms and applications. Algorithms and applications in many areas
       are presented throughout the text. For example:
       1. Chapter 1 includes several instances where the introductory topics on enumeration
       are needed — one example, in particular, addresses the issue of over-counting.
       2. Section 7 of Chapter 5 provides an introduction to computational complexity. This
       material is then used in Section 8 of this chapter in order to analyze the running times of
       some elementary pseudocode procedures.
       3. The material in Chapter 6 covers languages and finite state machines. This introduces
       the reader to an important area in computer science — the theory of computer languages.
       4. Chapters 7 and 12 include discussions on the applications and algorithms dealing with
       topological sorting and the searching techniques known as the depth-first search and the
       breadth-first search.
       5. In Chapter 10 we find the topic of recurrence relations. The coverage here includes ap-
       plications on (a) the bubble sort, (b) binary search, (c) the Fibonacci numbers,
       (d) the Koch snowflake, (e) Hasse diagrams, (f) the data structure called the stack,
       (g) binary trees, and (h) tilings.
       6. Chapter 16 introduces the fundamental properties of the algebraic structure called
       the group. The coverage here shows how this structure is used in the study of algebraic
       coding theory and in counting problems that require Polya’s method of enumeration.
       Detailed explanations. Whether it is an example or the proof of a theorem, explana-
       tions are designed to be careful and thorough. The presentation is primarily focused on
       improving understanding on the part of the reader who is seeing this type of material for
       the first time.
       Exercises. The role of the exercises in any mathematics text is a crucial one. The amount
       of time spent on the exercises greatly influences the pace of the course. Depending on
       the interest and mathematical background of the student audience, an instructor should
       find that the class time spent on discussing exercises will vary.
           There are over 1900 exercises in the 17 chapters. Those that appear at the end of each
       section generally follow the order in which the section material is developed. These
       exercises are designed to (a) review the basic concepts in the section; (b) tie together
       ideas presented in earlier sections of the chapter; and (c) introduce additional concepts
       that are related to the material in the section. Some exercises call for the development
       of an algorithm, or the writing of a computer program, often to solve a certain instance
       of a general problem. These usually require only a minimal amount of programming
       experience.

Each chapter concludes with a set of supplementary exercises. These provide further
                          review of the ideas presented in the chapter, and also use material developed in earlier
                          chapters.
                             Solutions are provided at the back of the text for almost all parts of all the odd-
                          numbered exercises.
                          Chapter summaries. The last numbered section in each chapter provides a summary
                          and historical review of the major ideas covered in that chapter. This is intended to give
                          the reader an overview of the contents of the chapter and provide information for further
                          study and applications. Such further study can be readily assisted by the list of references
                          that is supplied.
                              In particular, the summaries at the ends of Chapters 1, 5, and 9 include tables on the
                          enumeration formulas developed within each of these chapters. Sometimes these tables
                          include results from earlier chapters in order to make comparisons and to show how the
                          new results extend the prior ones.

Organization
                       The areas of discrete and combinatorial mathematics are somewhat new to the undergraduate
                       curriculum, so there are several options as to which topics should be covered in these courses.
                       Each instructor and each student may have different interests. Consequently, the coverage
                       here is fairly broad, as a survey course mandates. Yet there will always be further topics that
                       some readers may feel should be included. Furthermore, there will also be some differences
                       of opinion with regard to the order in which some topics are presented in this text.
                           The nature and importance of the algorithmic approach to problem solving is stressed
                       throughout the text. Ideas and approaches on problem solving are further strengthened by
                       the interrelations between enumeration and structure, two other major topics that provide
                       unifying threads for the material developed in the book.
                           The material is subdivided into four major areas. The first seven chapters form the
                       underlying core of the book and present the fundamentals of discrete mathematics. The
                       coverage here provides enough material for a one-quarter or one-semester course in discrete
                       mathematics. The material in Chapter 2 can be reviewed by those with a background in logic.
                       For those interested in developing and writing proofs, this material should be examined
                       very carefully. A second course, one that emphasizes combinatorics, should include
                       Chapters 8, 9, and 10 (and, time permitting, sections 1, 2, 3, 10, 11, and 12 of Chapter 16). In
                       Chapter 9 some results from calculus are used; namely, fundamentals on differentiation and
                       partial fraction decompositions. However, for those who wish to skip this chapter, sections
                       1, 2, 3, 6, and 7 of Chapter 10 can still be covered. A course that emphasizes the theory and
                       applications of finite graphs can be developed from Chapters 11, 12, and 13. These chapters
                       form the third major subdivision of the text. For a course in applied algebra, Chapters
                       14, 15, 16, and 17 (the fourth, and final, subdivision) deal with the algebraic structures
                       (group, ring, Boolean algebra, and field) and include applications on cryptology, switching
                       functions, algebraic coding theory, and combinatorial designs. Finally, a course on the role
                       of discrete structures in computer science can be developed from the material in Chapters
                       11, 12, 13, 15, and sections 1-9 of Chapter 16. For here we find applications on switching
                       functions, the RSA cryptosystem, and algebraic coding theory, as well as an introduction
                       to graph theory and trees, and their role in optimization.
                           Other possible courses can be developed by considering the following chapter depen-
                       dencies.

                    Chapter     Dependence on Prior Chapters
                     1          No dependence
                     2          No dependence (Hence an instructor can start a course in discrete
                                mathematics with either the study of logic or an introduction to enumeration.)
                     3          1, 2
                     4          1, 2, 3
                     5          1, 2, 3, 4
                     6          1, 2, 3, 5 (Minor dependence in Section 6.1 on Sections 4.1, 4.2)
                     7          1, 2, 3, 5, 6 (Minor dependence in Section 7.2 on Sections 4.1, 4.2)
                     8          1, 3 (Minor dependence in Example 8.6 on Section 5.3)
                     9          1, 3
                    10          1, 3, 4, 5, 9 (Minor dependence in Example 10.33 on Section 7.3)
                    11          1, 2, 3, 4, 5 (Although some graph-theoretic ideas are mentioned in Chapters
                                5, 6, 7, 8, and 10, the material in this chapter is developed with no dependence
                                on the graph-theoretic material given in these earlier results.)
                    12          1, 2, 3, 4, 5, 11
                    13          3, 5, 11, 12
                    14          2, 3, 4, 5, 7 (The Euler phi function ($\phi$) is used in Section 14.3. This function
                                is derived in Example 8.8 of Section 8.1 but the result can be used here in
                                Chapter 14 without covering Chapter 8.)
                    15          2, 3, 5, 7
                    16          1, 2, 3, 4, 5, 7
                    17          2, 3, 4, 5, 7, 14
                 In addition, the index has been very carefully developed in order to make the text even
              more flexible. Terms are presented with primary listings and several secondary listings.
              Also there is a great deal of cross referencing. This is designed to help the instructor who
              may want to change the order of presentation and deviate from the straight and narrow.

Changes in the Fifth Edition
              The changes here in the fifth edition of Discrete and Combinatorial Mathematics reflect
              the observations and recommendations of students and instructors who have used earlier
              editions of the text. As with the first four editions, the tone and purpose of the text remain
              intact. The author’s goal is still the same: to provide within these pages a sound, readable,
              and understandable introduction to the foundations of discrete and combinatorial mathe-
              matics — for the beginning student or reader. Among the changes one will find in this fifth
              edition we mention the following:
                • The examples in Section 4 of Chapter 1 now include material on runs, a concept that
                arises in the study of statistics — in particular, in the area of quality control.
                • Exercise 13 for Section 3 of Chapter 2 develops the rule of inference known as
                resolution, a rule that serves as the basis for many computer programs designed to automate
                a reasoning system.
                • The earlier editions of this text included a section that introduced the notion of
                probability. This section has now been expanded and three additional optional sections have
                been added for those who wish to further examine some of the introductory ideas as-
                sociated with discrete probability — in particular, the axioms of probability, conditional
                 probability, independence, Bayes’ Theorem, and discrete random variables.
                  • The coverage on partial orders and total orders in Section 3 of Chapter 7 now includes
                  an optional example where the Catalan numbers arise in this context.
                  • The introductory material in Section 1 of Chapter 8 has been rewritten to provide
                  a more readable transition between the coverage on counting and Venn diagrams in Sec-
                  tion 3 of Chapter 3 and the more general technique known as the Principle of Inclusion
                  and Exclusion.
                  • One of the fascinating features of discrete and combinatorial mathematics is the variety
                  of ways a given problem can be solved. In the fourth edition (in Chapters 1 and 3)
                  the reader learned, in two different contexts, that a positive integer n had $2^{n-1}$
                  compositions; that is, there are $2^{n-1}$ ways to write n as an ordered sum of positive-integer
                  summands. This result is now established in three other ways: (i) by the Principle of
                  Mathematical Induction in Chapter 4; (ii) using generating functions in Chapter 9; and
                  (iii) by solving a recurrence relation in Chapter 10.
                  • For those who want even more on discrete probability, Section 2 of Chapter 9 includes
                  an example that deals with the geometric random variable.
                  • Section 2 of Chapter 10 now includes a discussion of the work by Gabriel Lamé in
                  estimating the number of divisions used in the Euclidean algorithm to find the greatest
                  common divisor of two positive integers.
                  ¢ The Master theorem (of importance in the analysis of algorithms) is introduced and
                  developed in an exercise for Section 6 of Chapter 10.
• The material on transport networks (in Section 3 of Chapter 13) has been updated and
  now incorporates the Edmonds-Karp algorithm in the procedure originally developed by
  Lester Ford and Delbert Fulkerson.
• The coverage on modular arithmetic in Section 3 of Chapter 14 now includes applications
  dealing with the linear congruential pseudorandom number generator, private-key
  cryptosystems, and modular exponentiation. Further, in Section 4 of Chapter 14, the
  material dealing with the Chinese Remainder Theorem, which was only stated in previous
  editions, now includes a proof of this result as well as an example dealing with how it is
  applied.
• Section 4 of Chapter 16 is new and optional. The material here provides an introduction
  to the RSA public-key cryptosystem and shows how one can apply some of the theoretical
  results developed in prior sections of the text.
• As with the second, third, and fourth editions, a great deal of effort has been applied
  in updating the summary and historical review at the end of each chapter. Consequently,
  new references and/or new editions are provided where appropriate.
• For this fifth edition, the following pictures and photographs have been added to the
  summary and historical review of certain chapters: a picture of Thomas Bayes and a
  photograph of Andrei Nikolayevich Kolmogorov in Chapter 3; a picture of Al-Khowârizmi
  in Chapter 4; a photograph of David A. Huffman in Chapter 12; and a photograph of
  Joseph B. Kruskal in Chapter 13.

Ancillaries

• There is an Instructor’s Solutions Manual that is available, from the publisher, for
  those instructors who adopt the textbook for their classes. It contains the solutions and/or
  answers for all of the exercises within the 17 chapters and the three appendices of this
  textbook.

• There is also a Student’s Solutions Manual that is available separately. It contains the
  solutions and/or answers for all of the odd-numbered exercises in the textbook. In some
  cases more than one solution is presented.
• The following Web site provides additional resources for learning more about discrete
  and combinatorial mathematics. It also provides a way for readers to contact
  the author with comments, suggestions, or possible errors they have found.

www.aw.com/grimaldi

Acknowledgments
        If space permitted, I should like to mention each of the students who provided help and
        encouragement when I was writing the five editions of this book. Their suggestions helped
        to remove many mistakes and ambiguities, thus improving the exposition. Most helpful
        in this category were Paul Griffith, Meredith Vannauker, Paul Barloon, Byron Bishop,
        Lee Beckham, Brett Hunsaker, Tom Vanderlaan, Michael Bryan, John Breitenbach, Dan
        Johnson, Brian Wilson, Allen Schneider, John Dowell, Charles Wilson, Richard Nichols,
        Charles Brads, Jonathan Atkins, Kenneth Schmidt, Donald Stanton, Mark Stremler, Stephen
        Smalley, Anthony Hinrichs, Kevin O’Bryant, and Nathan Terpstra.
           I thank Larry Alldredge, Claude Anderson, David Rader, Matt Hopkins, John Rickert, and
        Martin Rivers for their comments on the computer science material, and Barry Farbrother,
        Paul Hogan, Dennis Lewis, Charles Kyker, Keith Hoover, Matthew Saltzman, and Jerome
        Wagner for their enlightening remarks on some of the applications.
            I gratefully acknowledge the persistent enthusiasm and encouragement of the staff at
        Addison-Wesley (both past and present), especially Wayne Yuhasz, Thomas Taylor, Michael
        Payne, Charles Glaser, Mary Crittendon, Herb Merritt, Maria Szmauz, Adeline Ruggles,
        Stephanie Botvin, Jack Casteel, Jennifer Wall, Joanne Sousa Foster, Karen Guardino, Peggy
        McMahon, Deborah Schneider, Laurie Rosatone, Carolyn Lee-Davis, and Jennifer
        Albanese. William Hoffman, and especially RoseAnne Johnson and Barbara Pendergast,
        deserve the most recognition for their outstanding contributions to this fifth edition.
        The efforts of Steven Finch in proofreading the text, and those of Paul Lorczak in
        checking the accuracy of the answers to the exercises, are also greatly appreciated.
            I am also indebted to my colleagues John Kinney, Robert Lopez, Allen Broughton,
        Gary Sherman, George Berzsenyi, and especially Alfred Schmidt, for their interest and
        encouragement throughout the writing of this and/or earlier editions.
           Thanks and appreciation are due the following reviewers of the first, second, third, fourth,
        and/or fifth editions.
           Norma E. Abel                     Digital Equipment Corporation
           Larry Alldredge                   Qualcomm, Inc.
           Charles Anderson                  University of Colorado, Denver
           Claude W. Anderson III            Rose-Hulman Institute of Technology
           David Arnold                      Baylor University
           V. K. Balakrishnan                University of Maine at Orono
           Robert Barnhill                   University of Utah
           Dale Bedgood                      East Texas State University
           Jerry Beehler                     Tri-State University
           Katalin Bencsath                  Manhattan College
           Allan Bishop                      Western Illinois University
           Monte Boisen                      Virginia Polytechnic Institute

Samuel Councilman      California State University at Long Beach
                Robert Crawford        Western Kentucky University
                Ellen Cunningham, SP   Saint Mary-of-the-Woods College
                Carl DeVito            Naval Postgraduate School
                Vladimir Drobot        San Jose State University
                John Dye               California State University at Northridge
                Carl Eckberg           San Diego State University
                Michael Falk           Northern Arizona University
                Marvin Freedman        Boston University
                Robert Geitz           Oberlin College
                James A. Glasenapp     Rochester Institute of Technology
                Gary Gordon            Lafayette College
                Harvey Greenberg       University of Colorado, Denver
                Laxmi Gupta            Rochester Institute of Technology
                Eleanor O. Hare        Clemson University
                James Harper           Central Washington University
                David S. Hart          Rochester Institute of Technology
                Maryann Hastings       Marymount College
                W. Mack Hill           Worcester State College
                Stephen Hirtle         University of Pittsburgh
                Arthur Hobbs           Texas A&M University
                Dean Hoffman           Auburn University
                Richard Iltis          Willamette University
                David P. Jacobs        Clemson University
                Robert Jajcay          Indiana State University
                Akihiro Kanamori       Boston University
                John Konvalina         University of Nebraska at Omaha
                Rochelle Leibowitz     Wheaton College
                James T. Lewis         University of Rhode Island
                Y-Hsin Liu             University of Nebraska at Omaha
                Joseph Malkevitch      York College (CUNY)
                Brian Martensen        The University of Texas at Austin
                Hugh Montgomery        University of Michigan
                Thomas Morley          Georgia Institute of Technology
                Richard Orr            Rochester Institute of Technology
                Edwin P. Oxford        Baylor University
                John Rausen            New Jersey Institute of Technology
                Martin Rivers          Lexmark International, Inc.
                Gabriel Robins         University of Virginia
                Chris Rodger           Auburn University
                James H. Schmerl       University of Connecticut
                Paul S. Schnare        Eastern Kentucky University
                Leo Schneider          John Carroll University
                Debra Diny Scott       University of Wisconsin at Green Bay
                Gary E. Stevens        Hartwick College
                Dalton Tarwater        Texas Tech University
                Jeff Tecosky-Feldman   Harvard University
                W. L. Terwilliger      Bowling Green State University
                Donald Thompson        Pepperdine University

Thomas Upson            Rochester Institute of Technology
   W. D. Wallis            Southern Illinois University
   Larry West              Virginia Commonwealth University
   Yixin Zhang             University of Nebraska at Omaha
Special thanks are due to Douglas Shier of Clemson University for the outstanding work
he did in reviewing the manuscripts of all five editions. Thanks are also due to Joan Shier
for letting Doug review the fourth and fifth editions.
   The translation for the dedication is due to Dr. Yvonne Panaro of Northern Virginia
Community College. Thank you, Yvonne, and thank you, Patter (Patricia Wickes Thurston),
for your role in obtaining the translation.
    A text of this length requires the use of many references. The members of the library staff
of Rose-Hulman Institute of Technology were always available when books and articles
were needed, so it is only fitting to express one’s appreciation for the efforts of John Robson,
Sondra Nelson, Dong Chao, Jan Jerrell, and especially Amy Harshbarger and Margaret Ying.
In addition, Keith Hoover and Raymond Bland are thanked for rescuing the author from
the perils of many hardware problems.
    The last, and surely the most important, note of thanks belongs once again to the
ever-patient and encouraging now-retired secretary of the Rose-Hulman mathematics
department, Mrs. Mary Lou McCullough. Thank you for the fifth time, Mary Lou, for all of
your work!
    Alas, the remaining errors, ambiguities, and misleading comments are once again the
sole responsibility of the author.

R.P.G.
                                                                          Terre Haute, Indiana
           Contents

PART 1
Fundamentals of Discrete Mathematics           1

1     Fundamental Principles of Counting             3
                    1.1   The Rules of Sum and Product    3
                    1.2   Permutations   6
                    1.3   Combinations: The Binomial Theorem        14
                    1.4   Combinations with Repetition   26
                    1.5   The Catalan Numbers (Optional)    36
                    1.6   Summary and Historical Review     41

2     Fundamentals of Logic        47
                    2.1   Basic Connectives and Truth Tables     47
                    2.2   Logical Equivalence: The Laws of Logic      55
                    2.3   Logical Implication: Rules of Inference   67
                    2.4   The Use of Quantifiers    86
                    2.5   Quantifiers, Definitions, and the Proofs of Theorems         103
                    2.6   Summary and Historical Review       117

3     Set Theory     123
                    3.1   Sets and Subsets    123
                    3.2   Set Operations and the Laws of Set Theory    136
                    3.3   Counting and Venn Diagrams      148
                    3.4   A First Word on Probability   150
                    3.5   The Axioms of Probability (Optional)    157
                    3.6   Conditional Probability: Independence (Optional)       166
                    3.7   Discrete Random Variables (Optional)     175
                    3.8   Summary and Historical Review       186


4     Properties of the Integers: Mathematical Induction                         193
                          4.1   The Well-Ordering Principle: Mathematical Induction  193
                          4.2   Recursive Definitions  210
                          4.3   The Division Algorithm: Prime Numbers     221
                          4.4   The Greatest Common Divisor: The Euclidean Algorithm     231
                          4.5   The Fundamental Theorem of Arithmetic     237
                          4.6   Summary and Historical Review     242

5     Relations and Functions          247
                          5.1   Cartesian Products and Relations   248
                          5.2   Functions: Plain and One-to-One    252
                          5.3   Onto Functions: Stirling Numbers of the Second Kind                  260
                          5.4   Special Functions   267
                          5.5   The Pigeonhole Principle    273
                          5.6   Function Composition and Inverse Functions    278
                          5.7   Computational Complexity      289
                          5.8   Analysis of Algorithms    294
                          5.9   Summary and Historical Review     302

6     Languages: Finite State Machines               309
                          6.1   Language: The Set Theory of Strings          309
                          6.2   Finite State Machines: A First Encounter  319
                          6.3   Finite State Machines: A Second Encounter     326
                          6.4   Summary and Historical Review       332

7     Relations: The Second Time Around                  337
                          7.1   Relations Revisited: Properties of Relations             337
                          7.2   Computer Recognition: Zero-One Matrices and Directed Graphs                      344
                          7.3   Partial Orders: Hasse Diagrams         356
                          7.4   Equivalence Relations and Partitions         366
                          7.5   Finite State Machines: The Minimization Process                371
                          7.6   Summary and Historical Review     376

PART 2
      Further Topics in Enumeration        383

8     The Principle of Inclusion and Exclusion                   385
                          8.1   The Principle of Inclusion and Exclusion           385
                          8.2   Generalizations of the Principle  397
                          8.3   Derangements: Nothing Is in Its Right Place   402
                          8.4   Rook Polynomials      404
                          8.5   Arrangements with Forbidden Positions     406
                          8.6   Summary and Historical Review      411

9      Generating Functions       415
                 9.1    Introductory Examples    415
                 9.2    Definition and Examples: Calculational Techniques     418
                 9.3    Partitions of Integers 432
                 9.4    The Exponential Generating Function       436
                 9.5    The Summation Operator    440
                 9.6    Summary and Historical Review      442

10     Recurrence Relations       447
                 10.1   The First-Order Linear Recurrence Relation   447
                 10.2   The Second-Order Linear Homogeneous Recurrence Relation with Constant
                        Coefficients   456
                 10.3   The Nonhomogeneous Recurrence Relation       470
                 10.4   The Method of Generating Functions    482
                 10.5   A Special Kind of Nonlinear Recurrence Relation (Optional) 487
                 10.6   Divide-and-Conquer Algorithms (Optional)    496
                 10.7   Summary and Historical Review     505

PART 3
Graph Theory and Applications         511

11     An Introduction to Graph Theory           513
                 11.1   Definitions and Examples    513
                 11.2   Subgraphs, Complements, and Graph Isomorphism        520
                 11.3   Vertex Degree: Euler Trails and Circuits 530
                 11.4   Planar Graphs    540
                 11.5   Hamilton Paths and Cycles     556
                 11.6   Graph Coloring and Chromatic Polynomials     564
                 11.7   Summary and Historical Review       573

12     Trees   581
                 12.1   Definitions, Properties, and Examples     581
                 12.2   Rooted Trees    587
                 12.3   Trees and Sorting   605
                 12.4   Weighted Trees and Prefix Codes     609
                 12.5   Biconnected Components and Articulation Points      615
                 12.6   Summary and Historical Review   622

13     Optimization and Matching         631
                 13.1   Dijkstra’s Shortest-Path Algorithm 631
                 13.2   Minimal Spanning Trees: The Algorithms of Kruskal and Prim     638
                 13.3   Transport Networks: The Max-Flow Min-Cut Theorem      644
                 13.4   Matching Theory     659
                 13.5   Summary and Historical Review      667

PART 4
Modern Applied Algebra          671

14      Rings and Modular Arithmetic          673
                 14.1    The Ring Structure: Definition and Examples     673
                 14.2    Ring Properties and Substructures   679
                 14.3    The Integers Modulo n      686
                 14.4    Ring Homomorphisms and Isomorphisms         697
                 14.5    Summary and Historical Review       705

15      Boolean Algebra and Switching Functions                    711
                 15.1    Switching Functions: Disjunctive and Conjunctive Normal Forms                           711
                 15.2    Gating Networks: Minimal Sums of Products: Karnaugh Maps                          719
                 15.3    Further Applications: Don’t-Care Conditions   729
                 15.4    The Structure of a Boolean Algebra (Optional)   733
                 15.5    Summary and Historical Review     742

16      Groups, Coding Theory, and Polya’s Method of Enumeration                                      745
                 16.1     Definition, Examples, and Elementary Properties               745
                 16.2     Homomorphisms, Isomorphisms, and Cyclic Groups                       752
                 16.3     Cosets and Lagrange’s Theorem 757
                 16.4     The RSA Cryptosystem (Optional)    759
                 16.5     Elements of Coding Theory    761
                 16.6     The Hamming Metric     766
                 16.7     The Parity-Check and Generator Matrices           769
                 16.8     Group Codes: Decoding with Coset Leaders            773
                 16.9     Hamming Matrices      777
                 16.10    Counting and Equivalence: Burnside’s Theorem                  779
                 16.11    The Cycle Index   785
                 16.12    The Pattern Inventory: Polya’s Method of Enumeration                       789
                 16.13    Summary and Historical Review        794

17      Finite Fields and Combinatorial Designs                  799
                 17.1    Polynomial Rings     799
                 17.2    Irreducible Polynomials: Finite Fields    806
                 17.3    Latin Squares    815
                 17.4    Finite Geometries and Affine Planes     820
                 17.5    Block Designs and Projective Planes      825
                 17.6    Summary and Historical Review       830

Appendix 1     Exponential and Logarithmic Functions       A-1

Appendix 2     Matrices, Matrix Operations, and Determinants                 A-11

Appendix 3     Countable and Uncountable Sets       A-23

Solutions     S-1

Index     I-1
PART 1

FUNDAMENTALS OF DISCRETE MATHEMATICS

1
Fundamental Principles of Counting

                   Enumeration, or counting, may strike one as an obvious process that a student learns
                   when first studying arithmetic. But then, it seems, very little attention is paid to further
               development in counting as the student turns to “more difficult” areas in mathematics, such
               as algebra, geometry, trigonometry, and calculus. Consequently, this first chapter should
               provide some warning about the seriousness and difficulty of “mere” counting.
                   Enumeration does not end with arithmetic. It also has applications in such areas as coding
               theory, probability and statistics, and the analysis of algorithms. Later chapters will offer
               some specific examples of these applications.
                   As we enter this fascinating field of mathematics, we shall come upon many problems that
               are very simple to state but somewhat “sticky” to solve. Thus, be sure to learn and understand
               the basic formulas, but do not rely on them too heavily. For without an analysis of each
               problem, a mere knowledge of formulas is next to useless. Instead, welcome the challenge
               to solve unusual problems or those that are different from problems you have encountered
               in the past. Seek solutions based on your own scrutiny, regardless of whether it reproduces
               what the author provides. There are often several ways to solve a given problem.

1.1
The Rules of Sum and Product
               Our study of discrete and combinatorial mathematics begins with two basic principles of
               counting: the rules of sum and product. The statements and initial applications of these
              rules appear quite simple. In analyzing more complicated problems, one is often able to
               break down such problems into parts that can be solved using these basic principles. We
               want to develop the ability to “decompose” such problems and piece together our partial
               solutions in order to arrive at the final answer. A good way to do this is to analyze and solve
               many diverse enumeration problems, taking note of the principles being used. This is the
               approach we shall follow here.
                  Our first principle of counting can be stated as follows:

The Rule of Sum: If a first task can be performed in m ways, while a second task can
                              be performed in n ways, and the two tasks cannot be performed simultaneously, then
                              either task can be accomplished in any one of m + n ways.

Note that when we say that a particular occurrence, such as a first task, can come about in m
                           ways, these m ways are assumed to be distinct, unless a statement is made to the contrary.
                           This will be true throughout the entire text.

    EXAMPLE 1.1
A college library has 40 textbooks on sociology and 50 textbooks dealing with anthropology.
                          By the rule of sum, a student at this college can select among 40 + 50 = 90 textbooks in
                          order to learn more about one or the other of these two subjects.

    EXAMPLE 1.2
The rule can be extended beyond two tasks as long as no pair of tasks can occur
                           simultaneously. For instance, a computer science instructor who has, say, seven different introductory
                           books each on C++, Java, and Perl can recommend any one of these 21 books to a student
                           who is interested in learning a first programming language.

    EXAMPLE 1.3
The computer science instructor of Example 1.2 has two colleagues. One of these
                           colleagues has three textbooks on the analysis of algorithms, and the other has five such
                           textbooks. If n denotes the maximum number of different books on this topic that this
                           instructor can borrow from them, then $5 \le n \le 8$, for here both colleagues may own copies
                           of the same textbook(s).
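
The bounds in Example 1.3 can be checked with a small sketch that models each colleague's shelf as a set. The book titles are hypothetical placeholders (the text gives only the counts three and five), and `borrowable` is a helper name introduced here for illustration.

```python
# Hypothetical titles: the example states only that one colleague has three
# books and the other has five.
colleague_a = {"Algo A", "Algo B", "Algo C"}


def borrowable(colleague_b):
    """Number of different books available from the two colleagues together."""
    return len(colleague_a | colleague_b)


# No shared titles: the rule of sum applies, so n = 3 + 5 = 8.
disjoint_b = {"Algo D", "Algo E", "Algo F", "Algo G", "Algo H"}

# Every title owned by the first colleague is also owned by the second: n = 5.
overlapping_b = {"Algo A", "Algo B", "Algo C", "Algo D", "Algo E"}

print(borrowable(disjoint_b))     # 8
print(borrowable(overlapping_b))  # 5
```

The rule of sum gives the upper value only because the two collections are disjoint in the first case; any shared title lowers the count.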

The following example introduces our second principle of counting.

    EXAMPLE 1.4
In trying to reach a decision on plant expansion, an administrator assigns 12 of her employees
                          to two committees. Committee A consists of five members and is to investigate possible
                          favorable results from such an expansion. The other seven employees, committee B, will
                          scrutinize possible unfavorable repercussions. Should the administrator decide to speak to
                          just one committee member before making her decision, then by the rule of sum there are
                           12 employees she can call upon for input. However, to be a bit more unbiased, she decides
                          to speak with a member of committee A on Monday, and then with a member of committee
                          B on Tuesday, before reaching a decision. Using the following principle, we find that she
                          can select two such employees to speak with in 5 × 7 = 35 ways.

The Rule of Product: If a procedure can be broken down into first and second stages,
                             and if there are m possible outcomes for the first stage and if, for each of these outcomes,
                             there are n possible outcomes for the second stage, then the total procedure can be carried
                             out, in the designated order, in mn ways.

    EXAMPLE 1.5
The drama club of Central University is holding tryouts for a spring play. With six men and
                          eight women auditioning for the leading male and female roles, by the rule of product the
                          director can cast his leading couple in 6 × 8 = 48 ways.
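
As a rough illustration of the rule of product, the casting count in Example 1.5 can be reproduced by listing every (man, woman) pair. The labels `M1`, `W1`, and so on are placeholders assumed here; only the group sizes six and eight come from the example.

```python
from itertools import product

# Placeholder labels for the six men and eight women who audition.
men = [f"M{i}" for i in range(1, 7)]
women = [f"W{i}" for i in range(1, 9)]

# First stage: choose the male lead. Second stage: choose the female lead.
# Each ordered pair is one way to cast the leading couple.
couples = list(product(men, women))

print(len(couples))  # 48
```

The same enumeration with groups of sizes five and seven gives the 35 ways found in Example 1.4.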

    EXAMPLE 1.6
Here various extensions of the rule are illustrated by considering the manufacture of license
                          plates consisting of two letters followed by four digits.

                a) If no letter or digit can be repeated, there are 26 × 25 × 10 × 9 × 8 × 7 =
                   3,276,000 different possible plates.
                b) With repetitions of letters and digits allowed, 26 × 26 × 10 × 10 × 10 × 10 =
                   6,760,000 different license plates are possible.
                c) If repetitions are allowed, as in part (b), how many of the plates have only vowels (A,
                    E, I, O, U) and even digits? (0 is an even integer.)
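
The three counts in Example 1.6 can be sketched as follows. The variable names are introduced here for illustration, and the value for part (c), which the text poses as a question, is computed by the same rule of product and then confirmed by brute-force enumeration.

```python
from itertools import permutations, product
from string import ascii_uppercase, digits

# (a) No letter or digit repeated: two letters, then four digits.
count_a = 26 * 25 * 10 * 9 * 8 * 7

# (b) Repetitions allowed.
count_b = 26**2 * 10**4

# (c) Repetitions allowed, only vowels and even digits.
vowels, even_digits = "AEIOU", "02468"
count_c = len(vowels) ** 2 * len(even_digits) ** 4

# Brute-force checks: enumerate the letter part and the digit part separately.
letters_no_repeat = sum(1 for _ in permutations(ascii_uppercase, 2))
digits_no_repeat = sum(1 for _ in permutations(digits, 4))
plates_c = [
    "".join(ls) + "".join(ds)
    for ls in product(vowels, repeat=2)
    for ds in product(even_digits, repeat=4)
]

print(count_a, letters_no_repeat * digits_no_repeat)  # 3276000 3276000
print(count_b)                                        # 6760000
print(count_c, len(plates_c))                         # 15625 15625
```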

EXAMPLE 1.7
In order to store data, a computer’s main memory contains a large collection of circuits, each
              of which is capable of storing a bit, that is, one of the binary digits 0 or 1. These storage
              circuits are arranged in units called (memory) cells. To identify the cells in a computer’s
              main memory, each is assigned a unique name called its address. For some computers,
              such as embedded microcontrollers (as found in the ignition system for an automobile), an
              address is represented by an ordered list of eight bits, collectively referred to as a byte. Using
              the rule of product, there are 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 = $2^8$ = 256 such bytes. So
              we have 256 addresses that may be used for cells where certain information may be stored.
                  A kitchen appliance, such as a microwave oven, incorporates an embedded
              microcontroller. These “small computers” (such as the PICmicro microcontroller) contain thousands
              of memory cells and use two-byte addresses to identify these cells in their main memory.
              Such addresses are made up of two consecutive bytes, or 16 consecutive bits. Thus there
              are 256 × 256 = $2^8 \times 2^8 = 2^{16}$ = 65,536 available addresses that could be used to
              identify cells in the main memory. Other computers use addressing systems of four bytes. This
              32-bit architecture is presently used in the Pentium* processor, where there are as many
              as $2^8 \times 2^8 \times 2^8 \times 2^8 = 2^{32}$ = 4,294,967,296 addresses for use in identifying the cells in
              main memory. When a programmer deals with the UltraSPARC† or Itanium‡ processors, he
              or she considers memory cells with eight-byte addresses. Each of these addresses comprises
              8 × 8 = 64 bits, and there are $2^{64}$ = 18,446,744,073,709,551,616 possible addresses for
              this architecture. (Of course, not all of these possibilities are actually used.)
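
The address counts in Example 1.7 all follow one pattern: an address of k bytes has 8k bits, and each bit is an independent choice between 0 and 1. A minimal sketch, where `address_count` is a helper name assumed here:

```python
def address_count(n_bytes):
    """Number of distinct addresses representable with n_bytes bytes."""
    bits = 8 * n_bytes   # each byte contributes eight bits
    return 2 ** bits     # rule of product: two choices for each bit


for n_bytes in (1, 2, 4, 8):
    print(f"{n_bytes}-byte address: {8 * n_bytes} bits, "
          f"{address_count(n_bytes):,} addresses")
```

Python integers have arbitrary precision, so the 64-bit count is computed exactly rather than overflowing.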

EXAMPLE 1.8
At times it is necessary to combine several different counting principles in the solution of
              one problem. Here we find that the rules of both sum and product are needed to attain the
              answer.
                 At the AWL corporation Mrs. Foster operates the Quick Snack Coffee Shop. The menu
              at her shop is limited: six kinds of muffins, eight kinds of sandwiches, and five beverages
              (hot coffee, hot tea, iced tea, cola, and orange juice). Ms. Dodd, an editor at AWL, sends
              her assistant Carl to the shop to get her lunch: either a muffin and a hot beverage or a
              sandwich and a cold beverage.
                  By the rule of product, there are 6 × 2 = 12 ways in which Carl can purchase a muffin and
              hot beverage. A second application of this rule shows that there are 8 × 3 = 24 possibilities
              for a sandwich and cold beverage. So by the rule of sum, there are 12 + 24 = 36 ways in
              which Carl can purchase Ms. Dodd’s lunch.
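
The combined use of both rules in Example 1.8 can be sketched by building the two kinds of lunch separately and then joining the lists. The muffin and sandwich labels are placeholders assumed here; the beverage names are those listed in the example.

```python
from itertools import product

# Placeholder labels: the example gives only the counts six and eight.
muffins = [f"muffin {i}" for i in range(1, 7)]
sandwiches = [f"sandwich {i}" for i in range(1, 9)]
hot_beverages = ["hot coffee", "hot tea"]
cold_beverages = ["iced tea", "cola", "orange juice"]

# Rule of product, applied once to each kind of lunch.
muffin_lunches = list(product(muffins, hot_beverages))
sandwich_lunches = list(product(sandwiches, cold_beverages))

# Rule of sum: no lunch is of both kinds, so the two counts add.
lunches = muffin_lunches + sandwich_lunches

print(len(muffin_lunches), len(sandwich_lunches), len(lunches))  # 12 24 36
```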

                 *Pentium (R) is a registered trademark of the Intel Corporation.
                 †The UltraSPARC processor is manufactured by Sun (R) Microsystems, Inc.
                 ‡Itanium (TM) is a trademark of the Intel Corporation.

1.2
                     Permutations

Continuing to examine applications of the rule of product, we turn now to counting linear
                            arrangements of objects. These arrangements are often called permutations when the objects
                            are distinct. We shall develop some systematic methods for dealing with linear arrangements,
                            starting with a typical example.

    EXAMPLE 1.9
In a class of 10 students, five are to be chosen and seated in a row for a picture. How many
                            such linear arrangements are possible?
                               The key word here is arrangement, which designates the importance of order. If A, B,
                            C, ..., I, J denote the 10 students, then BCEFI, CEFIB, and ABCFG are three such different
                            arrangements, even though the first two involve the same five students.
                               To answer this question, we consider the positions and possible numbers of students we
                            can choose from in order to fill each position. The filling of a position is a stage of our
                            procedure.

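\[
\underbrace{10}_{\text{1st position}} \times
\underbrace{9}_{\text{2nd position}} \times
\underbrace{8}_{\text{3rd position}} \times
\underbrace{7}_{\text{4th position}} \times
\underbrace{6}_{\text{5th position}}
\]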

Each of the 10 students can occupy the 1st position in the row. Because repetitions are
not possible here, we can select only one of the nine remaining students to fill the 2nd
position. Continuing in this way, we find only six students to select from in order to fill the
5th and final position. This yields a total of 30,240 possible arrangements of five students
selected from the class of 10.
    Exactly the same answer is obtained if the positions are filled from right to left,
namely, 6 × 7 × 8 × 9 × 10. If the 3rd position is filled first, the 1st position second, the
4th position third, the 5th position fourth, and the 2nd position fifth, then the answer is
9 × 6 × 10 × 8 × 7, still the same value, 30,240.
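
The answer to Example 1.9 is small enough to confirm by brute force. The sketch below (Python, with the students labeled A through J as in the example) generates every row of five students and compares the count with the three products discussed above.

```python
from itertools import permutations
from math import prod

students = "ABCDEFGHIJ"   # the 10 students A, B, ..., J

# Generate every ordered selection of five distinct students and count them.
count = sum(1 for _ in permutations(students, 5))

# The same total, read off position by position for three filling orders.
left_to_right = prod([10, 9, 8, 7, 6])
right_to_left = prod([6, 7, 8, 9, 10])
mixed_order = prod([9, 6, 10, 8, 7])    # 3rd, 1st, 4th, 5th, then 2nd position

print(count, left_to_right, right_to_left, mixed_order)  # 30240 30240 30240 30240
```

The order in which the positions are filled changes which factor is attached to which position, but not the product.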

As in Example 1.9, the product of certain consecutive positive integers often comes
                            into play in enumeration problems. Consequently, the following notation proves to be quite
                            useful when we are dealing with such counting problems. It will frequently allow us to
                            express our answers in a more convenient form.

Definition 1.1          For an integer n ≥ 0, n factorial (denoted n!) is defined by

\[
0! = 1, \qquad n! = (n)(n-1)(n-2)\cdots(3)(2)(1), \quad \text{for } n \ge 1.
\]

One finds that 1! = 1, 2! = 2, 3! = 6, 4! = 24, and 5! = 120. In addition, for each n ≥ 0,
(n + 1)! = (n + 1)(n!).

Before we proceed any further, let us try to get a somewhat better appreciation for how
                            fast n! grows. We can calculate that 10! = 3,628,800, and it just so happens that this is
                            exactly the number of seconds in six weeks. Consequently, 11! exceeds the number of
                            seconds in one year, 12! exceeds the number in 12 years, and 13! surpasses the number of
                            seconds in a century.
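
These comparisons are easy to verify directly. In the sketch below we assume a year of 365 days (an assumption of ours; the text does not say which year length is intended), and the comparisons hold comfortably under that choice.

```python
from math import factorial

SECONDS_PER_DAY = 24 * 60 * 60
six_weeks = 6 * 7 * SECONDS_PER_DAY
year = 365 * SECONDS_PER_DAY          # assumption: a 365-day year

print(factorial(10), six_weeks)       # 3628800 3628800
print(factorial(11) > year)           # True
print(factorial(12) > 12 * year)      # True
print(factorial(13) > 100 * year)     # True
```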

If we make use of the factorial notation, the answer in Example 1.9 can be expressed in
                  the following more compact form:

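\[
10 \times 9 \times 8 \times 7 \times 6
= 10 \times 9 \times 8 \times 7 \times 6 \times
  \frac{5 \times 4 \times 3 \times 2 \times 1}{5 \times 4 \times 3 \times 2 \times 1}
= \frac{10!}{5!}.
\]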

Definition 1.2   Given a collection of n distinct objects, any (linear) arrangement of these objects is called
                  a permutation of the collection.

Starting with the letters a, b, c, there are six ways to arrange, or permute, all of the letters:
abc, acb, bac, bca, cab, cba. If we are interested in arranging only two of the letters at a
time, there are six such size-2 permutations: ab, ba, ac, ca, bc, cb.

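If there are n distinct objects and r is an integer, with 1 ≤ r ≤ n, then by the rule of
product, the number of permutations of size r for the n objects is

\[
\begin{aligned}
P(n, r) &= \underbrace{n}_{\text{1st position}} \times
           \underbrace{(n-1)}_{\text{2nd position}} \times
           \underbrace{(n-2)}_{\text{3rd position}} \times \cdots \times
           \underbrace{(n-r+1)}_{r\text{th position}} \\
        &= (n)(n-1)(n-2)\cdots(n-r+1) \times
           \frac{(n-r)(n-r-1)\cdots(3)(2)(1)}{(n-r)(n-r-1)\cdots(3)(2)(1)} \\
        &= \frac{n!}{(n-r)!}.
\end{aligned}
\]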

For r = 0, P(n, 0) = 1 = n!/(n − 0)!, so P(n, r) = n!/(n − r)! holds for all 0 ≤ r ≤ n.
A special case of this result is Example 1.9, where n = 10, r = 5, and P(10, 5) = 30,240.
When permuting all of the n objects in the collection, we have r = n and find that P(n, n) =
n!/0! = n!.
    Note, for example, that if n ≥ 2, then P(n, 2) = n!/(n − 2)! = n(n − 1). When n ≥ 3
one finds that P(n, n − 3) = n!/[n − (n − 3)]! = n!/3! = (n)(n − 1)(n − 2) ⋯ (5)(4).
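
The formula P(n, r) = n!/(n − r)! translates directly into a short function. The sketch below is one possible way to write it in Python; the function name `P` simply mirrors the notation of the text, and the listing of size-2 permutations of a, b, c is there only as a check against a direct enumeration.

```python
from itertools import permutations
from math import factorial

def P(n: int, r: int) -> int:
    """Number of size-r permutations of n distinct objects, for 0 <= r <= n."""
    if not 0 <= r <= n:
        raise ValueError("requires 0 <= r <= n")
    return factorial(n) // factorial(n - r)

# Direct enumeration of the size-2 permutations of a, b, c.
size2 = ["".join(p) for p in permutations("abc", 2)]
print(size2)                                  # ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']

print(P(3, 2), P(3, 3), P(10, 5), P(7, 0))    # 6 6 30240 1
```

Python 3.8 and later also provide `math.perm(n, r)`, which computes the same quantity.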

The number of permutations of size r, where 0 ≤ r ≤ n, from a collection of n objects,
is P(n, r) = n!/(n − r)!. (Remember that P(n, r) counts (linear) arrangements in which
the objects cannot be repeated.) However, if repetitions are allowed, then by the rule of
product there are n^r possible arrangements, with r ≥ 0.

EXAMPLE 1.10  The number of permutations of the letters in the word COMPUTER is 8!. If only five of the
letters are used, the number of permutations (of size 5) is P(8, 5) = 8!/(8 − 5)! = 8!/3! =
6720. If repetitions of letters are allowed, the number of possible 12-letter sequences is
8^12 ≐ 6.872 × 10^10.
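
The three counts in Example 1.10 can be reproduced with a few lines of arithmetic. The following sketch uses only Python’s standard library; the variable names are ours.

```python
from math import factorial

letters = "COMPUTER"
n = len(set(letters))                          # 8 distinct letters

all_letters = factorial(n)                     # permutations of all eight letters
size5 = factorial(n) // factorial(n - 5)       # P(8, 5), no repetition
with_repetition = n ** 12                      # 12-letter sequences, repetition allowed

print(all_letters, size5, with_repetition)     # 40320 6720 68719476736
print(f"{with_repetition:.3e}")                # 6.872e+10
```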

EXAMPLE 1.11  Unlike Example 1.10, the number of (linear) arrangements of the four letters in BALL is
12, not 4! (= 24). The reason is that we do not have four distinct letters to arrange. To get
the 12 arrangements, we can list them as in Table 1.1(a).

The symbol “≐” is read “is approximately equal to.”

Table 1.1

(a)            (b)
A B L L        A B L₁ L₂    A B L₂ L₁
A L B L        A L₁ B L₂    A L₂ B L₁
A L L B        A L₁ L₂ B    A L₂ L₁ B
B A L L        B A L₁ L₂    B A L₂ L₁
B L A L        B L₁ A L₂    B L₂ A L₁
B L L A        B L₁ L₂ A    B L₂ L₁ A
L A B L        L₁ A B L₂    L₂ A B L₁
L A L B        L₁ A L₂ B    L₂ A L₁ B
L B A L        L₁ B A L₂    L₂ B A L₁
L B L A        L₁ B L₂ A    L₂ B L₁ A
L L A B        L₁ L₂ A B    L₂ L₁ A B
L L B A        L₁ L₂ B A    L₂ L₁ B A

If the two L’s are distinguished as L₁, L₂, then we can use our previous ideas on
permutations of distinct objects; with the four distinct symbols B, A, L₁, L₂, we have 4! = 24
permutations. These are listed in Table 1.1(b). Table 1.1 reveals that for each arrangement
in which the L’s are indistinguishable there corresponds a pair of permutations with distinct
L’s. Consequently,

2 × (Number of arrangements of the letters B, A, L, L)
        = (Number of permutations of the symbols B, A, L₁, L₂),

and the answer to the original problem of finding all the arrangements of the four letters in
                       BALL is 4!/2 = 12.
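
The two-to-one correspondence used in Example 1.11 can be observed directly. In the sketch below the labels `L1` and `L2` stand for the distinguished symbols L₁ and L₂, and the helper `erase` (a name of our own) removes the subscripts.

```python
from collections import Counter
from itertools import permutations

labelled = ["B", "A", "L1", "L2"]              # the two L's made distinct
all_labelled = list(permutations(labelled))    # 4! permutations

def erase(p):
    """Drop the subscripts: ('L1', 'A', 'L2', 'B') becomes 'LALB'."""
    return "".join(symbol[0] for symbol in p)

# For each unlabelled arrangement, count the labelled permutations mapping to it.
preimages = Counter(erase(p) for p in all_labelled)

print(len(all_labelled), len(preimages))       # 24 12
print(set(preimages.values()))                 # {2}
print(sorted(preimages))
```

Every arrangement of B, A, L, L arises from exactly two labelled permutations, which is why dividing 4! by 2 gives the answer.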

EXAMPLE 1.12  Using the idea developed in Example 1.11, we now consider the arrangements of all nine
letters in DATABASES.
    There are 3! = 6 arrangements with the A’s distinguished for each arrangement in
which the A’s are not distinguished. For example, DA₁TA₂BA₃SES, DA₁TA₃BA₂SES,
DA₂TA₁BA₃SES, DA₂TA₃BA₁SES, DA₃TA₁BA₂SES, and DA₃TA₂BA₁SES all correspond
to DATABASES, when we remove the subscripts on the A’s. In addition, to the
arrangement DA₁TA₂BA₃SES there corresponds the pair of permutations DA₁TA₂BA₃S₁ES₂ and
DA₁TA₂BA₃S₂ES₁, when the S’s are distinguished. Consequently,

(2!)(3!)(Number of arrangements of the letters in DATABASES)
        = (Number of permutations of the symbols D, A₁, T, A₂, B, A₃, S₁, E, S₂),

so the number of arrangements of the nine letters in DATABASES is 9!/(2! 3!) = 30,240.

Before stating a general principle for arrangements with repeated symbols, note that in our
                       prior two examples we solved a new type of problem by relating it to previous enumeration
                       principles. This practice is common in mathematics in general, and often occurs in the
                       derivations of discrete and combinatorial formulas.

If there are n objects with n₁ indistinguishable objects of a first type, n₂ indistinguishable
objects of a second type, ..., and n_r indistinguishable objects of an rth type, where
n₁ + n₂ + ⋯ + n_r = n, then there are

\[
\frac{n!}{n_1!\, n_2! \cdots n_r!}
\]

(linear) arrangements of the given n objects.
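
As an illustration of this principle, the following sketch computes the number of arrangements of the letters of a word by counting how often each letter occurs. The function name `arrangements` is our own choice.

```python
from collections import Counter
from math import factorial

def arrangements(word: str) -> int:
    """n! / (n1! n2! ... nr!), where n1, ..., nr are the letter multiplicities."""
    total = factorial(len(word))
    for multiplicity in Counter(word).values():
        total //= factorial(multiplicity)      # each partial quotient is an integer
    return total

print(arrangements("BALL"))        # 12
print(arrangements("DATABASES"))   # 30240
print(arrangements("COMPUTER"))    # 40320
```

When all the letters are distinct, every multiplicity is 1 and the formula reduces to n!, as for COMPUTER.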

EXAMPLE 1.13  The MASSASAUGA is a brown and white venomous snake indigenous to North America.
Arranging all of the letters in MASSASAUGA, we find that there are

\[
\frac{10!}{4!\,3!\,1!\,1!\,1!} = 25{,}200
\]

possible arrangements. Among these are

\[
\frac{7!}{3!\,1!\,1!\,1!\,1!} = 840
\]

in which all four A’s are together. To get this last result, we considered all arrangements of
the seven symbols AAAA (one symbol), S, S, S, M, U, G.
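
Both counts in Example 1.13 can be confirmed by generating each distinct arrangement once and testing it. The generator below is a sketch of our own (it is not a method taken from the text): at every position it tries each letter that still has copies left, so no arrangement is produced twice.

```python
from collections import Counter

def distinct_arrangements(word):
    """Yield each distinct arrangement of the letters of word exactly once."""
    remaining = Counter(word)      # how many copies of each letter are still unused
    prefix = []

    def extend():
        if len(prefix) == len(word):
            yield "".join(prefix)
            return
        for letter in sorted(remaining):
            if remaining[letter] > 0:
                remaining[letter] -= 1
                prefix.append(letter)
                yield from extend()
                prefix.pop()
                remaining[letter] += 1

    yield from extend()

total = 0
together = 0
for s in distinct_arrangements("MASSASAUGA"):
    total += 1
    if "AAAA" in s:                # all four A's occupy consecutive positions
        together += 1

print(total, together)             # 25200 840
```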

EXAMPLE 1.14  Determine the number of (staircase) paths in the xy-plane from (2, 1) to (7, 4), where each
such path is made up of individual steps going one unit to the right (R) or one unit upward
(U). The blue lines in Fig. 1.1 show two of these paths.

[Figure 1.1: two staircase paths from (2, 1) to (7, 4) in the xy-plane. (a) R, U, R, R, U, R, R, U   (b) U, R, R, R, U, U, R, R]

Beneath each path in Fig. 1.1 we have listed the individual steps. For example, in part
(a) the list R, U, R, R, U, R, R, U indicates that starting at the point (2, 1), we first move
one unit to the right [to (3, 1)], then one unit upward [to (3, 2)], followed by two units to
the right [to (5, 2)], and so on, until we reach the point (7, 4). The path consists of five R’s
for moves to the right and three U’s for moves upward.
    The path in part (b) of the figure is also made up of five R’s and three U’s. In general,
the overall trip from (2, 1) to (7, 4) requires 7 − 2 = 5 horizontal moves to the right and
4 − 1 = 3 vertical moves upward. Consequently, each path corresponds to a list of five
R’s and three U’s, and the solution for the number of paths emerges as the number of
arrangements of the five R’s and three U’s, which is 8!/(5! 3!) = 56.
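
The count of 56 can also be reached by walking the grid itself, one step at a time, without appealing to the formula. The sketch below does this recursively; the function names `count_paths` and `follow` are ours.

```python
from functools import lru_cache
from math import factorial

def count_paths(start, end):
    """Count staircase paths from start to end using unit steps R and U."""
    end_x, end_y = end

    @lru_cache(maxsize=None)
    def walk(x, y):
        if (x, y) == (end_x, end_y):
            return 1
        total = 0
        if x < end_x:
            total += walk(x + 1, y)    # take a step R
        if y < end_y:
            total += walk(x, y + 1)    # take a step U
        return total

    return walk(*start)

def follow(start, steps):
    """Return the point reached from start after the given string of R/U steps."""
    x, y = start
    for step in steps:
        if step == "R":
            x += 1
        else:
            y += 1
    return (x, y)

print(count_paths((2, 1), (7, 4)))                       # 56
print(factorial(8) // (factorial(5) * factorial(3)))     # 56
print(follow((2, 1), "RURRURRU"), follow((2, 1), "URRRUURR"))  # (7, 4) (7, 4)
```

The last line checks that the two step lists shown under Fig. 1.1 do end at (7, 4).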

EXAMPLE 1.15  We now do something a bit more abstract and prove that if n and k are positive integers with
n = 2k, then n!/2^k is an integer. Because our argument relies on counting, it is an example
of a combinatorial proof.
    Consider the n symbols x₁, x₁, x₂, x₂, ..., x_k, x_k. The number of ways in which we can
arrange all of these n = 2k symbols is an integer that equals

\[
\frac{n!}{\underbrace{2!\,2!\cdots 2!}_{k \text{ factors of } 2!}} = \frac{n!}{2^k}.
\]
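
For small values of k the argument of Example 1.15 can be watched at work: we list the distinct arrangements of the symbols and compare their number with n!/2^k. In this sketch the integers 1, ..., k stand in for the symbols x₁, ..., x_k (a representation of our own choosing), and only k ≤ 4 is tried because the brute-force listing grows like n!.

```python
from itertools import permutations
from math import factorial

results = {}
for k in range(1, 5):
    n = 2 * k
    # Two copies of each symbol: 1, 1, 2, 2, ..., k, k.
    symbols = [i for i in range(1, k + 1) for _ in range(2)]
    distinct = len(set(permutations(symbols)))     # distinct arrangements
    assert factorial(n) % 2**k == 0                # n!/2^k is an integer
    results[k] = (distinct, factorial(n) // 2**k)

print(results)   # {1: (1, 1), 2: (6, 6), 3: (90, 90), 4: (2520, 2520)}
```

The check is only a finite illustration; the counting argument above is what establishes the result for every k.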

Finally, we will apply what has been developed so far to a situation in which the
arrangements are no longer linear.

EXAMPLE 1.16  If six people, designated as A, B, ..., F, are seated about a round table, how many different
circular arrangements are possible, if arrangements are considered the same when one can
be obtained from the other by rotation? [In Fig. 1.2, arrangements (a) and (b) are considered
identical, whereas (b), (c), and (d) are three distinct arrangements.]

[Figure 1.2: four seatings, (a) through (d), of A, B, ..., F around a round table]

We shall try to relate this problem to previous ones we have already encountered.
Consider Figs. 1.2(a) and (b). Starting at the top of the circle and moving clockwise, we list
the distinct linear arrangements ABEFCD and CDABEF, which correspond to the same
circular arrangement. In addition to these two, four other linear arrangements (BEFCDA,
DABEFC, EFCDAB, and FCDABE) are found to correspond to the same circular
arrangement as in (a) or (b). So inasmuch as each circular arrangement corresponds to six
linear arrangements, we have 6 × (Number of circular arrangements of A, B, ..., F) =
(Number of linear arrangements of A, B, ..., F) = 6!.
    Consequently, there are 6!/6 = 5! = 120 arrangements of A, B, ..., F around the circular
table.
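
The six-to-one correspondence between linear and circular arrangements can be checked by grouping the 6! linear arrangements into classes that differ only by rotation. In the sketch below each class is represented by the rotation that begins with A; the helper name `canonical` is ours.

```python
from itertools import permutations

def canonical(seating: str) -> str:
    """Rotate a seating (read clockwise) so that it starts with its smallest label."""
    i = seating.index(min(seating))
    return seating[i:] + seating[:i]

people = "ABCDEF"
linear = ["".join(p) for p in permutations(people)]
circular = {canonical(s) for s in linear}

print(len(linear), len(circular))                     # 720 120
print(canonical("ABEFCD") == canonical("CDABEF"))     # True
```

The final line confirms that the seatings of Figs. 1.2(a) and (b), read clockwise from the top, fall in the same class.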

EXAMPLE 1.17  Suppose now that the six people of Example 1.16 are three married couples and that A, B,
and C are the females. We want to arrange the six people around the table so that the sexes
alternate. (Once again, arrangements are considered identical if one can be obtained from
the other by rotation.)
    Before we solve this problem, let us solve Example 1.16 by an alternative method,
which will assist us in solving our present problem. If we place A at the table as shown in
Fig. 1.3(a), five locations (clockwise from A) remain to be filled. Using B, C, ..., F to fill
these five positions is the problem of permuting B, C, ..., F in a linear manner, and this
can be done in 5! = 120 ways.

[Figure 1.3: (a) A seated at the top of the table, with the five remaining locations numbered 1 through 5 clockwise; (b) A seated at the top, with the remaining locations marked M1, F2, M2, F3, M3 clockwise]

    To solve the new problem of alternating the sexes, consider the method shown in
Fig. 1.3(b). A (a female) is placed as before. The next position, clockwise from A, is marked
M1 (Male 1) and can be filled in three ways. Continuing clockwise from A, position F2
(Female 2) can be filled in two ways. Proceeding in this manner, by the rule of product,
there are 3 × 2 × 2 × 1 × 1 = 12 ways in which these six people can be arranged with no
two men or women seated next to each other.
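
Fixing A at the top of the table, as in Fig. 1.3, also gives a convenient way to check both answers by machine. The sketch below assumes, as the example implies, that D, E, and F are the three males; it fills the five remaining seats in every possible way and keeps the seatings in which neighbors around the table always differ in sex.

```python
from itertools import permutations

females = set("ABC")       # given in the example
others = "BCDEF"           # D, E, F are then the males (implied by the example)

total = 0
alternating = 0
for rest in permutations(others):
    seating = ("A",) + rest            # A fixed at the top removes rotations
    total += 1
    # Each person and the clockwise neighbor must be of opposite sex,
    # including the pair that closes the circle (seat 5 next to seat 0).
    if all((seating[i] in females) != (seating[(i + 1) % 6] in females)
           for i in range(6)):
        alternating += 1

print(total, alternating)              # 120 12
```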

EXERCISES 1.1 AND 1.2

1. During a local campaign, eight Republican and five Democratic candidates are nominated for president of the school board.
    a) If the president is to be one of these candidates, how many possibilities are there for the eventual winner?
    b) How many possibilities exist for a pair of candidates (one from each party) to oppose each other for the eventual election?
    c) Which counting principle is used in part (a)? in part (b)?
2. Answer part (c) of Example 1.6.
3. Buick automobiles come in four models, 12 colors, three engine sizes, and two transmission types. (a) How many distinct Buicks can be manufactured? (b) If one of the available colors is blue, how many different blue Buicks can be manufactured?
4. The board of directors of a pharmaceutical corporation has 10 members. An upcoming stockholders’ meeting is scheduled to approve a new slate of company officers (chosen from the 10 board members).
    a) How many different slates consisting of a president, vice president, secretary, and treasurer can the board present to the stockholders for their approval?
    b) Three members of the board of directors are physicians. How many slates from part (a) have (i) a physician nominated for the presidency? (ii) exactly one physician appearing on the slate? (iii) at least one physician appearing on the slate?
5. While on a Saturday shopping spree Jennifer and Tiffany witnessed two men driving away from the front of a jewelry shop, just before a burglar alarm started to sound. Although everything happened rather quickly, when the two young ladies were questioned they were able to give the police the following information about the license plate (which consisted of two letters followed by four digits) on the get-away car. Tiffany was sure that the second letter on the plate was either an O or a Q and the last digit was either a 3 or an 8. Jennifer told the investigator that the first letter on the plate was either a C or a G and that the first digit was definitely a 7. How many different license plates will the police have to check out?
6. To raise money for a new municipal pool, the chamber of commerce in a certain city sponsors a race. Each participant pays a $5 entrance fee and has a chance to win one of the different-sized trophies that are to be awarded to the first eight runners who finish.
    a) If 30 people enter the race, in how many ways will it be possible to award the trophies?
    b) If Roberta and Candice are two participants in the race, in how many ways can the trophies be awarded with these two runners among the top three?
7. A certain “Burger Joint” advertises that a customer can have his or her hamburger with or without any or all of the following: catsup, mustard, mayonnaise, lettuce, tomato, onion, pickle, cheese, or mushrooms. How many different kinds of hamburger orders are possible?

8. Matthew works as a computer operator at a small university. One evening he finds that 12 computer programs have been submitted earlier that day for batch processing. In how many ways can Matthew order the processing of these programs if (a) there are no restrictions? (b) he considers four of the programs higher in priority than the other eight and wants to process those four first? (c) he first separates the programs into four of top priority, five of lesser priority, and three of least priority, and he wishes to process the 12 programs in such a way that the top-priority programs are processed first and the three programs of least priority are processed last?
9. Patter’s Pastry Parlor offers eight different kinds of pastry and six different kinds of muffins. In addition to bakery items one can purchase small, medium, or large containers of the following beverages: coffee (black, with cream, with sugar, or with cream and sugar), tea (plain, with cream, with sugar, with cream and sugar, with lemon, or with lemon and sugar), hot cocoa, and orange juice. When Carol comes to Patter’s, in how many ways can she order
    a) one bakery item and one medium-sized beverage for herself?
    b) one bakery item and one container of coffee for herself and one muffin and one container of tea for her boss, Ms. Didio?
    c) one piece of pastry and one container of tea for herself, one muffin and a container of orange juice for Ms. Didio, and one bakery item and one container of coffee for each of her two assistants, Mr. Talbot and Mrs. Gillis?
10. Pamela has 15 different books. In how many ways can she place her books on two shelves so that there is at least one book on each shelf? (Consider the books in each arrangement to be stacked one next to the other, with the first book on each shelf at the left of the shelf.)
11. Three small towns, designated by A, B, and C, are interconnected by a system of two-way roads, as shown in Fig. 1.4.

[Figure 1.4]

    a) In how many ways can Linda travel from town A to town C?
    b) How many different round trips can Linda travel from town A to town C and back to town A?
    c) How many of the round trips in part (b) are such that the return trip (from town C to town A) is at least partially different from the route Linda takes from town A to town C? (For example, if Linda travels from town A to town C along roads R₁ and R₆, then on her return she might take roads R₆ and R₃, or roads R₇ and R₂, or road R₉, among other possibilities, but she does not travel on roads R₆ and R₁.)
12. List all the permutations for the letters a, c, t.
13. a) How many permutations are there for the eight letters a, c, f, g, i, t, w, x?
    b) Consider the permutations in part (a). How many start with the letter t? How many start with the letter t and end with the letter c?
14. Evaluate each of the following.
    a) P(7, 2)    b) P(8, 4)    c) P(10, 7)    d) P(12, 3)
15. In how many ways can the symbols a, b, c, d, e, e, e, e, e be arranged so that no e is adjacent to another e?
16. An alphabet of 40 symbols is used for transmitting messages in a communication system. How many distinct messages (lists of symbols) of 25 symbols can the transmitter generate if symbols can be repeated in the message? How many if 10 of the 40 symbols can appear only as the first and/or last symbols of the message, the other 30 symbols can appear anywhere, and repetitions of all symbols are allowed?
17. In the Internet each network interface of a computer is assigned one, or more, Internet addresses. The nature of these Internet addresses is dependent on network size. For the Internet Standard regarding reserved network numbers (STD 2), each address is a 32-bit string which falls into one of the following three classes: (1) A class A address, used for the largest networks, begins with a 0 which is then followed by a seven-bit network number, and then a 24-bit local address. However, one is restricted from using the network numbers of all 0’s or all 1’s and the local addresses of all 0’s or all 1’s. (2) The class B address is meant for an intermediate-sized network. This address starts with the two-bit string 10, which is followed by a 14-bit network number and then a 16-bit local address. But the local addresses of all 0’s or all 1’s are not permitted. (3) Class C addresses are used for the smallest networks. These addresses consist of the three-bit string 110, followed by a 21-bit network number, and then an eight-bit local address. Once again the local addresses of all 0’s or all 1’s are excluded. How many different addresses of each class are available on the Internet, for this Internet Standard?
18. Morgan is considering the purchase of a low-end computer system. After some careful investigating, she finds that there are seven basic systems (each consisting of a monitor, CPU, keyboard, and mouse) that meet her requirements. Furthermore, she

also plans to buy one of four modems, one of three CD ROM            24. Show that for all integers n, r > 0, ifn +1 > r, then
drives, and one of six printers. (Here each peripheral device of                                                  n+1
a given type, such as the modem,    is compatible with all seven                   Pnvin=(*1.)                                        P(n,r).

basic systems.) In how many ways can Morgan configure her
                                                                     25, Find the value(s) of n in each of the following:
low-end computer system?
                                                                     (a) P(n, 2) = 90, (b) P(n, 3) = 3P(n, 2), and
19, Acomputer science professor has seven different program-         (c) 2P(n, 2) +50 = P(2n, 2).
ming books on a bookshelf. Three of the books deal with C++,
the other four with Java. In how many ways can the professor         26. How many different paths in the xy-plane are there from
arrange these books on the shelf (a) if there are no restrictions?   (0, 0) to (7, 7) if a path proceeds one step at a time by go-
(b) if the languages should alternate? (c) if all the C++ books      ing either one space to the right (R) or one space upward (U)?
must be next to each other? (d) if all the C++ books must be         How many such paths are there from (2, 7) to (9, 14)? Can any
next to each other and all the Java books must be next to each       general statement be made that incorporates these two results?
other?
                                                                     27. a) How many distinct paths are there from (−1, 2, 0) to
20. Over the Internet, data are transmitted in structured blocks
                                                                           (1, 3, 7) in Euclidean three-space if each move is one of
of bits called datagrams.
                                                                           the following types?
    a) In how many ways can the letters in DATAGRAM be
    arranged?                                                                  (H): (x, y, z) → (x + 1, y, z);
    b) For the arrangements of part (a), how many have all                     (V): (x, y, z) → (x, y + 1, z);
    three A’s together?                                                        (A): (x, y, z) → (x, y, z + 1)
21. a) How many arrangements are there of all the letters in               b) How many such paths are there from (1, 0, 5) to
    SOCIOLOGICAL?                                                          (8, 1, 7)?
    b) In how many of the arrangements in part (a) are A and               c) Generalize the results in parts (a) and (b).
    G adjacent?
                                                                     28. a) Determine the value of the integer variable counter
    c) In how many of the arrangements in part (a) are all the             after execution of the following program segment. (Here i,
    vowels adjacent?                                                       j, and k are integer variables.)
22. How many positive integers n can we form using the digits
                                                                               counter := 0
3, 4, 4, 5, 5, 6, 7 if we want n to exceed 5,000,000?
                                                                               for i := 1 to 12 do
23. Twelve clay targets (identical in shape) are arranged in four                  counter := counter + 1
hanging columns, as shown in Fig. 1.5. There are four red                      for j := 5 to 10 do
targets in the first column, three white ones in the second column,                counter := counter + 2
two green targets in the third column, and three blue ones in                  for k := 15 downto 8 do
the fourth column. To join her college drill team, Deborah must                    counter := counter + 3
break all 12 of these targets (using her pistol and only 12 bul-
lets) and in so doing must always break the existing target at             b) Which counting principle is at play in part (a)?
the bottom of a column. Under these conditions, in how many
                                                                     29. Consider the following program segment where i, j, and k
different orders can Deborah shoot down (and break) the 12
                                                                     are integer variables.
targets?
                                                                               for i := 1 to 12 do
                                                                                   for j := 5 to 10 do
Figure 1.5                                                                             for k := 15 downto 8 do
                                                                                           print (i - j)*k

a) How many times is the print statement executed?
                                                                           b) Which counting principle is used in part (a)?
                                                                     30. A sequence of letters of the form abcba, where the expres-
                                                                     sion is unchanged upon reversing order, is an example of a
                                                                     palindrome (of five letters). (a) If a letter may appear more than
                                                                     twice, how many palindromes of five letters are there? of six
                                                                     letters? (b) Repeat part (a) under the condition that no letter
                                                                     appears more than twice.
14             Chapter 1 Fundamental Principles of Counting

[Panels (a), (b), and (c): three seatings of A, B, ..., H about a square table]

Figure 1.6

31. Determine the number of six-digit integers (no leading ze-                    b) If two of the people insist on sitting next to each other,
ros) in which (a) no digit may be repeated; (b) digits may be                     how many arrangements are possible?
repeated. Answer parts (a) and (b) with the extra condition that             36. a) In how many ways can eight people, denoted A,
the six-digit integer is (i) even; (ii) divisible by 5; (iii) divisible          B,..., H be seated about the square table shown in Fig.
by 4.
                                                                                 1.6, where Figs. 1.6(a) and 1.6(b) are considered the same
32. a) Provide a combinatorial argument to show that if n and                    but are distinct from Fig. 1.6(c)?
     k are positive integers with n = 3k, then n!/(3!)^k is an                    b) If two of the eight people, say A and B, do not get along
     integer.                                                                     well, how many different seatings are possible with A and
     b) Generalize the result of part (a).                                        B not sitting next to each other?

33. a) In how many possible ways could a student answer a                   37.   Sixteen people are to be seated at two circular tables, one
    10-question true-false test?                                            of which seats 10 while the other seats six. How many different
                                                                            seating arrangements are possible?
     b) In how many ways can the student answer the test in
     part (a) if it is possible to leave a question unanswered in           38. A committee of 15 —nine women and six men— is to be
     order to avoid an extra penalty for a wrong answer?                    seated at a circular table (with 15 seats). In how many ways can
                                                                            the seats be assigned so that no two men are seated next to each
34. How many distinct four-digit integers can one make from
                                                                            other?
the digits 1, 3, 3, 7, 7, and 8?
                                                                            39. Write a computer program (or develop an algorithm)
35. a) In how many ways can seven people be arranged about                  to determine whether there is a three-digit integer
    a circular table?                                                       abc (= 100a + 10b + c) where abc = a! + b! + c!.
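
One way to approach Exercise 39 is an exhaustive search over the 900 three-digit integers. The sketch below is only an illustration; the function name `factorial_digit_sum_matches` is an assumption of this example and does not come from the text.

```python
from math import factorial


def factorial_digit_sum_matches():
    """Return every three-digit integer abc with abc == a! + b! + c!."""
    matches = []
    for a in range(1, 10):          # leading digit of a three-digit integer is nonzero
        for b in range(10):
            for c in range(10):
                value = 100 * a + 10 * b + c
                if value == factorial(a) + factorial(b) + factorial(c):
                    matches.append(value)
    return matches


print(factorial_digit_sum_matches())
```

By the rule of product the three nested loops examine 9 × 10 × 10 = 900 candidates, so the search finishes immediately.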

1.3
     Combinations: The Binomial Theorem
                                   The standard deck of playing cards consists of 52 cards comprising four suits: clubs, di-
                                   amonds, hearts, and spades. Each suit has 13 cards: ace, 2, 3, ... , 9, 10, jack, queen,
                                   king. If we are asked to draw three cards from a standard deck, in succession and without
                                   replacement, then by the rule of product there are

$$52 \times 51 \times 50 = \frac{52!}{49!} = P(52, 3)$$
                                  possibilities, one of which is AH (ace of hearts), 9C (nine of clubs), KD (king of dia-
                                  monds). If instead we simply select three cards at one time from the deck so that the order
                                  of selection of the cards is no longer important, then the six permutations AH-9C-KD,
                                  AH-KD-9C, 9C-AH-KD, 9C-KD-AH, KD-9C-AH, and KD-AH-9C all correspond to
                                  just one (unordered) selection. Consequently, each selection, or combination, of three cards,
                                  with no reference to order, corresponds to 3! permutations of three cards. In equation form

this translates into

$$\begin{aligned}
(3!) \times (\text{Number of selections of size 3 from a deck of 52})
  &= \text{Number of permutations of size 3 for the 52 cards} \\
  &= P(52, 3) = \frac{52!}{49!}.
\end{aligned}$$
                  Consequently, three cards can be drawn, without replacement, from a standard deck in
               52!/(3! 49!) = 22,100 ways.
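
The relationship between ordered draws and unordered selections can be checked numerically. The following is a minimal sketch using Python's standard `math` and `itertools` modules; the five-card `small_deck` is a hypothetical miniature deck chosen only to keep the enumeration short.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

# Full deck: ordered draws versus unordered selections of three cards.
ordered = perm(52, 3)        # 52 * 51 * 50
unordered = comb(52, 3)      # 52! / (3! 49!)
assert ordered == factorial(3) * unordered
print(ordered, unordered)    # 132600 22100

# Miniature deck (hypothetical): enumerate and group the ordered draws.
small_deck = ["AH", "9C", "KD", "2S", "7D"]
groups = {}
for draw in permutations(small_deck, 3):
    groups.setdefault(frozenset(draw), []).append(draw)

# Every unordered selection arises from exactly 3! = 6 ordered draws.
assert len(groups) == len(list(combinations(small_deck, 3)))
assert all(len(draws) == factorial(3) for draws in groups.values())
```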

If we start with n distinct objects, each selection, or combination, of r of these objects,
                 with no reference to order, corresponds to r! permutations of size r from the n objects.
                 Thus the number of combinations of size r from a collection of size n is
$$C(n, r) = \frac{P(n, r)}{r!} = \frac{n!}{r!\,(n - r)!}, \qquad 0 \le r \le n.$$

In addition to C(n, r) the symbol $\binom{n}{r}$ is also frequently used. Both C(n, r) and $\binom{n}{r}$ are
               sometimes read “n choose r.” Note that for all n ≥ 0, C(n, 0) = C(n, n) = 1. Further, for
               all n ≥ 1, C(n, 1) = C(n, n − 1) = n. When 0 ≤ n < r, then C(n, r) = $\binom{n}{r}$ = 0.
                  A word to the wise! When dealing with any counting problem, we should ask ourselves
               about the importance of order in the problem. When order is relevant, we think in terms
               of permutations and arrangements and the rule of product. When order is not relevant,
               combinations could play a key role in solving the problem.
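
As a small illustration of the formula for C(n, r) and its boundary cases, the sketch below compares the factorial formula with a direct enumeration of selections. The helper name `C` mirrors the text's notation; treating r > n as giving 0 follows the convention stated above.

```python
from itertools import combinations
from math import factorial


def C(n, r):
    """Number of selections of size r from n distinct objects."""
    if r < 0 or r > n:
        return 0                      # no selection of size r exists
    return factorial(n) // (factorial(r) * factorial(n - r))


# Compare the formula with an explicit listing of the selections.
for n in range(0, 8):
    for r in range(0, 10):
        listed = sum(1 for _ in combinations(range(n), r))
        assert C(n, r) == listed

print(C(20, 11))   # 167960
```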

EXAMPLE 1.18
A hostess is having a dinner party for some members of her charity committee. Because
               of the size of her home, she can invite only 11 of the 20 committee members. Order is not
               important, so she can invite “the lucky 11” in C(20, 11) = $\binom{20}{11}$ = 20!/(11! 9!) = 167,960
               ways. However, once the 11 arrive, how she arranges them around her rectangular dining
               table is an arrangement problem. Unfortunately, no part of the theory of combinations and
               permutations can help our hostess deal with “the offended nine” who were not invited.

EXAMPLE 1.19
Lynn and Patti decide to buy a PowerBall ticket. To win the grand prize for PowerBall
               one must match five numbers selected from 1 to 49 inclusive and then must also match
               the powerball, an integer from 1 to 42 inclusive. Lynn selects the five numbers (between
               1 and 49 inclusive). This she can do in $\binom{49}{5}$ ways (since matching does not involve order).
               Meanwhile Patti selects the powerball; here there are $\binom{42}{1}$ possibilities. Consequently, by
               the rule of product, Lynn and Patti can select the six numbers for their PowerBall ticket in
               $\binom{49}{5}\binom{42}{1}$ = 80,089,128 ways.

EXAMPLE 1.20
a) A student taking a history examination is directed to answer any seven of 10 essay
                    questions. There is no concern about order here, so the student can answer the
                    examination in
$$\binom{10}{7} = \frac{10!}{7!\,3!} = \frac{10 \times 9 \times 8}{3 \times 2 \times 1} = 120 \text{ ways.}$$

b) If the student must answer three questions from the first five and four questions from
                                 the last five, three questions can be selected from the first five in $\binom{5}{3}$ = 10 ways, and
                                 the other four questions can be selected in $\binom{5}{4}$ = 5 ways. Hence, by the rule of product,
                                 the student can complete the examination in $\binom{5}{3}\binom{5}{4}$ = 10 × 5 = 50 ways.
                             c) Finally, should the directions on this examination indicate that the student must answer
                                seven of the 10 questions where at least three are selected from the first five, then there
                                are three cases to consider:
                                   i) The student answers three of the first five questions and four of the last five: By
                                      the rule of product this can happen in $\binom{5}{3}\binom{5}{4}$ = 10 × 5 = 50 ways, as in part (b).
                                  ii) Four of the first five questions and three of the last five questions are selected by
                                      the student: This can come about in $\binom{5}{4}\binom{5}{3}$ = 5 × 10 = 50 ways, again by the
                                      rule of product.
                                 iii) The student decides to answer all five of the first five questions and two of the
                                      last five: The rule of product tells us that this last case can occur in $\binom{5}{5}\binom{5}{2}$ =
                                      1 × 10 = 10 ways.

Combining the results for cases (i), (ii), and (iii), by the rule of sum we find that the
                           student can make $\binom{5}{3}\binom{5}{4} + \binom{5}{4}\binom{5}{3} + \binom{5}{5}\binom{5}{2}$ = 50 + 50 + 10 = 110 selections of seven (out
                           of 10) questions where each selection includes at least three of the first five questions.
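
The three counts in Example 1.20 can be confirmed by listing every selection of seven questions and filtering. This is only a verification sketch; numbering the questions 1 through 10 and the helper name `from_first_five` are assumptions of the illustration.

```python
from itertools import combinations
from math import comb

questions = range(1, 11)              # questions numbered 1..10 (an assumption)
first_five = set(range(1, 6))


def from_first_five(chosen):
    """How many of the chosen questions lie among the first five."""
    return len(first_five.intersection(chosen))


selections = list(combinations(questions, 7))

part_a = len(selections)                                            # any seven of ten
part_b = sum(1 for s in selections if from_first_five(s) == 3)      # exactly three
part_c = sum(1 for s in selections if from_first_five(s) >= 3)      # at least three

# Rule of sum over the three cases i = 3, 4, 5 questions from the first five.
by_cases = sum(comb(5, i) * comb(5, 7 - i) for i in range(3, 6))

print(part_a, part_b, part_c, by_cases)   # 120 50 110 110
```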

EXAMPLE 1.21
                             a) At Rydell High School, the gym teacher must select nine girls from the junior and
                                 senior classes for a volleyball team. If there are 28 juniors and 25 seniors, she can
                                 make the selection in $\binom{53}{9}$ = 4,431,613,550 ways.
                             b) If two juniors and one senior are the best spikers and must be on the team, then the
                                 rest of the team can be chosen in $\binom{50}{6}$ = 15,890,700 ways.
                             c) For a certain tournament the team must comprise four juniors and five seniors. The
                                 teacher can select the four juniors in $\binom{28}{4}$ ways. For each of these selections she has
                                 $\binom{25}{5}$ ways to choose the five seniors. Consequently, by the rule of product, she can
                                 select her team in $\binom{28}{4}\binom{25}{5}$ = 1,087,836,750 ways for this particular tournament.
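
The three figures in Example 1.21 are large enough that a quick machine check is reassuring. A minimal sketch with `math.comb` (the variable names are illustrative only):

```python
from math import comb

juniors, seniors = 28, 25

# a) any nine girls from the 53 juniors and seniors
any_nine = comb(juniors + seniors, 9)

# b) three places are already taken, so six more are chosen from the other 50
rest_of_team = comb(juniors + seniors - 3, 9 - 3)

# c) four juniors and five seniors, combined by the rule of product
four_and_five = comb(juniors, 4) * comb(seniors, 5)

print(any_nine, rest_of_team, four_and_five)
```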

Some problems can be treated from the viewpoint of either arrangements or combina-
                           tions, depending on how one analyzes the situation. The following example demonstrates
                           this.

EXAMPLE 1.22
The gym teacher of Example 1.21 must make up four volleyball teams of nine girls each
                           from the 36 freshman girls in her P.E. class. In how many ways can she select these four
                           teams? Call the teams A, B, C, and D.

a) To form team A, she can select any nine girls from the 36 enrolled in $\binom{36}{9}$ ways. For
                                 team B the selection process yields $\binom{27}{9}$ possibilities. This leaves $\binom{18}{9}$ and $\binom{9}{9}$ possible
                                 ways to select teams C and D, respectively. So by the rule of product, the four teams
                                 can be chosen in

$$\binom{36}{9}\binom{27}{9}\binom{18}{9}\binom{9}{9} = \left(\frac{36!}{9!\,27!}\right)\left(\frac{27!}{9!\,18!}\right)\left(\frac{18!}{9!\,9!}\right)\left(\frac{9!}{9!\,0!}\right) = \frac{36!}{9!\,9!\,9!\,9!} \approx 2.145 \times 10^{19} \text{ ways.}$$

b) For an alternative solution, consider the 36 students lined up as follows:

        1st        2nd        3rd        ...        35th       36th
      student    student    student               student    student

To select the four teams, we must distribute nine A’s, nine B’s, nine C’s, and nine D’s in
               the 36 spaces. The number of ways in which this can be done is the number of arrangements
               of 36 letters comprising nine each of A, B, C, and D. This is now the familiar problem of
               arrangements of nondistinct objects, and the answer is
$$\frac{36!}{9!\,9!\,9!\,9!}, \quad \text{as in part (a).}$$
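
The agreement between the two viewpoints of Example 1.22 can be tested directly. The sketch below assumes the class size is a multiple of the team size; the three function names are hypothetical and serve only this illustration. The enumeration is run on a tiny case (six students, teams of two), because listing all line-ups for 36 students is out of reach.

```python
from itertools import permutations
from math import comb, factorial


def teams_by_successive_choice(total, size):
    """Part (a): choose team A, then team B from those left, and so on."""
    ways, remaining = 1, total
    while remaining > 0:
        ways *= comb(remaining, size)
        remaining -= size
    return ways


def teams_by_arrangement(total, size):
    """Part (b): arrangements of `size` copies of each team letter."""
    return factorial(total) // factorial(size) ** (total // size)


def teams_by_enumeration(total, size):
    """Brute force for small cases: distinct line-ups of the team letters."""
    labels = "ABCDEFGH"[: total // size]
    line_up = "".join(label * size for label in labels)
    return len(set(permutations(line_up)))


assert teams_by_successive_choice(36, 9) == teams_by_arrangement(36, 9)
assert teams_by_enumeration(6, 2) == teams_by_arrangement(6, 2)
print(f"{teams_by_arrangement(36, 9):.3e}")
```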

Our next example points out how some problems require the concepts of both arrange-
               ments and combinations for their solutions.

EXAMPLE 1.23
The number of arrangements of the letters in TALLAHASSEE is
$$\frac{11!}{3!\,2!\,2!\,2!\,1!\,1!} = 831,600.$$
               How many of these arrangements have no adjacent A’s?
                  When we disregard the A’s, there are
$$\frac{8!}{2!\,2!\,2!\,1!\,1!} = 5040$$
               ways to arrange the remaining letters. One of these 5040 ways is shown in the following
               figure, where the arrows indicate nine possible locations for the three A’s.

                    E   E   S   T   L   L   S   H
                  ↑   ↑   ↑   ↑   ↑   ↑   ↑   ↑   ↑

               Three of these locations can be selected in $\binom{9}{3}$ = 84 ways, and because this is also possible
               for all the other 5039 arrangements of E, E, S, T, L, L, S, H, by the rule of product there
               are 5040 × 84 = 423,360 arrangements of the letters in TALLAHASSEE with no consecutive
               A’s.
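
The gap argument of Example 1.23 can be cross-checked against an independent count that builds each arrangement one letter at a time and never places an A directly after an A. This is a sketch for verification only; both helper names are assumptions of the illustration, not notation from the text.

```python
from collections import Counter
from functools import lru_cache
from math import comb, factorial


def multiset_arrangements(word):
    """Arrangements of the letters of word, with repeated letters identical."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)
    return total


def arrangements_without_adjacent(word, letter):
    """Count arrangements of word in which no two copies of letter are adjacent."""
    kinds = sorted(set(word))
    target = kinds.index(letter)
    start = tuple(word.count(kind) for kind in kinds)

    @lru_cache(maxsize=None)
    def extend(remaining, last_was_target):
        if sum(remaining) == 0:
            return 1                      # every letter has been placed
        total = 0
        for i, count in enumerate(remaining):
            if count == 0 or (i == target and last_was_target):
                continue                  # letter used up, or it would touch itself
            reduced = remaining[:i] + (count - 1,) + remaining[i + 1:]
            total += extend(reduced, i == target)
        return total

    return extend(start, False)


word = "TALLAHASSEE"
others = word.replace("A", "")            # the eight letters that are not A
gap_method = multiset_arrangements(others) * comb(len(others) + 1, word.count("A"))
direct_count = arrangements_without_adjacent(word, "A")
print(multiset_arrangements(word), gap_method, direct_count)
```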

Before proceeding we need to introduce a concise way of writing the sum of a list of
               n + 1 terms like $a_m, a_{m+1}, a_{m+2}, \ldots, a_{m+n}$, where m and n are integers and n ≥ 0. This
               notation is called the Sigma notation because it involves the capital Greek letter Σ; we use
               it to represent a summation by writing

$$a_m + a_{m+1} + a_{m+2} + \cdots + a_{m+n} = \sum_{i=m}^{m+n} a_i.$$

Here, the letter i is called the index of the summation, and this index accounts for all
               integers starting with the lower limit m and continuing on up to (and including) the upper
               limit m + n.
                  We may use this notation as follows.
                  1) $\sum_{i=3}^{7} a_i = a_3 + a_4 + a_5 + a_6 + a_7 = \sum_{j=3}^{7} a_j$, for there is nothing special about the
                     letter i.

                  2) $\sum_{i=1}^{4} i^2 = 1^2 + 2^2 + 3^2 + 4^2 = 30 = \sum_{k=0}^{4} k^2$, because $0^2 = 0$.

                  3) $\sum_{i=11}^{100} i^3 = 11^3 + 12^3 + 13^3 + \cdots + 100^3 = \sum_{j=12}^{101} (j - 1)^3 = \sum_{k=10}^{99} (k + 1)^3$.

                  4) $\sum_{i=7}^{10} 2i = 2(7) + 2(8) + 2(9) + 2(10) = 68 = 2(34) = 2(7 + 8 + 9 + 10) = 2\sum_{i=7}^{10} i$.

                  5) $\sum_{i=3}^{3} a_i = a_3 = \sum_{i=4}^{4} a_{i-1} = \sum_{i=2}^{2} a_{i+1}$.

                  6) $\sum_{i=1}^{5} a = a + a + a + a + a = 5a$.
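
Several of these identities translate directly into Python's built-in `sum`. In this sketch the only thing to watch is that `range(m, n + 1)` is needed to include the upper limit n; the value chosen for `a` is arbitrary.

```python
# Item 2: adding the k = 0 term changes nothing, because 0**2 == 0.
squares = sum(i**2 for i in range(1, 5))
assert squares == sum(k**2 for k in range(0, 5))

# Item 3: shifting the index of summation leaves the sum unchanged.
cubes = sum(i**3 for i in range(11, 101))
assert cubes == sum((j - 1)**3 for j in range(12, 102))
assert cubes == sum((k + 1)**3 for k in range(10, 100))

# Item 4: a constant factor can be moved outside the summation.
doubled = sum(2 * i for i in range(7, 11))
assert doubled == 2 * sum(i for i in range(7, 11))

# Item 6: summing a constant a over five index values gives 5a.
a = 3                                   # arbitrary illustrative value
constant_sum = sum(a for _ in range(1, 6))

print(squares, doubled, constant_sum)   # 30 68 15
```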

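Furthermore, using this summation notation, we see that one can express the answer to
                            part (c) of Example 1.20 as
$$\binom{5}{3}\binom{5}{4} + \binom{5}{4}\binom{5}{3} + \binom{5}{5}\binom{5}{2} = \sum_{i=3}^{5} \binom{5}{i}\binom{5}{7-i}.$$
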
We shall find use for this new notation in the following example and in many other places
                            throughout the remainder of this book.

EXAMPLE 1.24
In the studies of algebraic coding theory and the theory of computer languages, we consider
                            certain arrangements, called strings, made up from a prescribed alphabet of symbols. If the
                            prescribed alphabet consists of the symbols 0, 1, and 2, for example, then 01, 11, 21, 12,
                            and 20 are five of the nine strings of length 2. Among the 27 strings of length 3 are 000,
                            012, 202, and 110.
                               In general, if n is any positive integer, then by the rule of product there are $3^n$ strings of
                            length n for the alphabet 0, 1, and 2. If $x = x_1 x_2 x_3 \cdots x_n$ is one of these strings, we define the
                            weight of x, denoted wt(x), by $\mathrm{wt}(x) = x_1 + x_2 + x_3 + \cdots + x_n$. For example, wt(12) = 3
                            and wt(22) = 4 for the case where n = 2; wt(101) = 2, wt(210) = 3, and wt(222) = 6 for
                            n = 3.
                                Among the $3^{10}$ strings of length 10, we wish to determine how many have even weight.
                            Such a string has even weight precisely when the number of 1’s in the string is even.
                                There are six different cases to consider. If the string x contains no 1’s, then each of the
                            10 locations in x can be filled with either 0 or 2, and by the rule of product there are $2^{10}$ such
                            strings. When the string contains two 1’s, the locations for these two 1’s can be selected in
                            $\binom{10}{2}$ ways. Once these two locations have been specified, there are $2^8$ ways to place either 0
                            or 2 in the other eight positions. Hence there are $\binom{10}{2} 2^8$ strings of even weight that contain
                            two 1’s. The numbers of strings for the other four cases are given in Table 1.2.

Table 1.2
      Number of 1’s | Number of Strings    | Number of 1’s | Number of Strings
      4             | $\binom{10}{4} 2^6$  | 8             | $\binom{10}{8} 2^2$
      6             | $\binom{10}{6} 2^4$  | 10            | $\binom{10}{10}$

Consequently, by the rule of sum, the number of strings of length 10 that have even
                 weight is
$$2^{10} + \binom{10}{2}2^8 + \binom{10}{4}2^6 + \binom{10}{6}2^4 + \binom{10}{8}2^2 + \binom{10}{10} = \sum_{n=0}^{5} \binom{10}{2n} 2^{10-2n}.$$
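
Since there are only 3^10 = 59,049 strings of length 10, the case analysis of Example 1.24 can be compared with a direct enumeration. The sketch below is one possible check; the two function names are hypothetical.

```python
from itertools import product
from math import comb


def even_weight_by_enumeration(length):
    """List every string over {0, 1, 2} and keep those whose weight is even."""
    return sum(1 for s in product((0, 1, 2), repeat=length) if sum(s) % 2 == 0)


def even_weight_by_cases(length):
    """Rule of sum over the possible even numbers of 1's in the string."""
    return sum(comb(length, ones) * 2 ** (length - ones)
               for ones in range(0, length + 1, 2))


print(even_weight_by_enumeration(10), even_weight_by_cases(10))   # 29525 29525
```

The weight is even exactly when the number of 1’s is even because each 0 or 2 contributes an even amount to the sum.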

Often we must be careful of overcounting   —a situation that seems to arise in what
                 may appear to be rather easy enumeration problems. The next example demonstrates how
                 overcounting may come about.

EXAMPLE   1.25     a) Suppose that Ellen draws five cards from a standard deck of 52 cards. In how many
                      ways can her selection result in a hand with no clubs? Here we are interested in counting
                      all five-card selections such as
                        i)   ace of hearts, three of spades, four of spades,           six of diamonds,   and the jack of
                           diamonds.
                       ii) five of spades, seven of spades, ten of spades, seven of diamonds, and the king of
                           diamonds.
                      iii)   two of diamonds, three of diamonds, six of diamonds, ten of diamonds, and the
                             jack of diamonds.
                      If we examine this more closely we see that Ellen is restricted to selecting her five
                      cards from the 39 cards in the deck that are not clubs. Consequently, she can make her
                      selection in $\binom{39}{5}$ ways.
                  b) Now suppose we want to count the number of Ellen’s five-card selections that contain
                     at least one club. These are precisely the selections that were not counted in part (a).
                     And since there are $\binom{52}{5}$ possible five-card hands in total, we find that

$$\binom{52}{5} - \binom{39}{5} = 2,598,960 - 575,757 = 2,023,203$$

of all five-card hands contain at least one club.
                   c) Can we obtain the result in part (b) in another way? For example, since Ellen wants to
                     have at least one club in the five-card hand, let her first select a club. This she can do in
                      $\binom{13}{1}$ ways. And now she doesn’t care what comes up for the other four cards. So after
                      she eliminates the one club chosen from her standard deck, she can then select the
                      other four cards in $\binom{51}{4}$ ways. Therefore, by the rule of product, we count the number
                      of selections here as
$$\binom{13}{1}\binom{51}{4} = 13 \times 249,900 = 3,248,700.$$

Something here is definitely wrong! This answer is larger than that in part (b) by more
                     than one million hands. Did we make a mistake in part (b)? Or is something wrong
                     with our present reasoning?
                         For example, suppose that Ellen first selects

the three of clubs

and then selects
                                                                 the five of clubs,
                                                                   king of clubs,
                                                                seven of hearts, and
                                                                  jack of spades.

If, however, she first selects

the five of clubs

and then selects

the three of clubs,
                                                                     king of clubs,
                                                                  seven of hearts, and

jack of spades,

is her selection here really different from the prior selection we mentioned? Unfortu-
                            nately, no! And the case where she first selects

the king of clubs

and then follows this by selecting

the three of clubs,

five of clubs,
                                                                  seven of hearts, and

jack of spades

is not different from the other two selections mentioned earlier.
                                Consequently, this approach is wrong because we are overcounting, by considering
                            like selections as if they were distinct.
                        d) But is there any other way to arrive at the answer in part (b)? Yes! Since the five-card
                            hands must each contain at least one club, there are five cases to consider. These are
                            given in Table 1.3. From the results in Table 1.3 we see, for example, that there are
                            $\binom{13}{2}\binom{39}{3}$ five-card hands that contain exactly two clubs. If we are interested in having
                            exactly three clubs in the hand, then the results in the table indicate that there are
                            $\binom{13}{3}\binom{39}{2}$ such hands.

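Table 1.3

      Number   | Number of Ways      | Number of     | Number of Ways
      of Clubs | to Select This      | Cards That    | to Select This
               | Number of Clubs     | Are Not Clubs | Number of Nonclubs
      1        | $\binom{13}{1}$     | 4             | $\binom{39}{4}$
      2        | $\binom{13}{2}$     | 3             | $\binom{39}{3}$
      3        | $\binom{13}{3}$     | 2             | $\binom{39}{2}$
      4        | $\binom{13}{4}$     | 1             | $\binom{39}{1}$
      5        | $\binom{13}{5}$     | 0             | $\binom{39}{0}$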

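Since no two of the cases in Table 1.3 have any five-card hand in common, the number
              of hands that Ellen can select with at least one club is

$$\begin{aligned}
\binom{13}{1}\binom{39}{4} + \binom{13}{2}\binom{39}{3} + \binom{13}{3}\binom{39}{2} + \binom{13}{4}\binom{39}{1} + \binom{13}{5}\binom{39}{0}
  &= \sum_{i=1}^{5} \binom{13}{i}\binom{39}{5-i} \\
  &= (13)(82,251) + (78)(9139) + (286)(741) + (715)(39) + (1287)(1) \\
  &= 2,023,203.
\end{aligned}$$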
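
The three approaches of Example 1.25 can be set side by side numerically. The sketch below also shows one way to see where the extra hands in part (c) come from: a hand with k clubs is produced k times, once for each club that could have been "selected first." The miniature deck (four clubs, five other cards, hands of three) and the function name `at_least_one_club` are assumptions used only to allow a brute-force check.

```python
from itertools import combinations
from math import comb

by_complement = comb(52, 5) - comb(39, 5)                               # part (b)
by_cases = sum(comb(13, i) * comb(39, 5 - i) for i in range(1, 6))      # part (d)
club_first = comb(13, 1) * comb(51, 4)                                  # part (c)

# Each hand with k clubs is counted k times by the club-first procedure.
weighted = sum(k * comb(13, k) * comb(39, 5 - k) for k in range(1, 6))
assert club_first == weighted

print(by_complement, by_cases, club_first)   # 2023203 2023203 3248700


def at_least_one_club(clubs, others, hand_size):
    """Brute-force count on a small deck of `clubs` clubs and `others` nonclubs."""
    deck = [("club", i) for i in range(clubs)] + [("other", i) for i in range(others)]
    return sum(1 for hand in combinations(deck, hand_size)
               if any(suit == "club" for suit, _ in hand))


small_true = at_least_one_club(4, 5, 3)
small_club_first = comb(4, 1) * comb(8, 2)   # the same overcount on the small deck
```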

We shall close this section with three results related to the concept of combinations.
                   First we note that for integers n, r, with n ≥ r ≥ 0, $\binom{n}{r} = \binom{n}{n-r}$. This can be established
              algebraically from the formula for $\binom{n}{r}$, but we prefer to observe that when dealing with
              a selection of size r from a collection of n distinct objects, the selection process leaves
              behind n − r objects. Consequently, $\binom{n}{r} = \binom{n}{n-r}$ affirms the existence of a correspondence
              between the selections of size r (objects chosen) and the selections of size n — r (objects
              left behind). An example of this correspondence is shown in Table 1.4, where n = 5,r = 2,
              and the distinct objects are 1, 2, 3, 4, and 5. This type of correspondence will be more
              formally defined in Chapter 5 and used in other counting situations.

Table 1.4

Selections of Size r = 2            Selections of Size n − r = 3
(Objects Chosen)                    (Objects Left Behind)

 1.  1, 2        6.  2, 4            1.  3, 4, 5        6.  1, 3, 5
 2.  1, 3        7.  2, 5            2.  2, 4, 5        7.  1, 3, 4
 3.  1, 4        8.  3, 4            3.  2, 3, 5        8.  1, 2, 5
 4.  1, 5        9.  3, 5            4.  2, 3, 4        9.  1, 2, 4
 5.  2, 3       10.  4, 5            5.  1, 4, 5       10.  1, 2, 3
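The correspondence in Table 1.4 can be generated directly. This is an illustrative sketch only; the names `objects` and `pairs` are ours, and we assume the objects are the integers 1 through 5 as in the table.

```python
from itertools import combinations

objects = {1, 2, 3, 4, 5}
r = 2

# Pair each selection of size r with the n - r objects it leaves behind.
pairs = [
    (chosen, tuple(sorted(objects - set(chosen))))
    for chosen in combinations(sorted(objects), r)
]

for chosen, left_behind in pairs:
    print(chosen, "<->", left_behind)
```

Each selection of size 3 appears exactly once on the right, which is why the two counts agree.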

Our second result is a theorem from our past experience in algebra.

THEOREM 1.1   The Binomial Theorem. If x and y are variables and n is a positive integer, then

\[
(x+y)^n = \binom{n}{0}x^0y^n + \binom{n}{1}x^1y^{n-1} + \binom{n}{2}x^2y^{n-2} + \cdots
+ \binom{n}{n-1}x^{n-1}y^1 + \binom{n}{n}x^ny^0 = \sum_{k=0}^{n}\binom{n}{k}x^ky^{n-k}.
\]

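Before considering the general proof, we examine a special case. If n = 4, the coefficient
of \(x^2y^2\) in the expansion of the product

\[
\underbrace{(x+y)}_{\text{1st factor}}\,\underbrace{(x+y)}_{\text{2nd factor}}\,\underbrace{(x+y)}_{\text{3rd factor}}\,\underbrace{(x+y)}_{\text{4th factor}}
\]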

is the number of ways in which we can select two x’s from the four x’s, one of which is
                           available in each factor. (Although the x’s are the same in appearance, we distinguish them
                           as the x in the first factor, the x in the second factor, ... , and the x in the fourth factor.
                           Also, we note that when we select two x’s, we use two factors, leaving us with two other
                           factors from which we can select the two y’s that are needed.) For example, among the
                           possibilities, we can select (1) x from the first two factors and y from the last two or (2) x
                           from the first and third factors and y from the second and fourth. Table 1.5 summarizes the
                           six possible selections.

Table 1.5

Factors Selected for x          Factors Selected for y

(1)   1, 2                      (1)   3, 4
(2)   1, 3                      (2)   2, 4
(3)   1, 4                      (3)   2, 3
(4)   2, 3                      (4)   1, 4
(5)   2, 4                      (5)   1, 3
(6)   3, 4                      (6)   1, 2

Consequently, the coefficient of \(x^2y^2\) in the expansion of \((x+y)^4\) is \(\binom{4}{2} = 6\), the number
of ways to select two distinct objects from a collection of four distinct objects.
Now we turn to the proof of the general case.
Proof: In the expansion of the product

\[
\underbrace{(x+y)}_{\text{1st factor}}\,\underbrace{(x+y)}_{\text{2nd factor}}\,\underbrace{(x+y)}_{\text{3rd factor}}\cdots\underbrace{(x+y)}_{\text{nth factor}}
\]

the coefficient of \(x^ky^{n-k}\), where 0 ≤ k ≤ n, is the number of different ways in which we
can select k x’s [and consequently (n − k) y’s] from the n available factors. (One way, for
example, is to choose x from the first k factors and y from the last n − k factors.) The total
number of such selections of size k from a collection of size n is \(C(n, k) = \binom{n}{k}\), and from
this the binomial theorem follows.

In view of this theorem, \(\binom{n}{k}\) is often referred to as a binomial coefficient. Notice that it
is also possible to express the result of Theorem 1.1 as

\[
(x+y)^n = \sum_{k=0}^{n}\binom{n}{n-k}x^ky^{n-k}.
\]
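The counting argument in the proof can be imitated by brute force: pick x or y from each of the n factors and tally how many picks use x exactly k times. This is a sketch under our own naming (`expansion_counts` is not from the text), practical only for small n because it visits all 2^n choices.

```python
from collections import Counter
from itertools import product
from math import comb

def expansion_counts(n):
    """For each k, count the ways to take x from exactly k of the n factors."""
    counts = Counter()
    for choice in product("xy", repeat=n):   # one letter picked from each factor
        counts[choice.count("x")] += 1
    return counts

n = 4
counts = expansion_counts(n)
for k in range(n + 1):
    print(k, counts[k], comb(n, k))
```

For n = 4 the tally for k = 2 matches the six selections listed in Table 1.5.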

EXAMPLE 1.26

a) From the binomial theorem it follows that the coefficient of \(x^5y^2\) in the expansion of
\((x+y)^7\) is \(\binom{7}{5} = \binom{7}{2} = 21\).
b) To obtain the coefficient of \(a^5b^2\) in the expansion of \((2a - 3b)^7\), replace 2a by x and
−3b by y. From the binomial theorem the coefficient of \(x^5y^2\) in \((x+y)^7\) is \(\binom{7}{5}\), and
\(\binom{7}{5}x^5y^2 = \binom{7}{5}(2a)^5(-3b)^2 = \binom{7}{5}(2)^5(-3)^2a^5b^2 = 6048a^5b^2\).
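The substitution used in part (b) can be checked in two ways: by the binomial theorem directly, and by multiplying the seven factors out. The helper names below are hypothetical, chosen for this sketch only.

```python
from math import comb

def binomial_term_coefficient(p, q, n, k):
    """Coefficient of a^k b^(n-k) in (p*a + q*b)^n, via the binomial theorem."""
    return comb(n, k) * p**k * q**(n - k)

def expand_linear_power(p, q, n):
    """Expand (p*a + q*b)^n by repeated multiplication.

    Returns a dict mapping the power of a to its coefficient
    (the power of b is then n minus that key).
    """
    coeffs = {0: 1}
    for _ in range(n):
        nxt = {}
        for i, c in coeffs.items():
            nxt[i + 1] = nxt.get(i + 1, 0) + c * p   # take p*a from this factor
            nxt[i] = nxt.get(i, 0) + c * q           # take q*b from this factor
        coeffs = nxt
    return coeffs

print(binomial_term_coefficient(2, -3, 7, 5))
print(expand_linear_power(2, -3, 7)[5])
```

Both calls should report the same coefficient of a^5 b^2.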

COROLLARY 1.1     For each integer n > 0,

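a) \(\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n\), and
b) \(\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n\binom{n}{n} = 0\).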
                  Proof: Part (a) follows from the binomial theorem when we set x = y = 1. When x = —1
                  and y = 1, part (b) results.
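Both parts of the corollary are easy to spot-check numerically. The sketch below uses function names of our own choosing and tests only finitely many n, so it illustrates the result rather than proving it.

```python
from math import comb

def row_sum(n):
    """Sum of the binomial coefficients C(n, 0) + C(n, 1) + ... + C(n, n)."""
    return sum(comb(n, k) for k in range(n + 1))

def alternating_row_sum(n):
    """Alternating sum C(n, 0) - C(n, 1) + ... + (-1)^n C(n, n)."""
    return sum((-1) ** k * comb(n, k) for k in range(n + 1))

for n in range(1, 8):
    print(n, row_sum(n), alternating_row_sum(n))
```

Note that part (b) needs n > 0: for n = 0 the alternating sum is the single term 1.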

Our third and final result generalizes the binomial theorem and is called the multinomial
                  theorem.

THEOREM 1.2   For positive integers n, t, the coefficient of \(x_1^{n_1}x_2^{n_2}x_3^{n_3}\cdots x_t^{n_t}\) in the expansion of
\((x_1 + x_2 + x_3 + \cdots + x_t)^n\) is

\[
\frac{n!}{n_1!\,n_2!\,n_3!\cdots n_t!},
\]

where each \(n_i\) is an integer with \(0 \le n_i \le n\), for all \(1 \le i \le t\), and
\(n_1 + n_2 + n_3 + \cdots + n_t = n\).
Proof: As in the proof of the binomial theorem, the coefficient of \(x_1^{n_1}x_2^{n_2}x_3^{n_3}\cdots x_t^{n_t}\) is the
number of ways we can select \(x_1\) from \(n_1\) of the n factors, \(x_2\) from \(n_2\) of the \(n - n_1\) remaining
factors, \(x_3\) from \(n_3\) of the \(n - n_1 - n_2\) now remaining factors, ..., and \(x_t\) from \(n_t\) of the
last \(n - n_1 - n_2 - n_3 - \cdots - n_{t-1} = n_t\) remaining factors. This can be carried out, as in
part (a) of Example 1.22, in

\[
\binom{n}{n_1}\binom{n-n_1}{n_2}\binom{n-n_1-n_2}{n_3}\cdots\binom{n-n_1-n_2-n_3-\cdots-n_{t-1}}{n_t}
\]

ways. We leave to the reader the details of showing that this product is equal to

\[
\frac{n!}{n_1!\,n_2!\,n_3!\cdots n_t!},
\]

which is also written as

\[
\binom{n}{n_1, n_2, n_3, \ldots, n_t}
\]

and is called a multinomial coefficient. (When t = 2 this reduces to a binomial coefficient.)
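The equality left to the reader can at least be checked numerically. The following sketch computes a multinomial coefficient both ways: from the factorial formula, and as the product of binomial coefficients used in the proof. Both function names are hypothetical, and `math.prod` assumes Python 3.8 or later.

```python
from math import comb, factorial, prod

def multinomial_by_factorials(parts):
    """n! / (n_1! n_2! ... n_t!) where n is the sum of the parts."""
    n = sum(parts)
    return factorial(n) // prod(factorial(p) for p in parts)

def multinomial_by_binomials(parts):
    """Product C(n, n_1) * C(n - n_1, n_2) * ..., as in the proof."""
    remaining = sum(parts)
    result = 1
    for p in parts:
        result *= comb(remaining, p)   # choose which remaining factors supply this variable
        remaining -= p
    return result

print(multinomial_by_factorials((2, 2, 3)), multinomial_by_binomials((2, 2, 3)))
```

With two parts, either function reduces to an ordinary binomial coefficient.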

EXAMPLE 1.27

a) In the expansion of \((x+y+z)^7\) it follows from the multinomial theorem that the
coefficient of \(x^2y^2z^3\) is \(\binom{7}{2,2,3} = \frac{7!}{2!\,2!\,3!} = 210\), while the coefficient of \(xyz^5\) is
\(\binom{7}{1,1,5} = 42\) and that of \(x^3z^4\) is \(\binom{7}{3,0,4} = \frac{7!}{3!\,0!\,4!} = 35\).
b) Suppose we need to know the coefficient of \(a^2b^3c^2d^5\) in the expansion of
\((a + 2b - 3c + 2d + 5)^{16}\). If we replace a by v, 2b by w, −3c by x, 2d by y, and
5 by z, then we can apply the multinomial theorem to \((v+w+x+y+z)^{16}\)
and determine the coefficient of \(v^2w^3x^2y^5z^4\) as \(\binom{16}{2,3,2,5,4} = 302{,}702{,}400\). But

\[
\binom{16}{2,3,2,5,4}(a)^2(2b)^3(-3c)^2(2d)^5(5)^4 = \binom{16}{2,3,2,5,4}(1)^2(2)^3(-3)^2(2)^5(5)^4(a^2b^3c^2d^5)
= 435{,}891{,}456{,}000{,}000\,a^2b^3c^2d^5.
\]
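The numbers in this example are large enough to be worth checking by machine. A minimal sketch follows; the helper `multinomial` and the tuples `exponents` and `scalars` are our own names, listing the powers and numerical multipliers of a, 2b, −3c, 2d, and the constant 5 in that order.

```python
from math import factorial, prod

def multinomial(parts):
    """n! / (n_1! n_2! ... n_t!) where n is the sum of the parts."""
    return factorial(sum(parts)) // prod(factorial(p) for p in parts)

exponents = (2, 3, 2, 5, 4)    # powers of v, w, x, y, z (they sum to 16)
scalars = (1, 2, -3, 2, 5)     # numerical factors carried by a, 2b, -3c, 2d, 5

base = multinomial(exponents)
coefficient = base * prod(s ** e for s, e in zip(scalars, exponents))

print(base)
print(coefficient)
```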

EXERCISES 1.3

1. Calculate \(\binom{6}{2}\) and check your answer by listing all the selections of size 2 that can be
made from the letters a, b, c, d, e, and f.

2. Facing a four-hour bus trip back to college, Diane decides to take along five magazines from
the 12 that her sister Ann Marie has recently acquired. In how many ways can Diane make her
selection?

3. Evaluate each of the following.
     a) C(10, 4)          b) \(\binom{12}{7}\)          c) C(14, 12)          d) \(\binom{15}{10}\)

4. In the Braille system a symbol, such as a lowercase letter, punctuation mark, suffix, and so
on, is given by raising at least one of the dots in the six-dot arrangement shown in part (a) of
Fig. 1.7. (The six Braille positions are labeled in this part of the figure.) For example, in
part (b) of the figure the dots in positions 1 and 4 are raised and this six-dot arrangement
represents the letter c. In parts (c) and (d) of the figure we have the representations for the
letters m and t, respectively. The definite article “the” is shown in part (e) of the figure,
while part (f) contains the form for the suffix “ow.” Finally, the semicolon, ;, is given by the
six-dot arrangement in part (g), where the dots at positions 2 and 3 are raised.

Figure 1.7   [Six-dot Braille arrangements: (a) positions labeled 1, 2, 3 down the left column and
4, 5, 6 down the right; (b) “c”; (c) “m”; (d) “t”; (e) “the”; (f) “ow”; (g) “;”]

     a) How many different symbols can we represent in the Braille system?
     b) How many symbols have exactly three raised dots?
     c) How many symbols have an even number of raised dots?

5. a) How many permutations of size 3 can one produce with the letters m, r, a, f, and t?
     b) List all the combinations of size 3 that result for the letters m, r, a, f, and t.

6. If n is a positive integer and n > 1, prove that \(\binom{n}{2} + \binom{n-1}{2}\) is a perfect square.

7. A committee of 12 is to be selected from 10 men and 10 women. In how many ways can the
selection be carried out if (a) there are no restrictions? (b) there must be six men and six
women? (c) there must be an even number of women? (d) there must be more women than men?
(e) there must be at least eight men?

8. In how many ways can a gambler draw five cards from a standard deck and get (a) a flush (five
cards of the same suit)? (b) four aces? (c) four of a kind? (d) three aces and two jacks?
(e) three aces and a pair? (f) a full house (three of a kind and a pair)? (g) three of a kind?
(h) two pairs?

9. How many bytes contain (a) exactly two 1’s; (b) exactly four 1’s; (c) exactly six 1’s;
(d) at least six 1’s?

10. How many ways are there to pick a five-person basketball team from 12 possible players? How
many selections include the weakest and the strongest players?

11. A student is to answer seven out of 10 questions on an examination. In how many ways can he
make his selection if (a) there are no restrictions? (b) he must answer the first two questions?
(c) he must answer at least four of the first six questions?

12. In how many ways can 12 different books be distributed among four children so that (a) each
child gets three books? (b) the two oldest children get four books each and the two youngest get
two books each?

13. How many arrangements of the letters in MISSISSIPPI have no consecutive S’s?

14. A gym coach must select 11 seniors to play on a football team. If he can make his selection
in 12,376 ways, how many seniors are eligible to play?

15. a) Fifteen points, no three of which are collinear, are given on a plane. How many lines do
     they determine?
     b) Twenty-five points, no four of which are coplanar, are given in space. How many triangles
     do they determine? How many planes? How many tetrahedra (pyramidlike solids with four
     triangular faces)?

16. Determine the value of each of the following summations.
     a) \(\sum_{i=1}^{6}(i^2+1)\)          b) \(\sum_{j=-2}^{2}(j^3-1)\)          c) \(\sum_{i=0}^{10}[1+(-1)^i]\)
     d) \(\sum_{k=n}^{2n}(-1)^k\), where n is an odd positive integer
     e) \(\sum_{i=1}^{6} i(-1)^i\)

17. Express each of the following using the summation (or Sigma) notation. In parts (a), (d),
and (e), n denotes a positive integer.
     a) \(\frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots + \frac{1}{n!}\), n ≥ 2

     b) 1 + 4 + 9 + 16 + 25 + 36 + 49
     c) \(1^3 - 2^3 + 3^3 - 4^3 + 5^3 - 6^3 + 7^3\)
     d) \(\frac{1}{n} + \frac{2}{n+1} + \frac{3}{n+2} + \cdots + \frac{n+1}{2n}\)
     e) \(n - \left(\frac{n+1}{2!}\right) + \left(\frac{n+2}{4!}\right) - \left(\frac{n+3}{6!}\right) + \cdots + (-1)^n\left(\frac{2n}{(2n)!}\right)\)

18. For the strings of length 10 in Example 1.24, how many have (a) four 0’s, three 1’s, and
three 2’s; (b) at least eight 1’s; (c) weight 4?

19. Consider the collection of all strings of length 10 made up from the alphabet 0, 1, 2, and 3.
How many of these strings have weight 3? How many have weight 4? How many have even weight?

20. In the three parts of Fig. 1.8, eight points are equally spaced and marked on the
circumference of a given circle.

Figure 1.8

     a) For parts (a) and (b) of Fig. 1.8 we have two different (though congruent) triangles.
     These two triangles (distinguished by their vertices) result from two selections of size
     3 from the vertices A, B, C, D, E, F, G, H. How many different (whether congruent or not)
     triangles can we inscribe in the circle in this way?
     b) How many different quadrilaterals can we inscribe in the circle, using the marked
     vertices? [One such quadrilateral appears in part (c) of Fig. 1.8.]
     c) How many different polygons of three or more sides can we inscribe in the given circle
     by using three or more of the marked vertices?

21. How many triangles are determined by the vertices of a regular polygon of n sides? How many
if no side of the polygon is to be a side of any triangle?

22. a) In the complete expansion of (a + b + c + d)(e + f + g + h)(u + v + w + x + y + z) one
     obtains the sum of terms such as agw, cfx, and dgv. How many such terms appear in this
     complete expansion?
     b) Which of the following terms do not appear in the complete expansion from part (a)?
          i) afx          ii) bux          iii) chz
         iv) egw           v) egu           vi) dfz

23. Determine the coefficient of \(x^9y^3\) in the expansions of (a) \((x+y)^{12}\), (b) \((x+2y)^{12}\), and
(c) \((2x-3y)^{12}\).

24. Complete the details in the proof of the multinomial theorem.

25. Determine the coefficient of
     a) \(xyz^2\) in \((x+y+z)^4\)
     b) \(xyz^2\) in \((w+x+y+z)^4\)
     c) \(xyz^2\) in \((2x-y-z)^4\)
     d) \(xyz^{-2}\) in \((x-2y+3z^{-1})^4\)
     e) \(w^3x^2yz^2\) in \((2w-x+3y-2z)^8\)

26. Find the coefficient of \(w^2x^2y^2z^2\) in the expansion of (a) \((w+x+y+z+1)^{10}\),
(b) \((2w-x+3y+z-2)^{12}\), and (c) \((v+w-2x+y+5z+3)^{12}\).

27. Determine the sum of all the coefficients in the expansions of
     a) \((x+y)^3\)          b) \((x+y)^{10}\)          c) \((x+y+z)^{10}\)
     d) \((w+x+y+z)^5\)
     e) \((2s-3t+5u+6v-11w+3x+2y)^{10}\)

28. For any positive integer n determine
     a) \(\sum_{i=0}^{n}\frac{1}{i!\,(n-i)!}\)          b) \(\sum_{i=0}^{n}\frac{(-1)^i}{i!\,(n-i)!}\)

29. Show that for all positive integers m and n,
\[
n\binom{m+n}{m} = (m+1)\binom{m+n}{m+1}.
\]

30. With n a positive integer, evaluate the sum
\[
\binom{n}{0} + 2\binom{n}{1} + 2^2\binom{n}{2} + \cdots + 2^k\binom{n}{k} + \cdots + 2^n\binom{n}{n}.
\]

31. For x a real number and n a positive integer, show that
     a) \(1 = (1+x)^n - \binom{n}{1}x^1(1+x)^{n-1} + \binom{n}{2}x^2(1+x)^{n-2} - \cdots + (-1)^n\binom{n}{n}x^n\)
     b) \(1 = (2+x)^n - \binom{n}{1}(x+1)(2+x)^{n-1} + \binom{n}{2}(x+1)^2(2+x)^{n-2} - \cdots + (-1)^n\binom{n}{n}(x+1)^n\)

     c) \(2^n = (2+x)^n - \binom{n}{1}x^1(2+x)^{n-1} + \binom{n}{2}x^2(2+x)^{n-2} - \cdots + (-1)^n\binom{n}{n}x^n\)

32. Determine x if \(\sum_{i=0}^{50}\binom{50}{i}8^i = x^{100}\).

33. a) If \(a_0, a_1, a_2, a_3\) is a list of four real numbers, what is \(\sum_{i=1}^{3}(a_i - a_{i-1})\)?
     b) Given a list \(a_0, a_1, a_2, \ldots, a_n\) of n + 1 real numbers, where n is a positive integer,
     determine \(\sum_{i=1}^{n}(a_i - a_{i-1})\).
     c) Determine the value of \(\sum_{i=1}^{100}\left(\frac{1}{i+2} - \frac{1}{i+1}\right)\).

34. a) Write a computer program (or develop an algorithm) that lists all selections of size 2
     from the objects 1, 2, 3, 4, 5, 6.
     b) Repeat part (a) for selections of size 3.
1.4
                Combinations with Repetition
When repetitions are allowed, we have seen that for n distinct objects an arrangement of
size r of these objects can be obtained in \(n^r\) ways, for an integer r ≥ 0. We now turn to
the comparable problem for combinations and once again obtain a related problem whose
solution follows from our previous enumeration principles.

EXAMPLE 1.28

On their way home from track practice, seven high school freshmen stop at a restaurant,
where each of them has one of the following: a cheeseburger, a hot dog, a taco, or a fish
sandwich. How many different purchases are possible (from the viewpoint of the restaurant)?
Let c, h, t, and f represent cheeseburger, hot dog, taco, and fish sandwich, respectively.
Here we are concerned with how many of each item are purchased, not with the order
in which they are purchased, so the problem is one of selections, or combinations, with
repetition.
In Table 1.6 we list some possible purchases in column (a) and another means of
representing each purchase in column (b).

Table 1.6

1.  c, c, h, h, t, t, f          1.  x x | x x | x x | x
2.  c, c, c, c, h, t, f          2.  x x x x | x | x | x
3.  c, c, c, c, c, c, f          3.  x x x x x x | | | x
4.  h, t, t, f, f, f, f          4.  | x | x x | x x x x
5.  t, t, t, t, t, f, f          5.  | | x x x x x | x x
6.  t, t, t, t, t, t, t          6.  | | x x x x x x x |
7.  f, f, f, f, f, f, f          7.  | | | x x x x x x x
(a)                              (b)

For a purchase in column (b) of Table 1.6 we realize that each x to the left of the first bar
( | ) represents a c, each x between the first and second bars represents an h, the x’s between
the second and third bars stand for t’s, and each x to the right of the third bar stands for
an f. The third purchase, for example, has three consecutive bars because no one bought
a hot dog or taco; the bar at the start of the fourth purchase indicates that there were no
cheeseburgers in that purchase.
Once again a correspondence has been established between two collections of objects,
where we know how to count the number in one collection. For the representations in

column (b) of Table 1.6, we are enumerating all arrangements of 10 symbols consisting
of seven x’s and three |’s, so by our correspondence the number of different purchases for
column (a) is

\[
\frac{10!}{7!\,3!} = \binom{10}{7}.
\]

In this example we note that the seven x’s (one for each freshman) correspond to the size
               of the selection and that the three bars are needed to separate the 3 + 1 = 4 possible food
               items that can be chosen.
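The correspondence between purchases and x-and-bar strings can be made concrete. The sketch below is illustrative: the helper `to_bars` is our own name, and the items are assumed to be listed in the fixed order c, h, t, f as in Table 1.6.

```python
from itertools import combinations_with_replacement
from math import comb

items = "chtf"   # cheeseburger, hot dog, taco, fish sandwich

# Every purchase by seven freshmen, ignoring the order in which items are bought.
purchases = list(combinations_with_replacement(items, 7))

def to_bars(purchase):
    """Encode a purchase as groups of x's separated by three bars (order c, h, t, f)."""
    return "|".join("x" * purchase.count(item) for item in items)

print(len(purchases), comb(10, 7))
print(to_bars(tuple("cchhttf")))
```

Distinct purchases give distinct strings, so counting the strings counts the purchases.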

When we wish to select, with repetition, r of n distinct objects, we find (as in Table 1.6)
that we are considering all arrangements of r x’s and n − 1 |’s and that their number is

\[
\frac{(n+r-1)!}{r!\,(n-1)!} = \binom{n+r-1}{r}.
\]

Consequently, the number of combinations of n objects taken r at a time, with repetition,
is C(n + r − 1, r).

(In Example 1.28, n = 4, r = 7, so it is possible for r to exceed n when repetitions are
allowed.)
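The general formula can be compared with direct enumeration for small cases. This is a sketch, not a proof; `combinations_with_repetition` is a name we introduce here, and n is assumed to be at least 1.

```python
from itertools import combinations_with_replacement
from math import comb

def combinations_with_repetition(n, r):
    """Number of selections of size r from n distinct objects, repetition allowed."""
    return comb(n + r - 1, r)

# Compare the formula with brute-force enumeration for small n and r.
for n in range(1, 5):
    for r in range(0, 5):
        enumerated = sum(1 for _ in combinations_with_replacement(range(n), r))
        assert enumerated == combinations_with_repetition(n, r)

print(combinations_with_repetition(4, 7))
```

The last line evaluates the case n = 4, r = 7 of Example 1.28, where r exceeds n.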

EXAMPLE 1.29

A donut shop offers 20 kinds of donuts. Assuming that there are at least a dozen of each kind
when we enter the shop, we can select a dozen donuts in C(20 + 12 − 1, 12) = C(31, 12) =
141,120,525 ways. (Here n = 20, r = 12.)

EXAMPLE 1.30

President Helen has four vice presidents: (1) Betty, (2) Goldie, (3) Mary Lou, and (4) Mona.
She wishes to distribute among them $1000 in Christmas bonus checks, where each check
will be written for a multiple of $100.
  a) Allowing the situation in which one or more of the vice presidents get nothing,
     President Helen is making a selection of size 10 (one for each unit of $100) from
     a collection of size 4 (four vice presidents), with repetition. This can be done in
     C(4 + 10 − 1, 10) = C(13, 10) = 286 ways.
  b) If there are to be no hard feelings, each vice president should receive at least $100. With
     this restriction, President Helen is now faced with making a selection of size 6 (the
     remaining six units of $100) from the same collection of size 4, and the choices now
     number C(4 + 6 − 1, 6) = C(9, 6) = 84. [For example, here the selection 2, 3, 3, 4,
     4, 4 is interpreted as follows: Betty does not get anything extra, for there is no 1 in
     the selection. The one 2 in the selection indicates that Goldie gets an additional $100.
     Mary Lou receives an additional $200 ($100 for each of the two 3’s in the selection).
     Due to the three 4’s, Mona’s bonus check will total $100 + 3($100) = $400.]

  c) If each vice president must get at least $100 and Mona, as executive vice president,
     gets at least $500, then the number of ways President Helen can distribute the bonus
     checks is

\[
\underbrace{C(3+2-1,\,2)}_{\text{Mona gets exactly \$500}} + \underbrace{C(3+1-1,\,1)}_{\text{Mona gets exactly \$600}} + \underbrace{C(3+0-1,\,0)}_{\text{Mona gets exactly \$700}} = 10 = \underbrace{C(4+2-1,\,2)}_{\text{Using the technique in part (b)}}
\]
Having worked examples utilizing combinations with repetition, we now consider two
                            examples involving other counting principles as well.

EXAMPLE 1.31
In how many ways can we distribute seven bananas and six oranges among four children so that each child receives at least one banana?
After giving each child one banana, consider the number of ways the remaining three bananas can be distributed among these four children. Table 1.7 shows four of the distributions we are considering here. For example, the second distribution in part (a) of Table 1.7, namely, 1, 3, 3, indicates that we have given the first child (designated by 1) one additional banana and the third child (designated by 3) two additional bananas. The corresponding arrangement in part (b) of Table 1.7 represents this distribution in terms of three b's and three bars. These six symbols, three of one type (the b's) and three others of a second type (the bars), can be arranged in 6!/(3! 3!) = C(6, 3) = C(4 + 3 - 1, 3) = 20 ways. [Here n = 4, r = 3.] Consequently, there are 20 ways in which we can distribute the three additional bananas among these four children. Table 1.8 provides the comparable situation for distributing the six oranges. In this case we are arranging nine symbols, six of one type (the o's) and three of a second type (the bars). So now we learn that the number of ways we can distribute the six oranges among these four children is 9!/(6! 3!) = C(9, 6) = C(4 + 6 - 1, 6) = 84 ways. [Here n = 4, r = 6.] Therefore, by the rule of product, there are 20 × 84 = 1680 ways to distribute the fruit under the stated conditions.

Table 1.7

      (a)              (b)
  1)  1, 2, 3      1)  b|b|b|
  2)  1, 3, 3      2)  b||bb|
  3)  3, 4, 4      3)  ||b|bb
  4)  4, 4, 4      4)  |||bbb

Table 1.8

      (a)                       (b)
  1)  1, 2, 2, 3, 3, 4      1)  o|oo|oo|o
  2)  1, 2, 2, 4, 4, 4      2)  o|oo||ooo
  3)  2, 2, 2, 3, 3, 3      3)  |ooo|ooo|
  4)  4, 4, 4, 4, 4, 4      4)  |||oooooo
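The correspondence between the two parts of each table can be made mechanical. The following Python sketch is one possible illustration, not the text's own procedure; the helper names `to_bars` and `selections` are assumptions introduced here. It converts a selection into its string of symbols and bars and counts the selections for the bananas and the oranges.

```python
from itertools import combinations_with_replacement

def to_bars(selection, n, symbol):
    """Turn a selection such as (1, 3, 3) into its symbol-and-bar string.

    Child c (numbered 1..n) receives one symbol for each c in the selection;
    a bar separates consecutive children, so n children need n - 1 bars.
    """
    groups = [symbol * selection.count(child) for child in range(1, n + 1)]
    return "|".join(groups)

def selections(n, r):
    """All selections of size r, with repetition, from the children 1..n."""
    return list(combinations_with_replacement(range(1, n + 1), r))

banana_ways = selections(4, 3)   # the three additional bananas
orange_ways = selections(4, 6)   # the six oranges

print(to_bars((1, 3, 3), 4, "b"))            # second row of Table 1.7
print(to_bars((2, 2, 2, 3, 3, 3), 4, "o"))   # third row of Table 1.8
print(len(banana_ways), len(orange_ways), len(banana_ways) * len(orange_ways))
```

Because `combinations_with_replacement` lists each selection once, in nondecreasing order, the lengths of the two lists are exactly the counts used in the rule of product.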

EXAMPLE 1.32
A message is made up of 12 different symbols and is to be transmitted through a communication channel. In addition to the 12 symbols, the transmitter will also send a total of 45 (blank) spaces between the symbols, with at least three spaces between each pair of consecutive symbols. In how many ways can the transmitter send such a message?
There are 12! ways to arrange the 12 different symbols, and for each of these arrangements there are 11 positions between the 12 symbols. Because there must be at least three spaces between successive symbols, we use up 33 of the 45 spaces and must now locate the remaining 12 spaces. This is now a selection, with repetition, of size 12 (the spaces) from a collection of size 11 (the locations), and this can be accomplished in C(11 + 12 - 1, 12) = C(22, 12) = 646,646 ways.

Consequently, by the rule of product the transmitter can send such messages with the required spacing in \((12!)\binom{22}{12} \approx 3.097 \times 10^{14}\) ways.
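The arithmetic in this example is easy to reproduce. The sketch below is an illustrative Python version under the assumptions of Example 1.32; the function name `spaced_messages` and its parameters are hypothetical and chosen here only for readability.

```python
from math import comb, factorial

def spaced_messages(symbols, total_spaces, min_gap):
    """Count messages of `symbols` distinct symbols with `total_spaces` blanks
    placed between them, at least `min_gap` blanks in each gap."""
    gaps = symbols - 1                      # positions between consecutive symbols
    spare = total_spaces - min_gap * gaps   # blanks left after the required ones
    placements = comb(gaps + spare - 1, spare)
    return factorial(symbols) * placements

total = spaced_messages(12, 45, 3)
print(comb(22, 12))
print(f"{total:.3e}")
```

The first printed value is the number of ways to place the 12 spare spaces; the second is the product with 12! in scientific notation.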

In the next example an idea is introduced that appears to have more to do with number
               theory than with combinations or arrangements. Nonetheless, the solution of this example
               will turn out to be equivalent to counting combinations with repetitions.

EXAMPLE 1.33
Determine all integer solutions to the equation

\[ x_1 + x_2 + x_3 + x_4 = 7, \quad \text{where } x_i \ge 0 \text{ for all } 1 \le i \le 4. \]

One solution of the equation is \(x_1 = 3\), \(x_2 = 3\), \(x_3 = 0\), \(x_4 = 1\). (This is different from a solution such as \(x_1 = 1\), \(x_2 = 0\), \(x_3 = 3\), \(x_4 = 3\), even though the same four integers are being used.) A possible interpretation for the solution \(x_1 = 3\), \(x_2 = 3\), \(x_3 = 0\), \(x_4 = 1\) is that we are distributing seven pennies (identical objects) among four children (distinct containers), and here we have given three pennies to each of the first two children, nothing to the third child, and the last penny to the fourth child. Continuing with this interpretation, we see that each nonnegative integer solution of the equation corresponds to a selection, with repetition, of size 7 (the identical pennies) from a collection of size 4 (the distinct children), so there are C(4 + 7 - 1, 7) = 120 solutions.
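The correspondence between solutions and selections can be traced in code. The following Python sketch is offered as an illustration only; the generator name `solutions` is an assumption introduced here. Each selection of size r from n children is converted into the solution that records how many times each child was chosen.

```python
from itertools import combinations_with_replacement
from math import comb

def solutions(n, r):
    """Yield each nonnegative integer solution (x_1, ..., x_n) of
    x_1 + ... + x_n = r, one per selection of size r with repetition."""
    for selection in combinations_with_replacement(range(n), r):
        yield tuple(selection.count(child) for child in range(n))

sols = list(solutions(4, 7))
print(len(sols), comb(4 + 7 - 1, 7))
print((3, 3, 0, 1) in sols, (1, 0, 3, 3) in sols)
```

The two solutions mentioned in the example both appear in the list, and they are counted separately because the children are distinct.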

At this point it is crucial that we recognize the equivalence of the following:

a) The number of integer solutions of the equation

\[ x_1 + x_2 + \cdots + x_n = r, \quad x_i \ge 0, \quad 1 \le i \le n. \]

b) The number of selections, with repetition, of size r from a collection of size n.
c) The number of ways r identical objects can be distributed among n distinct containers.

In terms of distributions, part (c) is valid only when the r objects being distributed are identical and the n containers are distinct. When both the r objects and the n containers are distinct, we can select any of the n containers for each one of the objects and get \(n^r\) distributions by the rule of product.
                   When the objects are distinct but the containers are identical, we shall solve the problem
               using the Stirling numbers of the second kind (Chapter 5). For the final case, in which both
               objects and containers are identical, the theory of partitions of integers (Chapter 9) will
               provide some necessary results.
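The difference between distinct and identical objects can be seen in a small enumeration. The Python sketch below is a hedged illustration with a hypothetical helper name, `count_distributions`. It lists every assignment of r distinct objects to n distinct containers and then forgets which object is which, keeping only how many objects each container holds.

```python
from itertools import product
from math import comb

def count_distributions(n, r):
    """Return (distinct-object count, identical-object count) for placing
    r objects into n distinct containers."""
    # One container index per object: n choices for each of the r objects.
    assignments = list(product(range(n), repeat=r))
    # With identical objects only the occupancy of each container matters.
    occupancy = {tuple(a.count(c) for c in range(n)) for a in assignments}
    return len(assignments), len(occupancy)

distinct, identical = count_distributions(3, 4)
print(distinct, 3 ** 4)
print(identical, comb(3 + 4 - 1, 4))
```

For n = 3 and r = 4 the first count agrees with \(n^r\) and the second with C(n + r - 1, r); the two remaining cases, with identical containers, are not attempted here.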

EXAMPLE 1.34
In how many ways can one distribute 10 (identical) white marbles among six distinct containers?
Solving this problem is equivalent to finding the number of nonnegative integer solutions to the equation \(x_1 + x_2 + \cdots + x_6 = 10\). That number is the number of selections of size 10, with repetition, from a collection of size 6. Hence the answer is C(6 + 10 - 1, 10) = 3003.

We now examine two other examples related to the theme of this section.

EXAMPLE 1.35
From Example 1.34 we know that there are 3003 nonnegative integer solutions to the equation \(x_1 + x_2 + \cdots + x_6 = 10\). How many such solutions are there to the inequality \(x_1 + x_2 + \cdots + x_6 < 10\)?
One approach that may seem feasible in dealing with this inequality is to determine the number of such solutions to \(x_1 + x_2 + \cdots + x_6 = k\), where k is an integer and \(0 \le k \le 9\). Although feasible now, the technique becomes unrealistic if 10 is replaced by a somewhat larger number, say 100. In Example 3.12 of Chapter 3, however, we shall establish a combinatorial identity that will help us obtain an alternative solution to the problem by using this approach.
For the present we transform the problem by noting the correspondence between the nonnegative integer solutions of

\[ x_1 + x_2 + \cdots + x_6 < 10 \tag{1} \]

and the integer solutions of

\[ x_1 + x_2 + \cdots + x_6 + x_7 = 10, \quad 0 \le x_i, \quad 1 \le i \le 6, \quad 0 < x_7. \tag{2} \]

The number of solutions of Eq. (2) is the same as the number of nonnegative integer solutions of \(y_1 + y_2 + \cdots + y_6 + y_7 = 9\), where \(y_i = x_i\) for \(1 \le i \le 6\), and \(y_7 = x_7 - 1\). This is C(7 + 9 - 1, 9) = 5005.
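Both approaches described in this example can be compared numerically. The Python sketch below is illustrative; the names `solutions_below` and `slack_count` are assumptions introduced here. The first function adds the counts for each value of k, and the second uses the extra positive variable \(x_7\).

```python
from math import comb

def solutions_below(n, bound):
    """Nonnegative solutions of x_1 + ... + x_n < bound, summed case by case
    over x_1 + ... + x_n = k for k = 0, 1, ..., bound - 1."""
    return sum(comb(n + k - 1, k) for k in range(bound))

def slack_count(n, bound):
    """The same count via one extra positive variable: n + 1 nonnegative
    variables that sum to bound - 1."""
    return comb((n + 1) + (bound - 1) - 1, bound - 1)

print(solutions_below(6, 10), slack_count(6, 10))
print(solutions_below(6, 100) == slack_count(6, 100))
```

The second line repeats the comparison with 10 replaced by 100, the case the text describes as unrealistic to carry out by hand.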

Our next result takes us back to the binomial and multinomial expansions.

EXAMPLE 1.36
In the binomial expansion for \((x + y)^n\), each term is of the form \(\binom{n}{k} x^k y^{n-k}\), so the total number of terms in the expansion is the number of nonnegative integer solutions of \(n_1 + n_2 = n\) (\(n_1\) is the exponent for x, \(n_2\) the exponent for y). This number is C(2 + n - 1, n) = n + 1.
Perhaps it seems that we have used a rather long-winded argument to get this result. Many of us would probably be willing to believe the result on the basis of our experiences in expanding \((x + y)^n\) for various small values of n.
Although experience is worthwhile in pattern recognition, it is not always enough to find a general principle. Here it would prove of little value if we wanted to know how many terms there are in the expansion of \((w + x + y + z)^{10}\).
Each distinct term here is of the form \(\binom{10}{n_1, n_2, n_3, n_4} w^{n_1} x^{n_2} y^{n_3} z^{n_4}\), where \(0 \le n_i\) for \(1 \le i \le 4\), and \(n_1 + n_2 + n_3 + n_4 = 10\). This last equation can be solved in C(4 + 10 - 1, 10) = 286 ways, so there are 286 terms in the expansion of \((w + x + y + z)^{10}\).
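The count of terms can be confirmed by listing the exponent patterns. The following Python sketch is only an illustration; the function name `expansion_terms` is hypothetical. It records, for each admissible tuple of exponents, the corresponding multinomial coefficient.

```python
from itertools import product
from math import comb, factorial

def expansion_terms(variables, power):
    """Map each exponent tuple (n_1, ..., n_t) with n_1 + ... + n_t = power
    to its multinomial coefficient power! / (n_1! ... n_t!)."""
    terms = {}
    for exps in product(range(power + 1), repeat=variables):
        if sum(exps) == power:
            coeff = factorial(power)
            for e in exps:
                coeff //= factorial(e)   # each partial quotient is an integer
            terms[exps] = coeff
    return terms

terms = expansion_terms(4, 10)
print(len(terms), comb(4 + 10 - 1, 10))
print(sum(terms.values()) == 4 ** 10)   # set w = x = y = z = 1
```

The last line is a sanity check on the coefficients: setting every variable equal to 1 turns the expansion into \(4^{10}\).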

And now once again the binomial expansion will come into play, as we find ourselves using part (a) of Corollary 1.1.

EXAMPLE 1.37
a) Let us determine all the different ways in which we can write the number 4 as a sum of positive integers, where the order of the summands is considered relevant. These representations are called the compositions of 4 and may be listed as follows:

    1) 4                    5) 2 + 1 + 1
    2) 3 + 1                6) 1 + 2 + 1
    3) 1 + 3                7) 1 + 1 + 2
    4) 2 + 2                8) 1 + 1 + 1 + 1

Here we include the sum consisting of only one summand, namely, 4. We find that for the number 4 there are eight compositions in total. (If we do not care about the order of the summands, then the representations in (2) and (3) are no longer considered to be different, nor are the representations in (5), (6), and (7). Under these circumstances we find that there are five partitions for the number 4, namely, 4; 3 + 1; 2 + 2; 2 + 1 + 1; and 1 + 1 + 1 + 1. We shall learn more about partitions of positive integers in Section 9.3.)
b) Now suppose that we wish to count the number of compositions for the number 7. Here we do not want to list all of the possibilities, which include 7; 6 + 1; 1 + 6; 5 + 2; 1 + 2 + 4; 2 + 4 + 1; and 3 + 1 + 2 + 1. To count all of these compositions, let us consider the number of possible summands.
  i) For one summand there is only one composition, namely, 7.
  ii) If there are two (positive) summands, we want to count the number of integer solutions for

\[ w_1 + w_2 = 7, \quad \text{where } w_1, w_2 > 0. \]

  This is equal to the number of integer solutions for

\[ x_1 + x_2 = 5, \quad \text{where } x_1, x_2 \ge 0. \]

  The number of such solutions is \(\binom{2+5-1}{5} = \binom{6}{5}\).
  iii) Continuing with our next case, we examine the compositions with three (positive) summands. So now we want to count the number of positive integer solutions for

\[ y_1 + y_2 + y_3 = 7. \]

  This is equal to the number of nonnegative integer solutions for

\[ z_1 + z_2 + z_3 = 4, \]

  and that number is \(\binom{3+4-1}{4} = \binom{6}{4}\).
We summarize cases (i), (ii), and (iii), and the other four cases in Table 1.9, where we recall for case (i) that \(1 = \binom{6}{6}\).

Table 1.9

  n = The Number of Summands        The Number of Compositions
  in a Composition of 7             of 7 with n Summands

  (i)    n = 1                      (i)    \(\binom{6}{6}\)
  (ii)   n = 2                      (ii)   \(\binom{6}{5}\)
  (iii)  n = 3                      (iii)  \(\binom{6}{4}\)
  (iv)   n = 4                      (iv)   \(\binom{6}{3}\)
  (v)    n = 5                      (v)    \(\binom{6}{2}\)
  (vi)   n = 6                      (vi)   \(\binom{6}{1}\)
  (vii)  n = 7                      (vii)  \(\binom{6}{0}\)

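Consequently, the results from the right-hand side of our table tell us that the (total) number of compositions of 7 is

\[ \binom{6}{6} + \binom{6}{5} + \binom{6}{4} + \binom{6}{3} + \binom{6}{2} + \binom{6}{1} + \binom{6}{0} = \sum_{k=0}^{6} \binom{6}{k}. \]

From part (a) of Corollary 1.1 this reduces to \(2^6\).
In general, one finds that for each positive integer m, there are \(\sum_{k=0}^{m-1} \binom{m-1}{k} = 2^{m-1}\) compositions.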
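The counts in this example can be reproduced by generating the compositions themselves. The Python sketch below is one possible illustration; the recursive generator `compositions` is an assumption introduced here and is practical only for small m, since the number of compositions doubles with each increase in m.

```python
from collections import Counter
from math import comb

def compositions(m):
    """Yield every composition of the nonnegative integer m as a tuple.

    The empty tuple is the only composition of 0; it serves as the base case.
    """
    if m == 0:
        yield ()
        return
    for first in range(1, m + 1):          # choose the first summand
        for rest in compositions(m - first):
            yield (first,) + rest

all_seven = list(compositions(7))
by_length = Counter(len(c) for c in all_seven)

print(len(list(compositions(4))))          # the compositions of 4
print(len(all_seven), 2 ** 6)
for n in range(1, 8):
    # compositions of 7 with n summands, next to C(6, 7 - n)
    print(n, by_length[n], comb(6, 7 - n))
```

The loop prints, for each number n of summands, the enumerated count beside the binomial coefficient obtained from the argument in part (b).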

EXAMPLE 1.38
From Example 1.37 we know that there are \(2^{12-1} = 2^{11} = 2048\) compositions of 12. If our interest is in those compositions where each summand is even, then we consider, for instance, compositions such as

    2 + 4 + 6 = 2(1 + 2 + 3)        2 + 8 + 2 = 2(1 + 4 + 1)
    8 + 2 + 2 = 2(4 + 1 + 1)        6 + 6 = 2(3 + 3).

In each of these four examples, the parenthesized expression is a composition of 6. This observation indicates that the number of compositions of 12, where each summand is even, equals the number of (all) compositions of 6, which is \(2^{6-1} = 2^5 = 32\).
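The halving argument can be tested directly. The Python sketch below is illustrative only; the generator `compositions` and the helper `even_compositions` are assumptions introduced here. It filters the compositions of 12 and compares their halves with the compositions of 6.

```python
def compositions(m):
    """Yield every composition of the nonnegative integer m as a tuple."""
    if m == 0:
        yield ()
        return
    for first in range(1, m + 1):
        for rest in compositions(m - first):
            yield (first,) + rest

def even_compositions(m):
    """Compositions of m in which every summand is even."""
    return [c for c in compositions(m) if all(part % 2 == 0 for part in c)]

evens = even_compositions(12)
halves = {tuple(part // 2 for part in c) for c in evens}

print(len(list(compositions(12))), len(evens))
print(halves == set(compositions(6)))
```

The second line checks that dividing every summand by 2 matches the even compositions of 12 with all the compositions of 6, which is the correspondence the example relies on.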

Our next two examples provide applications from the area of computer science. Furthermore, the second example will lead to an important summation formula that we shall use in many later chapters.

EXAMPLE 1.39
Consider the following program segment, where i, j, and k are integer variables.

    for i := 1 to 20 do
      for j := 1 to i do
        for k := 1 to j do
          print (i * j + k)

How many times is the print statement executed in this program segment?
Among the possible choices for i, j, and k (in the order i first, j second, k third) that will lead to execution of the print statement, we list (1) 1, 1, 1; (2) 2, 1, 1; (3) 15, 10, 1; and (4) 15, 10, 7. We note that i = 10, j = 12, k = 5 is not one of the selections to be considered, because j = 12 > 10 = i; this violates the condition set forth in the second for loop. Each of the above four selections where the print statement is executed satisfies the condition \(1 \le k \le j \le i \le 20\). In fact, any selection a, b, c (\(a \le b \le c\)) of size 3, with repetitions allowed, from the list 1, 2, 3, ..., 20 results in one of the correct selections: here, k = a, j = b, i = c. Consequently the print statement is executed

\[ \binom{20+3-1}{3} = \binom{22}{3} = 1540 \text{ times.} \]

If there had been r (\(\ge 1\)) for loops instead of three, the print statement would have been executed \(\binom{20+r-1}{r}\) times.
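The count can be confirmed by running the loops. The Python sketch below is a transliteration made for illustration, not the program segment itself: `count_prints` counts iterations instead of printing, and `nested_loop_count` is a hypothetical recursive generalization to r nested loops.

```python
from math import comb

def count_prints(limit):
    """Count executions of the innermost statement of the three nested loops."""
    executed = 0
    for i in range(1, limit + 1):
        for j in range(1, i + 1):
            for k in range(1, j + 1):
                executed += 1        # stands in for print (i * j + k)
    return executed

def nested_loop_count(limit, depth):
    """Count innermost executions of `depth` nested loops, each running from 1
    up to the current value of the enclosing loop variable."""
    if depth == 0:
        return 1
    return sum(nested_loop_count(i, depth - 1) for i in range(1, limit + 1))

print(count_prints(20), comb(22, 3))
for r in range(1, 5):
    print(r, nested_loop_count(20, r), comb(20 + r - 1, r))
```

The final loop compares the recursive count with the formula for r = 1, 2, 3, 4 nested loops.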

EXAMPLE 1.40
Here we use a program segment to derive a summation formula. In this program segment, the variables i, j, n, and counter are integer variables. Furthermore, we assume that the value of n has been set prior to this segment.

    counter := 0
    for i := 1 to n do
      for j := 1 to i do
        counter := counter + 1

From the results in Example 1.39, after this segment is executed the value of (the variable) counter will be \(\binom{n+2-1}{2} = \binom{n+1}{2}\). (This is also the number of times that the statement

    (*)    counter := counter + 1

is executed.)
This result can also be obtained as follows: When i := 1, then j varies from 1 to 1 and (*) is executed once; when i is assigned the value 2, then j varies from 1 to 2 and (*) is executed twice; j varies from 1 to 3 when i is assigned the value 3, and (*) is executed three times; in general, for \(1 \le k \le n\), when i := k, then j varies from 1 to k and (*) is executed k times. In total, the variable counter is incremented [and the statement (*) is executed] \(1 + 2 + 3 + \cdots + n\) times.
Consequently,

\[ \sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n = \binom{n+1}{2} = \frac{n(n+1)}{2}. \]

The derivation of this summation formula, obtained by counting the same result in two different ways, constitutes a combinatorial proof.
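The two ways of counting can be placed side by side in code. The Python sketch below is illustrative; the function name `counter_after_segment` is an assumption introduced here, and the body mirrors the program segment of this example.

```python
from math import comb

def counter_after_segment(n):
    """Run the two nested loops and return the final value of counter."""
    counter = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            counter += 1
    return counter

for n in (1, 5, 10, 100):
    # loop count, sum 1 + 2 + ... + n, binomial coefficient, closed form
    print(n, counter_after_segment(n), sum(range(1, n + 1)),
          comb(n + 1, 2), n * (n + 1) // 2)
```

A numerical check of this kind is not a proof, but it shows the four expressions agreeing for the sampled values of n.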

Our last example for this section introduces the idea of a run, a notion that arises in statistics, in particular in the detecting of trends in a statistical process.

EXAMPLE 1.41
The counter at Patti and Terri's Bar has 15 bar stools. Upon entering the bar Darrell finds the stools occupied as follows:

    O O E O O O O E E E O O O E O,

where O indicates an occupied stool and E an empty one. (Here we are not concerned with the occupants of the stools, just whether or not a stool is occupied.) In this case we say that the occupancy of the 15 stools determines seven runs, as shown:

    OO     E      OOOO   EEE    OOO    E      O
    Run    Run    Run    Run    Run    Run    Run

In general, a run is a consecutive list of identical entries that are preceded and followed by different entries or no entries at all.
A second way in which five E's and 10 O's can be arranged to provide seven runs is

    E O O O E E O O E O O O O O E.

We want to find the total number of ways five E's and 10 O's can determine seven runs. If the first run starts with an E, then there must be four runs of E's and three runs of O's. Consequently, the last run must end with an E.
Let \(x_1\) count the number of E's in the first run, \(x_2\) the number of O's in the second run, \(x_3\) the number of E's in the third run, ..., and \(x_7\) the number of E's in the seventh run. We want to find the number of integer solutions for

\[ x_1 + x_3 + x_5 + x_7 = 5, \quad x_1, x_3, x_5, x_7 > 0 \tag{3} \]

and

\[ x_2 + x_4 + x_6 = 10, \quad x_2, x_4, x_6 > 0. \tag{4} \]

The number of integer solutions for Eq. (3) equals the number of integer solutions for

\[ y_1 + y_3 + y_5 + y_7 = 1, \quad y_1, y_3, y_5, y_7 \ge 0. \]

This number is \(\binom{4+1-1}{1} = \binom{4}{1} = 4\). Similarly, for Eq. (4), the number of solutions is \(\binom{3+7-1}{7} = \binom{9}{7} = 36\). Consequently, by the rule of product there are 4 · 36 = 144 arrangements of five E's and 10 O's that determine seven runs, the first run starting with E.
The seven runs may also have the first run starting with an O and the last run ending with an O. So now let \(w_1\) count the number of O's in the first run, \(w_2\) the number of E's in the second run, \(w_3\) the number of O's in the third run, ..., and \(w_7\) the number of O's in the seventh run. Here we want the number of integer solutions for

\[ w_1 + w_3 + w_5 + w_7 = 10, \quad w_1, w_3, w_5, w_7 > 0 \]

and

\[ w_2 + w_4 + w_6 = 5, \quad w_2, w_4, w_6 > 0. \]

Arguing as above, we find that the number of ways to arrange five E's and 10 O's, resulting in seven runs where the first run starts with an O, is \(\binom{4+6-1}{6}\binom{3+2-1}{2} = \binom{9}{6}\binom{4}{2} = 504\).
Consequently, by the rule of sum, the five E's and 10 O's can be arranged in 144 + 504 = 648 ways to produce seven runs.
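Since there are only C(15, 5) = 3003 arrangements of five E's and 10 O's, the two cases can be checked exhaustively. The Python sketch below is a hedged illustration; the helper names `run_count` and `count_with_runs` are assumptions introduced here. It tallies the arrangements with a given number of runs according to the symbol that starts the first run.

```python
from itertools import combinations, groupby

def run_count(row):
    """Number of runs (maximal blocks of identical entries) in a sequence."""
    return sum(1 for _ in groupby(row))

def count_with_runs(empties, occupied, runs):
    """Tally arrangements of E's and O's having exactly `runs` runs,
    split by the symbol that starts the first run."""
    total = empties + occupied
    tally = {"E": 0, "O": 0}
    for e_positions in combinations(range(total), empties):
        spots = set(e_positions)
        row = ["E" if p in spots else "O" for p in range(total)]
        if run_count(row) == runs:
            tally[row[0]] += 1
    return tally

tally = count_with_runs(5, 10, 7)
print(run_count("OOEOOOOEEEOOOEO"))     # an arrangement with seven runs
print(tally, tally["E"] + tally["O"])
```

The tally separates the arrangements that start with E from those that start with O, matching the two cases combined by the rule of sum.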

EXERCISES 1.4

1. In how many ways can 10 (identical) dimes be distributed among five children if (a) there are no restrictions? (b) each child gets at least one dime? (c) the oldest child gets at least two dimes?

2. In how many ways can 15 (identical) candy bars be distributed among five children so that the youngest gets only one or two of them?

3. Determine how many ways 20 coins can be selected from four large containers filled with pennies, nickels, dimes, and quarters. (Each container is filled with only one type of coin.)

4. A certain ice cream store has 31 flavors of ice cream available. In how many ways can we order a dozen ice cream cones if (a) we do not want the same flavor more than once? (b) a flavor may be ordered as many as 12 times? (c) a flavor may be ordered no more than 11 times?

5. a) In how many ways can we select five coins from a collection of 10 consisting of one penny, one nickel, one dime, one quarter, one half-dollar, and five (identical) Susan B. Anthony dollars?
   b) In how many ways can we select n objects from a collection of size 2n that consists of n distinct and n identical objects?

6. Answer Example 1.32, where the 12 symbols being transmitted are four A's, four B's, and four C's.

7. Determine the number of integer solutions of

\[ x_1 + x_2 + x_3 + x_4 = 32, \]

where
   a) \(x_i \ge 0\), \(1 \le i \le 4\)
   b) \(x_i > 0\), \(1 \le i \le 4\)
   c) \(x_1, x_2 \ge 5\), \(x_3, x_4 \ge 7\)
   d) \(x_i \ge 8\), \(1 \le i \le 4\)
   e) \(x_i \ge -2\), \(1 \le i \le 4\)
   f) \(x_1, x_2, x_3 > 0\), \(0 < x_4 \le 25\)

8. In how many ways can a teacher distribute eight chocolate donuts and seven jelly donuts among three student helpers if each helper wants at least one donut of each kind?

9. Columba has two dozen each of n different colored beads. If she can select 20 beads (with repetitions of colors allowed) in 230,230 ways, what is the value of n?

10. In how many ways can Lisa toss 100 (identical) dice so that at least three of each type of face will be showing?

11. Two n-digit integers (leading zeros allowed) are considered equivalent if one is a rearrangement of the other. (For example, 12033, 20331, and 01332 are considered equivalent five-digit integers.) (a) How many five-digit integers are not equivalent? (b) If the digits 1, 3, and 7 can appear at most once, how many nonequivalent five-digit integers are there?

12. Determine the number of integer solutions for

\[ x_1 + x_2 + x_3 + x_4 + x_5 < 40, \]

where
   a) \(x_i \ge 0\), \(1 \le i \le 5\)
   b) \(x_i \ge -3\), \(1 \le i \le 5\)

13. In how many ways can we distribute eight identical white balls into four distinct containers so that (a) no container is left empty? (b) the fourth container has an odd number of balls in it?

14. a) Find the coefficient of \(v^2 w^4 x z\) in the expansion of \((3v + 2w + x + y + z)^8\).
    b) How many distinct terms arise in the expansion in part (a)?

15. In how many ways can Beth place 24 different books on four shelves so that there is at least one book on each shelf? (For any of these arrangements consider the books on each shelf to be placed one next to the other, with the first book at the left of the shelf.)

16. For which positive integer n will the equations

    (1) \(x_1 + x_2 + x_3 + \cdots + x_{19} = n\), and
    (2) \(y_1 + y_2 + y_3 + \cdots + y_{64} = n\)

have the same number of positive integer solutions?

17. How many ways are there to place 12 marbles of the same size in five distinct jars if (a) the marbles are all black? (b) each marble is a different color?

18. a) How many nonnegative integer solutions are there to the pair of equations \(x_1 + x_2 + x_3 + \cdots + x_7 = 37\), \(x_1 + x_2 + x_3 = 6\)?
    b) How many solutions in part (a) have \(x_1, x_2, x_3 > 0\)?

19. How many times is the print statement executed for the following program segment? (Here, i, j, k, and m are integer variables.)

    for i := 1 to 20 do
      for j := 1 to i do
        for k := 1 to j do
          for m := 1 to k do
            print ((i * j) + (k * m))

    increment := 0
    sum := 0
    for i := 1 to 10 do
      for j := 1 to i do
        for k := 1 to j do
          begin
            increment := increment + 1
            sum := sum + increment
          end

22. Consider the following program segment, where i, j, k, n, and counter are integer variables and the value of n (a positive integer) is set prior to this segment.

    counter := 0
    for i := 1 to n do
      for j := 1 to i do
        for k := 1 to j do
          counter := counter + 1

We shall determine, in two different ways, the number of times the statement

    counter := counter + 1

is executed. (This is also the value of counter after execution of the program segment.) From the result in Example 1.39, we know that the statement is executed \(\binom{n+3-1}{3} = \binom{n+2}{3}\) times. For a fixed value of i, the for loops involving j and k result in \(\binom{i+1}{2}\) executions of the counter increment statement. Consequently, \(\binom{n+2}{3} = \sum_{i=1}^{n} \binom{i+1}{2}\). Use this result to obtain a summation formula for

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \sum_{i=1}^{n} i^2. \]

23. a) Given positive integers m, n with \(m \ge n\), show that the number of ways to distribute m identical objects into n distinct containers with no container left empty is

\[ C(m - 1, m - n) = C(m - 1, n - 1). \]

    b) Show that the number of distributions in part (a) where each container holds at least r objects (\(m \ge nr\)) is

\[ C(m - 1 + (1 - r)n, n - 1). \]

24. Write a computer program (or develop an algorithm) to list the integer solutions for
    a) \(x_1 + x_2 + x_3 = 10\), \(0 \le x_i\), \(1 \le i \le 3\)
    b) \(x_1 + x_2 + x_3 + x_4 = 4\), \(-2 \le x_i\), \(1 \le i \le 4\)

25. Consider the \(2^{19}\) compositions of 20. (a) How many have each summand even? (b) How many have each summand a multiple of 4?

20. In the following program segment, i, j, k, and counter are integer variables. Determine the value that the variable counter will have after the segment is executed.

    counter := 10
    for i := 1 to 15 do
      for j := i to 15 do
                           for k := 7 to 15 do
                                                                                         26.   Let n, m, k be positive integers with                 »n = mk.   How   many
                                  counter             :=   counter+1
                                                                                         compositions of # have each summand a multiple of k?
21. Find the value of sum after the given program segment is                             27, Frannie tosses a coin 12 times and gets five heads and seven
executed. (Here i, j, k, increment, and sum are integer vari-                            tails. In how many ways can these tosses result in (a) two runs
ables.)                                                                                  of heads and one run of tails; (b) three runs; (c) four runs;

(d) five runs; (e) six runs; and (f) equal numbers of runs of heads and runs of tails?

28. a) For n ≥ 4, consider the strings made up of n bits, that is, a total of n 0’s and 1’s. In particular, consider those strings where there are (exactly) two occurrences of 01. For example, if n = 6 we want to include strings such as 010010 and 100101, but not 101111 or 010101. How many such strings are there?
    b) For n ≥ 6, how many strings of n 0’s and 1’s contain (exactly) three occurrences of 01?
    c) Provide a combinatorial proof for the following: For n ≥ 1,

\[
2^n = \binom{n+1}{1} + \binom{n+1}{3} + \cdots +
\begin{cases}
\binom{n+1}{n}, & n \text{ odd} \\
\binom{n+1}{n+1}, & n \text{ even.}
\end{cases}
\]

1.5
      The Catalan Numbers (Optional)
                               In this section a very prominent sequence of numbers is introduced. This sequence arises in
                               a wide variety of combinatorial situations. We'll begin by examining one specific instance
                               where it is found.

     EXAMPLE 1.42
Let us start at the point (0, 0) in the xy-plane and consider two kinds of moves:

                                                     R: (x, y) → (x + 1, y)        U: (x, y) → (x, y + 1).

We want to know how we can move from (0, 0) to (5, 5) using such moves, one unit to the right or one unit up. So we’ll need five R’s and five U’s. At this point we have a situation like that in Example 1.14, so we know there are 10!/(5! 5!) = \(\binom{10}{5}\) such paths. But now we’ll add a twist! In going from (0, 0) to (5, 5) one may touch but never rise above the line y = x. Consequently, we want to include paths such as those shown in parts (a) and (b) of Fig. 1.9 but not the path shown in part (c).
    The first thing that is evident is that each such arrangement of five R’s and five U’s must start with an R (and end with a U). Then as we move across this type of arrangement, going from left to right, the number of R’s at any point must equal or exceed the number of U’s. Note how this happens in parts (a) and (b) of Fig. 1.9 but not in part (c). Now we can solve the problem at hand if we can count the paths [like the one in part (c)] that go from (0, 0) to (5, 5) but rise above the line y = x. Look again at the path in part (c) of Fig. 1.9. Where does the situation there break down for the first time? After all, we start with the requisite R, then follow it by a U. So far, so good! But then there is a second U and, at this (first) time, the number of U’s exceeds the number of R’s.
                                   Now let us consider the following transformation:

                    R, U, U, | U, R, R, R, U, U, R   ↔   R, U, U, | R, U, U, U, R, R, U.
                               What have we done here? For the path on the left-hand side of the transformation, we
                               located the first move (the second U) where the path rose above the line y = x. The moves
                               up to and including this move (the second U) remain as is, but the moves that follow are
                               interchanged: each U is replaced by an R and each R by a U. The result is the path on
                               the right-hand side of the transformation — an arrangement of four R’s and six U’s, as seen
                               in part (d) of Fig. 1.9. Part (e) of that figure provides another path to be avoided; part (f)
                               shows what happens when this path is transformed by the method described above. Now
                               suppose we start with an arrangement of six U’s and four R’s, say

                    R, U, R, R, U, U, U, | U, U, R.
[Figure 1.9: six lattice paths drawn in the xy-plane together with the line y = x. The paths in parts (a), (b), (c), and (e) end at (5, 5); those in parts (d) and (f) end at (4, 6).]

(a) R, U, R, R, U, R, R, U, U, U
(b) R, R, U, U, R, U, R, R, U, U
(c) R, U, U, U, R, R, R, U, U, R
(d) R, U, U, R, U, U, U, R, R, U
(e) U, U, R, U, R, R, R, U, R, U
(f) U, R, U, R, U, U, U, R, U, R

Figure 1.9

Focus on the first place where the number of U’s exceeds the number of R’s. Here it is in the seventh position, the location of the fourth U. This arrangement is now transformed as follows: The moves up to and including the fourth U remain as they are; the last three moves are interchanged, with each U replaced by an R and each R by a U. This results in the arrangement

                    R, U, R, R, U, U, U, | R, R, U,

one of the bad arrangements (of five R’s and five U’s) we wish to avoid as we go from (0, 0) to (5, 5). The correspondence established by these transformations gives us a way to count the number of bad arrangements. We alternatively count the number of ways to arrange four R’s and six U’s; this is 10!/(4! 6!) = \(\binom{10}{4}\). Consequently, the number of ways to go from (0, 0) to (5, 5) without rising above the line y = x is

\[
\binom{10}{5} - \binom{10}{4}
= \frac{10!}{5!\,5!} - \frac{10!}{4!\,6!}
= \frac{6(10!) - 5(10!)}{6!\,5!}
= \frac{1}{6}\left(\frac{10!}{5!\,5!}\right)
= \frac{1}{5+1}\binom{10}{5}
= \frac{1}{5+1}\binom{2 \cdot 5}{5} = 42.
\]
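The count just obtained can be checked by brute force. The Python sketch below is an illustration added here, not part of the original example; the function names paths, first_violation, and transform are hypothetical choices. It lists every arrangement of five R’s and five U’s, separates those that rise above the line y = x, and confirms that the transformation described above pairs the bad arrangements exactly with the arrangements of four R’s and six U’s.

```python
from itertools import combinations
from math import comb

def paths(num_r, num_u):
    """Yield every arrangement of num_r R's and num_u U's as a string."""
    total = num_r + num_u
    for r_positions in combinations(range(total), num_r):
        chosen = set(r_positions)
        yield "".join("R" if i in chosen else "U" for i in range(total))

def first_violation(path):
    """Index of the first move where the U's outnumber the R's, or None."""
    excess = 0  # (number of U's) - (number of R's) read so far
    for i, move in enumerate(path):
        excess += 1 if move == "U" else -1
        if excess > 0:
            return i
    return None

def transform(path):
    """Keep the moves through the first violation; interchange R and U after it."""
    i = first_violation(path)
    swapped = path[i + 1:].translate(str.maketrans("RU", "UR"))
    return path[: i + 1] + swapped

all_paths = list(paths(5, 5))
good = [p for p in all_paths if first_violation(p) is None]
bad = [p for p in all_paths if first_violation(p) is not None]

# Images of the bad arrangements versus all arrangements of four R's, six U's.
images = {transform(p) for p in bad}
targets = set(paths(4, 6))

print(len(all_paths), len(bad), len(good))   # 252 210 42
print(images == targets)                     # True
print(transform("RUUURRRUUR"))               # RUURUUURRU, the path in part (d)
```

Because applying transform a second time restores the original arrangement, the correspondence is one-to-one in both directions, which is what justifies subtracting \(\binom{10}{4}\).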

The above result generalizes as follows. For any integer n ≥ 0, the number of paths (made up of n R’s and n U’s) going from (0, 0) to (n, n), without rising above the line y = x, is

\[
b_n = \binom{2n}{n} - \binom{2n}{n-1} = \frac{1}{n+1}\binom{2n}{n}, \quad n \ge 1, \qquad b_0 = 1.
\]

The numbers \(b_0, b_1, b_2, \ldots\) are called the Catalan numbers, after the Belgian mathematician Eugène Charles Catalan (1814-1894), who used them in determining the number of ways to parenthesize the product \(x_1 x_2 x_3 x_4 \cdots x_n\). For instance, the five (= \(b_3\)) ways to parenthesize \(x_1 x_2 x_3 x_4\) are:

\[
(((x_1 x_2) x_3) x_4) \quad ((x_1 (x_2 x_3)) x_4) \quad ((x_1 x_2)(x_3 x_4)) \quad (x_1 ((x_2 x_3) x_4)) \quad (x_1 (x_2 (x_3 x_4))).
\]

The first seven Catalan numbers are \(b_0 = 1\), \(b_1 = 1\), \(b_2 = 2\), \(b_3 = 5\), \(b_4 = 14\), \(b_5 = 42\), and \(b_6 = 132\).
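As a quick check on these values, the following Python sketch (an added illustration; the function names catalan and parenthesizations are hypothetical) evaluates \(b_n\) from the difference of binomial coefficients and also generates the parenthesizations of a short product directly.

```python
from math import comb

def catalan(n):
    """b_n computed as C(2n, n) - C(2n, n-1), with b_0 = 1."""
    if n == 0:
        return 1
    return comb(2 * n, n) - comb(2 * n, n - 1)

def parenthesizations(symbols):
    """All ways to fully parenthesize the product of the given symbols."""
    if len(symbols) == 1:
        return [symbols[0]]
    results = []
    # The outermost product splits the symbols into a left and a right factor.
    for split in range(1, len(symbols)):
        for left in parenthesizations(symbols[:split]):
            for right in parenthesizations(symbols[split:]):
                results.append("(" + left + right + ")")
    return results

print([catalan(n) for n in range(7)])                      # [1, 1, 2, 5, 14, 42, 132]
print(len(parenthesizations(["x1", "x2", "x3", "x4"])))    # 5
```

The second formula, \(\frac{1}{n+1}\binom{2n}{n}\), can be evaluated in the same way as comb(2 * n, n) // (n + 1); the integer division is exact.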

     EXAMPLE 1.43
Here are some other situations where the Catalan numbers arise. Some of these examples are very much like the result in Example 1.42. A change in vocabulary is often the only difference.

  a) In how many ways can one arrange three 1’s and three −1’s so that all six partial sums (starting with the first summand) are nonnegative? There are five (= \(b_3\)) such arrangements:

          1, 1, 1, −1, −1, −1        1, 1, −1, −1, 1, −1        1, −1, 1, 1, −1, −1
          1, 1, −1, 1, −1, −1        1, −1, 1, −1, 1, −1

     In general, for n ≥ 0, one can arrange n 1’s and n −1’s, with all 2n partial sums nonnegative, in \(b_n\) ways.
  b) Given four 1’s and four 0’s, there are 14 (= \(b_4\)) ways to list these eight symbols so that in each list the number of 0’s never exceeds the number of 1’s (as a list is read from left to right). The following shows these 14 lists:

          10101010        11001010        11100010
          10101100        11001100        11100100
          10110010        11010010        11101000
          10110100        11010100        11110000
          10111000        11011000

     For n ≥ 0, there are \(b_n\) such lists of n 1’s and n 0’s.
  c)                      Table 1.10

          (((ab)c)d)        (((abc        111000
          ((a(bc))d)        ((a(bc        110100
          ((ab)(cd))        ((ab(c        110010
          (a((bc)d))        (a((bc        101100
          (a(b(cd)))        (a(b(c        101010

     Consider the first column in Table 1.10. Here we find five ways to parenthesize the product abcd. The first of these is (((ab)c)d). Reading left to right, we list the three occurrences of the left parenthesis “(” and the letters a, b, c, maintaining the order in which these six symbols occur. This results in (((abc, the first expression in column 2 of Table 1.10. Likewise, ((a(bc))d) in column 1 corresponds to ((a(bc in column 2, and so on, for the other three entries in each of columns 1 and 2. Now one can also go backward, from column 2 to column 1. Take an expression in column 2 and append “d)” to the right end. For instance, ((ab(c becomes ((ab(cd). Reading this new expression from left to right, we now insert a right parenthesis “)” whenever a product of two results arises. So, for example, ((ab(cd) becomes

                                   ((ab)(cd))

     where the first right parenthesis is inserted for the product of a and b, and the last one for the product of (ab) and (cd).

     The correspondence between the entries in columns 2 and 3 is more immediate. For an entry in column 2 replace each “(” by a “1” and each letter by a “0”. Reversing this process, we replace each “1” by a “(”, the first 0 by a, the second by b, and the third by c. This takes us from the entries in column 3 to those in column 2.
         Now consider the correspondence between columns 1 and 3. (This correspondence arises from the correspondence between columns 1 and 2 and the one between columns 2 and 3.) It shows us that the number of ways to parenthesize the product abcd equals the number of ways to list three 1’s and three 0’s so that, as such a list is read from left to right, the number of 1’s always equals or exceeds the number of 0’s. The number of ways here is 5 (= \(b_3\)).
         In general, one can parenthesize the product \(x_1 x_2 x_3 \cdots x_n\) in \(b_{n-1}\) ways.
  d) Let us arrange the integers 1, 2, 3, 4, 5, 6 in two rows of three so that (1) the integers
     increase in value as each row is read, from left to right, and (2) in any column the
     smaller integer is on top. For example, one way to do this is

1   2    4
                                                    3    5    6

Now consider three 1’s and three 0’s. Arrange these six symbols in a list so that
     the 1’s are in positions   1, 2, 4 (the top row) and the 0’s are in positions 3, 5, 6 (the
     bottom row). The result is 110100. Reversing the process, start with another list, say
     101100 (where the number of 0’s never exceeds the number of 1’s, as the list is read
     from left to right). The 1’s are in positions 1, 3, 4 and the 0’s are in positions 2, 5, 6.
     This corresponds to the arrangement

1   3    4
                                                    2    5    6

     which satisfies conditions (1) and (2), as stated above. From this correspondence we learn that the number of ways to arrange 1, 2, 3, 4, 5, 6, so that conditions (1) and (2) are satisfied, is the number of ways to arrange three 1’s and three 0’s in a list so that as the six symbols are read, from left to right, the number of 0’s never exceeds the number of 1’s. Consequently, one can arrange 1, 2, 3, 4, 5, 6 and satisfy conditions (1) and (2) in \(b_3\) (= 5) ways.
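The correspondences in parts (b), (c), and (d) lend themselves to a short computational check. The Python sketch below is an added illustration under stated assumptions: the names ballot_lists, parenthesization_to_list, and list_to_rows are hypothetical, and each parenthesized product is assumed to be written with single-letter factors, as in Table 1.10.

```python
def ballot_lists(n):
    """All lists of n 1's and n 0's in which, reading left to right,
    the number of 0's never exceeds the number of 1's."""
    results = []

    def extend(prefix, ones, zeros):
        if ones == n and zeros == n:
            results.append(prefix)
            return
        if ones < n:                 # another 1 is always allowed
            extend(prefix + "1", ones + 1, zeros)
        if zeros < ones:             # a 0 is allowed only if it keeps zeros <= ones
            extend(prefix + "0", ones, zeros + 1)

    extend("", 0, 0)
    return results

def parenthesization_to_list(expr):
    """Column 1 to column 3 of Table 1.10: drop every ')' and the last letter,
    then write 1 for each '(' and 0 for each remaining letter."""
    kept = [ch for ch in expr if ch != ")"][:-1]
    return "".join("1" if ch == "(" else "0" for ch in kept)

def list_to_rows(bits):
    """Part (d): positions of the 1's give the top row, positions of the 0's the bottom row."""
    top = [i + 1 for i, b in enumerate(bits) if b == "1"]
    bottom = [i + 1 for i, b in enumerate(bits) if b == "0"]
    return top, bottom

print(ballot_lists(3))                          # ['111000', '110100', '110010', '101100', '101010']
print(len(ballot_lists(4)))                     # 14
print(parenthesization_to_list("((ab)(cd))"))   # 110010
print(list_to_rows("101100"))                   # ([1, 3, 4], [2, 5, 6])
```

The list produced by ballot_lists(3) agrees with column 3 of Table 1.10, and list_to_rows reproduces the two arrangements displayed in part (d).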

In closing let us mention that the Catalan numbers will come up in other sections — in
particular, Section 5 of Chapter 10. Further examples can be found in reference [3] by
M. Gardner. For even more results about these numbers one should consult the references
for Chapter 10.

b) Find, as in Example 1.43, the way to parenthesize
                                                                                      abcdef that corresponds to each given list of five 1’s and
                                                                                      five 0’s.
1. Verify that for each integer n > 1,
                                                                                              i)   1110010100

Cr) (ea)
2. Determine the value of 57, bg, bo, and hyo.
                                                                                             ii)
                                                                                            iti)
                                                                                                   1100110010
                                                                                                   1011100100
3. a) In how many ways can one travel in the xy-plane from
    (0, 0) to (3, 3) using the moves R: (x, y) → (x + 1, y) and
    U: (x, y) → (x, y + 1), if the path taken may touch but
    never fall below the line y = x? In how many ways from
    (0, 0) to (4, 4)?
    b) Generalize the results in part (a).
    c) What can one say about the first and last moves of the
    paths in parts (a) and (b)?

4. Consider the moves
    R: (x, y) → (x + 1, y)    and    U: (x, y) → (x, y + 1),
as in Example 1.42. In how many ways can one go
    a) from (0, 0) to (6, 6) and not rise above the line y = x?
    b) from (2, 1) to (7, 6) and not rise above the line y = x - 1?
    c) from (3, 8) to (10, 15) and not rise above the line y = x + 5?

5. Find the other three ways to arrange 1, 2, 3, 4, 5, 6 in two
rows of three so that the conditions in part (d) of Example 1.43
are satisfied.

6. There are b_4 (= 14) ways to arrange 1, 2, 3, ..., 8 in two
rows of four so that (1) the integers increase in value as each
row is read, from left to right, and (2) in any column the smaller
integer is on top. Find, as in part (d) of Example 1.43,
    a) the arrangements that correspond to each of the following.
        i) 10110010        ii) 11001010        iii) 11101000
    b) the lists of four 1’s and four 0’s that correspond to each
    of these arrangements of 1, 2, 3, ..., 8.
        i) 1 3 4 5         ii) 1 2 3 7         iii) 1 2 4 5
           2 6 7 8             4 5 6 8              3 6 7 8

7. In how many ways can one parenthesize the product
abcdef?

8. There are 132 ways in which one can parenthesize the
product abcdefg.
    a) Determine, as in part (c) of Example 1.43, the list of five
    1’s and five 0’s that corresponds to each of the following.
          i) (((ab)c)(d(ef)))
         ii) (a(b(c(d(ef)))))
        iii) ((((ab)(cd))e)f)

9. Consider drawing n semicircles on and above a horizontal
line, with no two semicircles intersecting. In parts (a) and (b)
of Fig. 1.10 we find the two ways this can be done for n = 2;
the results for n = 3 are shown in parts (c)-(g).

[Figure 1.10: parts (a)-(g), the drawings of nonintersecting semicircles for n = 2 and n = 3]

    i) How many different drawings are there for four semicircles?
    ii) How many for any n ≥ 0? Explain why.

10. a) In how many ways can one go from (0, 0) to (7, 3) if
    the only moves permitted are R: (x, y) → (x + 1, y) and
    U: (x, y) → (x, y + 1), and the number of U’s may never
    exceed the number of R’s along the path taken?
    b) Let m, n be positive integers with m > n. Answer the
    question posed in part (a), upon replacing 7 by m and 3
    by n.

11. Twelve patrons, six each with a $5 bill and the other six
each with a $10 bill, are the first to arrive at a movie theater,
where the price of admission is five dollars. In how many ways
can these 12 individuals (all loners) line up so that the number
with a $5 bill is never exceeded by the number with a $10 bill
(and, as a result, the ticket seller is always able to make any
necessary change from the bills taken in from the first 11 of
these 12 patrons)?

1.6
Summary and Historical Review
               In this first chapter we introduced the fundamentals for counting combinations, permuta-
               tions, and arrangements in a large variety of problems. The breakdown of problems into
               components requiring the same or different formulas for their solutions provided a key
               insight into the areas of discrete and combinatorial mathematics. This is somewhat similar
               to the top-down approach for developing algorithms in a structured programming lan-
               guage. Here one develops the algorithm for the solution of a difficult problem by first
               considering major subproblems that need to be solved. These subproblems are then further
               refined — subdivided into more easily workable programming tasks. Each level of refine-
               ment improves on the clarity, precision, and thoroughness of the algorithm until it is readily
               translatable into the code of the programming language.
                   Table 1.11 summarizes the major counting formulas we have developed so far. Here
               we are dealing with a collection of n distinct objects. The formulas count the number of
               ways to select, or order, with or without repetitions, r of these n objects. The summaries of
               Chapters 5 and 9 include other such charts that evolve as we extend our investigations into
               other counting methods.

Table 1.11

Order Is   Repetitions                                                                               Location
Relevant   Are Allowed   Type of Result                Formula                                       in Text

Yes        No            Permutation                   $P(n, r) = n!/(n - r)!$, $0 \le r \le n$        Page 7

Yes        Yes           Arrangement                   $n^r$, $n, r \ge 0$                             Page 7

No         No            Combination                   $C(n, r) = n!/[r!(n - r)!] = \binom{n}{r}$,     Page 15
                                                       $0 \le r \le n$

No         Yes           Combination with repetition   $\binom{n + r - 1}{r}$, $n, r \ge 0$            Page 27
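
As a quick numerical check of Table 1.11, the four counts can be computed with Python's standard library. This is only a minimal sketch: the function names are illustrative and do not come from the text, and math.perm and math.comb assume Python 3.8 or later.

```python
from math import comb, factorial, perm


def count_permutations(n: int, r: int) -> int:
    """Order relevant, no repetition: P(n, r) = n!/(n - r)!, 0 <= r <= n."""
    return perm(n, r)


def count_arrangements(n: int, r: int) -> int:
    """Order relevant, repetition allowed: n^r, with n, r >= 0."""
    return n ** r


def count_combinations(n: int, r: int) -> int:
    """Order not relevant, no repetition: C(n, r) = n!/[r!(n - r)!]."""
    return comb(n, r)


def count_combinations_with_repetition(n: int, r: int) -> int:
    """Order not relevant, repetition allowed: C(n + r - 1, r)."""
    if r == 0:
        return 1  # exactly one way to select nothing
    if n == 0:
        return 0  # no objects to select from
    return comb(n + r - 1, r)


if __name__ == "__main__":
    n, r = 5, 3
    # Cross-check the library call against the factorial formula in the table.
    assert count_permutations(n, r) == factorial(n) // factorial(n - r)
    print(count_permutations(n, r))                  # 60
    print(count_arrangements(n, r))                  # 125
    print(count_combinations(n, r))                  # 10
    print(count_combinations_with_repetition(n, r))  # 35
```

The two guard clauses in the last function are a design choice of this sketch: they handle the boundary cases r = 0 and n = 0 directly, since math.comb rejects a negative first argument.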

As we continue to investigate further principles of enumeration, as well as discrete
               mathematical structures for applications in coding theory, enumeration, optimization, and
               sorting schemes in computer science, we shall rely on the fundamental ideas introduced in
               this chapter.
                   The notion of permutation can be found in the Hebrew work Sefer Yetzirah (The Book of
               Creation), a manuscript written by a mystic sometime between 200 and 600. However, even
               earlier, it is of interest to note that a result of Xenocrates of Chalcedon (396-314 B.C.) may
               possibly contain “the first attempt on record to solve a difficult problem in permutations
               and combinations.” For further details consult page 319 of the text by T. L. Heath [4],
               as well as page 113 of the article by N. L. Biggs [1], a valuable source on the history
               of enumeration. The first textbook dealing with some of the material we discussed in this
               chapter was Ars Conjectandi by the Swiss mathematician Jakob Bernoulli (1654-1705). The
               text was published posthumously in 1713 and contained a reprint of the first formal treatise
on probability. This treatise had been written in 1657 by Christiaan Huygens (1629-1695),
                      the Dutch physicist, mathematician, and astronomer who discovered the rings of Saturn.
                          The binomial theorem for n = 2 appears in the work of Euclid (300 B.C.), but it was not
                      until the sixteenth century that the term “binomial coefficient” was actually introduced by
                      Michel Stifel (1486-1567). In his Arithmetica Integra (1544) he gives the binomial
                      coefficients up to the order of n = 17. Blaise Pascal (1623-1662), in his research on probability,
                      published in the 1650s a treatise dealing with the relationships among binomial coefficients,
                      combinations, and polynomials. These results were used by Jakob Bernoulli in proving the
                      general form of the binomial theorem in a manner analogous to that presented in this
                      chapter. Actual use of the symbol $\binom{n}{r}$ did not begin until the nineteenth century, when it was
                      used by Andreas von Ettinghausen (1796-1878).

Blaise Pascal (1623-1662)

It was not until the twentieth century, however, that the advent of the computer made
                      possible the systematic analysis of processes and algorithms used to generate permutations
                      and combinations. We shall examine one such algorithm in Section 10.1.
                         The first comprehensive textbook dealing with topics in combinations and permutations
                      was written by W. A. Whitworth [10]. Also dealing with the material of this chapter are
                      Chapter 2 of D. I. Cohen [2], Chapter 1 of C. L. Liu [5], Chapter 2 of F. S. Roberts [6],
                      Chapter 4 of K. H. Rosen [7], Chapter 1 of H. J. Ryser [8], and Chapter 5 of A. Tucker [9].

REFERENCES
 1. Biggs, Norman L. “The Roots of Combinatorics.” Historia Mathematica 6 (1979): pp. 109-136.
 2. Cohen, Daniel I. A. Basic Techniques of Combinatorial Theory. New York: Wiley, 1978.
 3. Gardner, Martin. “Mathematical Games, Catalan Numbers: An Integer Sequence that Materializes in Unexpected Places.” Scientific American 234, no. 6 (June 1976): pp. 120-125.
 4. Heath, Thomas Little. A History of Greek Mathematics, vol. 1. Reprint of the 1921 edition. New York: Dover Publications, 1981.
 5. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
 6. Roberts, Fred S. Applied Combinatorics. Englewood Cliffs, N.J.: Prentice-Hall, 1984.
 7. Rosen, Kenneth H. Discrete Mathematics and Its Applications, 5th ed. New York: McGraw-Hill, 2003.
 8. Ryser, H. J. Combinatorial Mathematics. Published by the Mathematical Association of America. New York: Wiley, 1963.
 9. Tucker, Alan. Applied Combinatorics, 4th ed. New York: Wiley, 2002.
10. Whitworth, W. A. Choice and Chance. Reprint of the 1901 edition. New York: Hafner, 1965.

SUPPLEMENTARY EXERCISES

1. In the manufacture of a certain type of automobile, four
kinds of major defects and seven kinds of minor defects can
occur. For those situations in which defects do occur, in how
many ways can there be twice as many minor defects as there
are major ones?

2. A machine has nine different dials, each with five settings
labeled 0, 1, 2, 3, and 4.
    a) In how many ways can all the dials on the machine be
    set?
    b) If the nine dials are arranged in a line at the top of the
    machine, how many of the machine settings have no two
    adjacent dials with the same setting?

3. Twelve points are placed on the circumference of a circle
and all the chords connecting these points are drawn. What is
the largest number of points of intersection for these chords?

4. A choir director must select six hymns for a Sunday church
service. She has three hymn books, each containing 25 hymns
(there are 75 different hymns in all). In how many ways can
she select the hymns if she wishes to select (a) two hymns from
each book? (b) at least one hymn from each book?

5. How many ways are there to place 25 different flags on
10 numbered flagpoles if the order of the flags on a flagpole is
(a) not relevant? (b) relevant? (c) relevant and every flagpole
flies at least one flag?

6. A penny is tossed 60 times yielding 45 heads and 15 tails.
In how many ways could this have happened so that there were
no consecutive tails?

7. There are 12 men at a dance. (a) In how many ways can
eight of them be selected to form a cleanup crew? (b) How
many ways are there to pair off eight women at the dance with
eight of these 12 men?

8. In how many ways can the letters in WONDERING be
arranged with exactly two consecutive vowels?

9. Dustin has a set of 180 distinct blocks. Each of these blocks
is made of either wood or plastic and comes in one of three sizes
(small, medium, large), five colors (red, white, blue, yellow,
green), and six shapes (triangular, square, rectangular, hexagonal,
octagonal, circular). How many of the blocks in this set
differ from
    a) the small red wooden square block in exactly one way?
    (For example, the small red plastic square block is one such
    block.)
    b) the large blue plastic hexagonal block in exactly two
    ways? (For example, the small red plastic hexagonal block
    is one such block.)

10. Mr. and Mrs. Richardson want to name their new daughter
so that her initials (first, middle, and last) will be in alphabetical
order with no repeated initial. How many such triples of initials
can occur under these circumstances?

11. In how many ways can the 11 identical horses on a carousel
be painted so that three are brown, three are white, and five are
black?

12. In how many ways can a teacher distribute 12 different
science books among 16 students if (a) no student gets more than
one book? (b) the oldest student gets two books but no other
student gets more than one book?

13. Four numbers are selected from the following list of numbers:
-5, -4, -3, -2, -1, 1, 2, 3, 4. (a) In how many ways can
the selections be made so that the product of the four numbers
is positive and (i) the numbers are distinct? (ii) each number
may be selected as many as four times? (iii) each number may
be selected at most three times? (b) Answer part (a) with the
product of the four numbers negative.

14. Waterbury Hall, a university residence hall for men, is
operated under the supervision of Mr. Kelly. The residence has
three floors, each of which is divided into four sections. This
coming fall Mr. Kelly will have 12 resident assistants (one for
each of the 12 sections). Among these 12 assistants are the four
senior assistants: Mr. DiRocco, Mr. Fairbanks, Mr. Hyland,
and Mr. Thornhill. (The other eight assistants will be new this
fall and are designated as junior assistants.) In how many ways
can Mr. Kelly assign his 12 assistants if
    a) there are no restrictions?
    b) Mr. DiRocco and Mr. Fairbanks must both be assigned
    to the first floor?
    c) Mr. Hyland and Mr. Thornhill must be assigned to
    different floors?

15. a) How many of the 9000 four-digit integers 1000, 1001,
    1002, ..., 9998, 9999 have four distinct digits that are
    either increasing (as in 1347 and 6789) or decreasing (as in
    6421 and 8653)?
    b) How many of the 9000 four-digit integers 1000, 1001,
    1002, ..., 9998, 9999 have four digits that are either
    nondecreasing (as in 1347, 1226, and 7778) or nonincreasing
    (as in 6421, 6622, and 9888)?

16. a) Find the coefficient of $x^2 y z^2$ in the expansion of
    $[(x/2) + y - 3z]^5$.

    b) How many distinct terms are there in the complete
    expansion of $[(x/2) + y - 3z]^5$?
    c) What is the sum of all coefficients in the complete
    expansion?

17. a) In how many ways can 10 people, denoted A, B, ...,
    I, J, be seated about the rectangular table shown in
    Fig. 1.11, where Figs. 1.11(a) and 1.11(b) are considered
    the same but are considered different from Fig. 1.11(c)?
    b) In how many of the arrangements of part (a) are A and B
    seated on longer sides of the table across from each other?

18. a) Determine the number of nonnegative integer solutions
    to the pair of equations

        $x_1 + x_2 + x_3 = 6$,    $x_1 + x_2 + \cdots + x_5 = 15$,
        $x_i \ge 0$,    $1 \le i \le 5$.

    b) Answer part (a) with the pair of equations replaced by
    the pair of inequalities

        $x_1 + x_2 + x_3 \le 6$,    $x_1 + x_2 + \cdots + x_5 \le 15$,
        $x_i \ge 0$,    $1 \le i \le 5$.

19. For any given set in a tennis tournament, opponent A can
beat opponent B in seven different ways. (At 6-6 they play a
tie breaker.) The first opponent to win three sets wins the
tournament. (a) In how many ways can scores be recorded with
A winning in five sets? (b) In how many ways can scores be
recorded with the tournament requiring at least four sets?

20. Given n distinct objects, determine in how many ways r of
these objects can be arranged in a circle, where arrangements
are considered the same if one can be obtained from the other
by rotation.

21. For every positive integer n, show that

        $\binom{n}{0} + \binom{n}{2} + \binom{n}{4} + \cdots = \binom{n}{1} + \binom{n}{3} + \binom{n}{5} + \cdots$

22. a) In how many ways can the letters in UNUSUAL be
    arranged?
    b) For the arrangements in part (a), how many have all
    three U’s together?
    c) How many of the arrangements in part (a) have no
    consecutive U’s?

23. Francesca has 20 different books but the shelf in her
dormitory residence will hold only 12 of them.
    a) In how many ways can Francesca line up 12 of these
    books on her bookshelf?
    b) How many of the arrangements in part (a) include
    Francesca’s three books on tennis?

24. Determine the value of the integer variable counter after
execution of the following program segment. (Here i, j, k, l,
m, and n are integer variables. The variables r, s, and t are
also integer variables; their values, where r ≥ 1, s ≥ 5, and
t ≥ 7, have been set prior to this segment.)

        counter := 10
        for i := 1 to 12 do
            for j := 1 to r do
                counter := counter + 2
        for k := 5 to s do
            for l := 3 to k do
                counter := counter + 4
        for m := 3 to 12 do
            counter := counter + 6
        for n := t downto 7 do
            counter := counter + 8

25. a) Find the number of ways to write 17 as a sum of 1’s and
    2’s if order is relevant.
    b) Answer part (a) for 18 in place of 17.
    c) Generalize the results in parts (a) and (b) for n odd and
    for n even.

[Figure 1.11: three seatings, (a), (b), and (c), of the ten people A through J about a rectangular table]

26. a) In how many ways can 17 be written as a sum of 2’s
    and 3’s if the order of the summands is (i) not relevant?
    (ii) relevant?
    b) Answer part (a) for 18 in place of 17.

27. a) If n and r are positive integers with n ≥ r, how many
    solutions are there to

        $x_1 + x_2 + \cdots + x_r = n$,

    where each $x_i$ is a positive integer, for 1 ≤ i ≤ r?
    b) In how many ways can a positive integer n be written
    as a sum of r positive integer summands (1 ≤ r ≤ n) if the
    order of the summands is relevant?

28. a) In how many ways can one travel in the xy-plane from
    (1, 2) to (5, 9) if each move is one of the following types:

        (R): (x, y) → (x + 1, y);    (U): (x, y) → (x, y + 1)?

    b) Answer part (a) if a third (diagonal) move

        (D): (x, y) → (x + 1, y + 1)

    is also possible.

29. a) In how many ways can a particle move in the xy-plane
    from the origin to the point (7, 4) if the moves that are
    allowed are of the form:

        (R): (x, y) → (x + 1, y);    (U): (x, y) → (x, y + 1)?

    b) How many of the paths in part (a) do not use the path
    from (2, 2) to (3, 2) to (4, 2) to (4, 3) shown in Fig. 1.12?
    c) Answer parts (a) and (b) if a third type of move

        (D): (x, y) → (x + 1, y + 1)

    is also allowed.

[Figure 1.12: grid in the xy-plane showing the path from (2, 2) to (3, 2) to (4, 2) to (4, 3)]

30. Due to their outstanding academic records, Donna and
Katalin are the finalists for the outstanding physics student (in
their college graduating class). A committee of 14 faculty
members will each select one of the candidates to be the winner and
place his or her choice (checked off on a ballot) into the
ballot box. Suppose that Katalin receives nine votes and Donna
receives five. In how many ways can the ballots be selected,
one at a time, from the ballot box so that there are always more
votes in favor of Katalin? [This is a special case of a general
problem called, appropriately, the ballot problem. This problem
was solved by Joseph Louis François Bertrand (1822-1900).]

31. Consider the 8 × 5 grid shown in Fig. 1.13. How many
different rectangles (with integer-coordinate corners) does this
grid contain? [For example, there is a rectangle (square) with
corners (1, 1), (2, 1), (2, 2), (1, 2), a second rectangle with
corners (3, 2), (4, 2), (4, 4), (3, 4), and a third with corners (5, 0),
(7, 0), (7, 3), (5, 3).]

[Figure 1.13: the 8 × 5 grid in the xy-plane]

32. As head of quality control, Silvia examined 15 motors, one
at a time, and found six defective (D) motors and nine in good
(G) working condition. If she listed each finding (of D or G)
after examining each individual motor, in how many ways could
Silvia’s list start with a run of three G’s and have six runs in
total?

33. In order to graduate on schedule, Hunter must take (and
pass) four mathematics electives during his final six quarters. If
he may select these electives from a list of 12 (that are offered
every quarter) and he does not want to take more than one of
these electives in any given quarter, in how many ways can he
select and schedule these four electives?

34. In how many ways can a family of four (mother, father,
and two children) be seated at a round table, with eight other
people, so that the parents are seated next to each other and
there is one child on a side of each parent? (Two seatings are
considered the same if one can be rotated to look like the other.)
  Fundamentals
         of Logic

                  In the first chapter we derived a summation formula in Example 1.40 (Section 1.4). We
                     obtained this formula by counting the same collection of objects (the statements that were
                  executed in a certain program segment) in two different ways and then equating the results.
                  Consequently, we say that the formula was established by a combinatorial proof. This is
                  one of many different techniques for arriving at a proof.
                      In this chapter we take a close look at what constitutes a valid argument and a more
                  conventional proof. When a mathematician wishes to provide a proof for a given situation,
                  he or she must use a system of logic. This is also true when a computer scientist develops
                  the algorithms needed for a program or system of programs. The logic of mathematics is
                  applied to decide whether one statement follows from, or is a logical consequence of, one
                  or more other statements.
                      Some of the rules that govern this process are described in this chapter. We shall use these
                  rules in proofs (provided in the text and required in the exercises) throughout subsequent
                  chapters. However, at no time can we hope to arrive at a point at which we can apply the
                  rules in an automatic fashion. As in applying the counting ideas discussed in Chapter 1,
                  we should always analyze and seek to understand the situation given. This often calls for
                  attributes we cannot learn in a book, such as insight and creativity. Merely trying to apply
                  formulas or invoke rules will not get us very far either in proving results (such as theorems)
                  or in doing enumeration problems.

2.1
Basic Connectives and Truth Tables
                  In the development of any mathematical theory, assertions are made in the form of sen-
                  tences. Such verbal or written assertions, called statements (or propositions), are declarative
                  sentences that are either true or false, but not both. For example, the following are
                  statements, and we use the lowercase letters of the alphabet (such as p, q, and r) to represent
                  these statements.

                                   p:   Combinatorics is a required course for sophomores.
                                   q:   Margaret Mitchell wrote Gone with the Wind.
                                   r:   2 + 3 = 5.


On the other hand, we do not regard sentences such as the exclamation

“What a beautiful evening!”

or the command

“Get up and do your exercises.”
                     as statements since they do not have truth values (true or false).
                        The preceding statements represented by the letters p, q, and r are considered to be
                     primitive statements, for there is really no way to break them down into anything simpler.
                     New statements can be obtained from existing ones in two ways.

1) Transform a given statement p into the statement ¬p, which denotes its negation and
   is read “Not p.”
   For the statement p above, ¬p is the statement “Combinatorics is not a required
   course for sophomores.” (We do not consider the negation of a primitive statement
   to be a primitive statement.)
2) Combine two or more statements into a compound statement, using the following
   logical connectives.
   a) Conjunction: The conjunction of the statements p, q is denoted by p ∧ q, which
      is read “p and q.” In our example the compound statement p ∧ q is read
      “Combinatorics is a required course for sophomores, and Margaret Mitchell wrote
      Gone with the Wind.”
   b) Disjunction: The expression p ∨ q denotes the disjunction of the statements p, q
      and is read “p or q.” Hence “Combinatorics is a required course for sophomores,
      or Margaret Mitchell wrote Gone with the Wind” is the verbal translation for
      p ∨ q, when p, q are as above. We use the word “or” in the inclusive sense here.
      Consequently, p ∨ q is true if one or the other of p, q is true or if both of the
      statements p, q are true. In English we sometimes write “and/or” to point this out.
      The exclusive “or” is denoted by p ⊻ q. The compound statement p ⊻ q is true if
      exactly one of p, q is true, that is, if one or the other is true but not both. One
      way to express p ⊻ q for the example here is “Combinatorics is a required course
      for sophomores, or Margaret Mitchell wrote Gone with the Wind, but not both.”
   c) Implication: We say that “p implies q” and write p → q to designate the statement
      that is the implication of q by p. Alternatively, we can also say
        (i) “If p, then q.”                          (ii) “p is sufficient for q.”
      (iii) “p is a sufficient condition for q.”     (iv) “q is necessary for p.”
        (v) “q is a necessary condition for p.”      (vi) “p only if q.”
      A verbal translation of p → q for our example is “If combinatorics is a required
      course for sophomores, then Margaret Mitchell wrote Gone with the Wind.” The
      statement p is called the hypothesis of the implication; q is called the
      conclusion. When statements are combined in this manner, there need not be any causal
      relationship between the statements for the implication to be true.
   d) Biconditional: Last, the biconditional of two statements p, q is denoted by p ↔ q,
      which is read “p if and only if q,” or “p is necessary and sufficient for q.” For
      our p, q, “Combinatorics is a required course for sophomores if and only if
      Margaret Mitchell wrote Gone with the Wind” conveys the meaning of p ↔ q.
      We sometimes abbreviate “p if and only if q” as “p iff q.”

Throughout our discussion on logic we must realize that a sentence such as

“The number x is an integer.”

is not a statement because its truth value (true or false) cannot be determined until a
numerical value is assigned for x. If x were assigned the value 7, the result would be a true
statement. Assigning x a value such as 1/2, √2, or π, however, would make the resulting
statement false. (We shall encounter this type of situation again in Sections 2.4 and 2.5 of
this chapter.)

In the foregoing discussion, we mentioned the circumstances under which the compound
statements p ∨ q, p ⊻ q are considered true, on the basis of the truth of their components
p, q. This idea of the truth or falsity of a compound statement being dependent only on the
truth values of its components is worth further investigation. Tables 2.1 and 2.2 summarize
the truth and falsity of the negation and the different kinds of compound statements on the
basis of the truth values of their components. In constructing such truth tables, we write
“0” for false and “1” for true.

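Table 2.1

p | ¬p
--+---
0 |  1
1 |  0

Table 2.2

p  q | p ∧ q | p ∨ q | p ⊻ q | p → q | p ↔ q
-----+-------+-------+-------+-------+------
0  0 |   0   |   0   |   0   |   1   |   1
0  1 |   0   |   1   |   1   |   1   |   0
1  0 |   0   |   1   |   1   |   0   |   0
1  1 |   1   |   1   |   0   |   1   |   1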

The four possible truth assignments for p, q can be listed in any order. For later work,
the particular order presented here will prove useful.
We see that the columns of truth values for p and ¬p are the opposite of each other. The
statement p ∧ q is true only when both p, q are true, whereas p ∨ q is false only when both
the component statements p, q are false. As we noted before, p ⊻ q is true when exactly
one of p, q is true.
For the implication p → q, the result is true in all cases except where p is true and q
is false. We do not want a true statement to lead us into believing something that is false.
However, we regard as true a statement such as “If 2 + 3 = 6, then 2 + 4 = 7,” even though
the statements “2 + 3 = 6” and “2 + 4 = 7” are both false.
Finally, the biconditional p ↔ q is true when the statements p, q have the same truth
value and is false otherwise.
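
The entries of Tables 2.1 and 2.2 can also be generated mechanically. The following Python
sketch is our own illustration and not part of the text's development: the helper names
(neg, conj, disj, xor, implies, iff, table_2_2) are hypothetical, and truth values are
represented by the integers 0 and 1, as in the tables.

```python
from itertools import product

# Each connective acts on the truth values 0 (false) and 1 (true).
def neg(p):
    return 1 - p

def conj(p, q):
    return p & q

def disj(p, q):
    return p | q

def xor(p, q):
    return p ^ q

def implies(p, q):
    # False only when p is 1 and q is 0.
    return (1 - p) | q

def iff(p, q):
    return 1 if p == q else 0

def table_2_2():
    """Rows (p, q, p AND q, p OR q, p XOR q, p -> q, p <-> q) in the order 00, 01, 10, 11."""
    return [(p, q, conj(p, q), disj(p, q), xor(p, q), implies(p, q), iff(p, q))
            for p, q in product((0, 1), repeat=2)]

for row in table_2_2():
    print(*row)
```

Listing the assignments with product((0, 1), repeat=2) reproduces the row order used in
Table 2.2.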

Now that we have been introduced to certain concepts, let us investigate a little further
              some of these initial ideas about connectives. Our first two examples should prove useful
              for such an investigation.

EXAMPLE 2.1
Let s, t, and u denote the following primitive statements:

s:   Phyllis goes out for a walk.
t:   The moon is out.
u:   It is snowing.

The following English sentences provide possible translations for the given (symbolic)
compound statements.
   a) (t ∧ ¬u) → s: If the moon is out and it is not snowing, then Phyllis goes out for a
      walk.

   b) t → (¬u → s): If the moon is out, then if it is not snowing Phyllis goes out for a
      walk. [So ¬u → s is understood to mean (¬u) → s as opposed to ¬(u → s).]
   c) ¬(s ↔ (u ∨ t)): It is not the case that Phyllis goes out for a walk if and only if it is
      snowing or the moon is out.

Now we will work in reverse order and examine the logical (or symbolic) notation for
                          three given English sentences:

   d) “Phyllis will go out walking if and only if the moon is out.” Here the words “if
      and only if” indicate that we are dealing with a biconditional. In symbolic form this
      becomes s ↔ t.
   e) “If it is snowing and the moon is not out, then Phyllis will not go out for a walk.”
      This compound statement is an implication where the hypothesis is also a compound
      statement. One may express this statement in symbolic form as (u ∧ ¬t) → ¬s.
   f) “It is snowing but Phyllis will still go out for a walk.” Now we come across a new
      connective, namely, but. In our study of logic we shall follow the convention that
      the connectives but and and convey the same meaning. Consequently, this sentence
      may be represented as u ∧ s.
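
To see how such symbolic forms behave once truth values are known, the translations in parts
(a), (d), (e), and (f) can be written as small Python functions. This is only a sketch: the
function names and the sample evening chosen below are our own assumptions, not part of the
example.

```python
def implies(p, q):
    # p -> q is false only when p is true and q is false.
    return (not p) or q

def part_a(s, t, u):
    return implies(t and not u, s)        # (t AND NOT u) -> s

def part_d(s, t, u):
    return s == t                         # s <-> t

def part_e(s, t, u):
    return implies(u and not t, not s)    # (u AND NOT t) -> NOT s

def part_f(s, t, u):
    return u and s                        # u AND s

# A hypothetical evening: Phyllis walks, the moon is out, and it is not snowing.
s, t, u = True, True, False
print(part_a(s, t, u), part_d(s, t, u), part_e(s, t, u), part_f(s, t, u))
```

For this particular assignment the first three statements are true and the last is false,
since it is not snowing.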

Now let us return to the results in Table 2.2, particularly the sixth column. If this is
one’s first encounter with the truth table for the implication p → q, then it may be somewhat
difficult to accept the stated entries, especially the results in the first two rows (where p
has the truth value 0). The following example should help make these truth value assignments
easier to grasp.

EXAMPLE 2.2
Consider the following scenario. It is almost the week before Christmas and Penny will be
attending several parties that week. Ever conscious of her weight, she plans not to weigh
herself until the day after Christmas. Considering what those parties may do to her waistline
by then, she makes the following resolution for the December 26 outcome: “If I weigh more
than 120 pounds, then I shall enroll in an exercise class.”
Here we let p and q denote the (primitive) statements

p:   I weigh more than 120 pounds.
q:   I shall enroll in an exercise class.

Then Penny’s statement (implication) is given by p → q.
We shall consider the truth values of this particular example of p → q for the rows of
Table 2.2. Consider first the easier cases in rows 4 and 3.

• Row 4: p and q both have the truth value 1. On December 26 Penny finds that she
  weighs more than 120 pounds and promptly enrolls in an exercise class, just as she said
  she would. Here we consider p → q to be true and assign it the truth value 1.
• Row 3: p has the truth value 1, q has the truth value 0. Now that December 26 has
  arrived, Penny finds her weight to be over 120 pounds, but she makes no attempt to enroll
  in an exercise class. In this case we feel that Penny has broken her resolution; in other
  words, the implication p → q is false (and has the truth value 0).
The cases in rows 1 and 2 may not immediately agree with our intuition, but the example
should make these results a little easier to accept.

• Row 1: p and q both have the truth value 0. Here Penny finds that on December 26
  her weight is 120 pounds or less and she does not enroll in an exercise class. She has not
  violated her resolution; we take her statement p → q to be true and assign it the truth
  value 1.
• Row 2: p has the truth value 0, q has the truth value 1. This last case finds Penny
  weighing 120 pounds or less on December 26 but still enrolling in an exercise class.
  Perhaps her weight is 119 or 120 pounds and she feels this is still too high. Or maybe
  she wants to join an exercise class because she thinks it will be good for her health. No
  matter what the reason, she has not gone against her resolution p → q. Once again, we
  accept this compound statement as true, assigning it the truth value 1.
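
The four cases can be summarized in a few lines of Python. This is a sketch under our own
naming: resolution_kept and outcomes are hypothetical names, and True and False stand for
the truth values 1 and 0.

```python
def resolution_kept(weighs_over_120, enrolls):
    """Truth value of p -> q: false only when p is true and q is false."""
    return (not weighs_over_120) or enrolls

# Enumerate the rows of Table 2.2 in order: (0,0), (0,1), (1,0), (1,1).
outcomes = {(p, q): resolution_kept(p, q)
            for p in (False, True) for q in (False, True)}

for (p, q), kept in outcomes.items():
    print(f"p={int(p)} q={int(q)} resolution kept: {kept}")
```

Only the assignment p = 1, q = 0 (row 3) reports a broken resolution.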

Our next example discusses a related notion: the decision (or selection) structure in
computer programming.

EXAMPLE 2.3
In computer science the if-then and if-then-else decision structures arise (in various
formats) in high-level programming languages such as Java and C++. The hypothesis p is often
a relational expression such as x > 2. This expression then becomes a (logical) statement
that has the truth value 0 or 1, depending on the value of the variable x at that point in
the program. The conclusion q is usually an “executable statement.” (So q is not one of
the logical statements that we have been discussing.) When dealing with “if p then q,” in
this context, the computer executes q only on the condition that p is true. For p false, the
computer goes to the next instruction in the program sequence. For the decision structure
“if p then q else r,” q is executed when p is true and r is executed when p is false.
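
The behavior just described can be traced in a short program. The sketch below uses Python
rather than Java or C++ only for brevity; the function names and the trace list are
hypothetical devices of ours, with the strings "q" and "r" standing in for the executable
statements q and r.

```python
def if_then(x):
    """Model 'if p then q' with p the relational expression x > 2."""
    trace = []
    if x > 2:                 # p is evaluated to true or false at this point
        trace.append("q")     # q is executed only when p is true
    trace.append("next")      # control then passes to the next instruction
    return trace

def if_then_else(x):
    """Model 'if p then q else r' with the same hypothesis p."""
    trace = []
    if x > 2:
        trace.append("q")     # executed when p is true
    else:
        trace.append("r")     # executed when p is false
    trace.append("next")
    return trace

print(if_then(5), if_then(1))
print(if_then_else(5), if_then_else(1))
```

Note that when p is false the if-then structure does nothing at all, which is quite
different from assigning the truth value 1 to a logical implication with a false hypothesis.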

Before continuing, a word of caution: Be careful when using the symbols → and ↔. The
implication and the biconditional are not the same, as evidenced by the last two columns
of Table 2.2.
                 In our everyday language, however, we often find situations where an implication is used
              when the intention actually calls for a biconditional. For example, consider the following
              implications that a certain parent might direct to his or her child.

s → t:   If you do your homework, then you will get to watch the baseball game.
t → s:   You will get to watch the baseball game only if you do your homework.

• Case 1: The implication s → t. When the parent says to the child, “If you do your
  homework, then you will get to watch the baseball game,” he or she is trying a positive
  approach by emphasizing the enjoyment in watching the baseball game.
• Case 2: The implication t → s. Here we find the negative approach and the parent who
  warns the child in saying, “You will get to watch the baseball game only if you do your
  homework.” This parent places the emphasis on the punishment (lack of enjoyment) to
  be incurred.

In either case, the parent probably wants his or her implication, be it s → t or t → s,
to be understood as the biconditional s ↔ t. For in case 1 the parent wants to hint at the
punishment while promising the enjoyment; in case 2, where the punishment has been
used (perhaps, to threaten), if the child does in fact do the homework, then that child will
definitely be given the opportunity to enjoy watching the baseball game.
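
A side-by-side listing makes the difference among the three statements concrete. The Python
sketch below is our own illustration (the names implies and rows are hypothetical); it
tabulates s → t, t → s, and s ↔ t for the four assignments of s and t.

```python
def implies(a, b):
    # a -> b is false only when a is true and b is false.
    return (not a) or b

rows = []
for s in (False, True):
    for t in (False, True):
        rows.append((s, t, implies(s, t), implies(t, s), s == t))

# Columns: s, t, s -> t, t -> s, s <-> t
for s, t, s_to_t, t_to_s, s_iff_t in rows:
    print(int(s), int(t), int(s_to_t), int(t_to_s), int(s_iff_t))
```

In this listing the biconditional column is 1 exactly in the rows where both implication
columns are 1, which is why a parent who intends both promises really means s ↔ t.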

In scientific writing one must make every effort to be unambiguous: when an implication
is given, it ordinarily cannot, and should not, be interpreted as a biconditional.
Definitions are a notable exception, which we shall discuss in Section 2.5.

Before we continue let us take a step back. When we summarized the material that
gave us Tables 2.1 and 2.2, we may not have stressed enough that the results were for any
statements p, q, not just primitive statements p, q. Examples 2.4 through 2.6 should help
to reinforce this.

EXAMPLE 2.4
Let us examine the truth table for the compound statement “Margaret Mitchell wrote Gone
with the Wind, and if 2 + 3 ≠ 5, then combinatorics is a required course for sophomores.”
In symbolic notation this statement is written as q ∧ (¬r → p), where p, q, and r represent
the primitive statements introduced at the start of this section. The last column of Table 2.3
contains the truth values for this result. We obtained these truth values by using the fact
that the conjunction of any two statements is true if and only if both statements are true.
This is what we said earlier in Table 2.2, and now one of our statements (namely, the
implication ¬r → p) is definitely a compound statement, not a primitive one. Columns
4, 5, and 6 in this table show how we build the truth table up by considering smaller parts
of the compound statement and by using the results from Tables 2.1 and 2.2.

Table 2.3

p  q  r | ¬r | ¬r → p | q ∧ (¬r → p)
--------+----+--------+-------------
0  0  0 |  1 |   0    |      0
0  0  1 |  0 |   1    |      0
0  1  0 |  1 |   0    |      0
0  1  1 |  0 |   1    |      1
1  0  0 |  1 |   1    |      0
1  0  1 |  0 |   1    |      0
1  1  0 |  1 |   1    |      1
1  1  1 |  0 |   1    |      1
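
The column-by-column construction of Table 2.3 can be mirrored in code. In the Python sketch
below (our own illustration; implies and rows are hypothetical names) each intermediate
variable corresponds to one of columns 4, 5, and 6.

```python
from itertools import product

def implies(a, b):
    # On 0/1 values: a -> b is 0 only when a is 1 and b is 0.
    return (1 - a) | b

rows = []
for p, q, r in product((0, 1), repeat=3):
    not_r = 1 - r                          # column 4: NOT r
    not_r_implies_p = implies(not_r, p)    # column 5: (NOT r) -> p
    whole = q & not_r_implies_p            # column 6: q AND ((NOT r) -> p)
    rows.append((p, q, r, not_r, not_r_implies_p, whole))

for row in rows:
    print(*row)
```

Because the conjunction in column 6 is computed from column 5, the code treats ¬r → p as
just another statement with a truth value, compound or not.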

EXAMPLE 2.5
In Table 2.4 we develop the truth tables for the compound statements p ∨ (q ∧ r)
(column 5) and (p ∨ q) ∧ r (column 7).

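Table 2.4

p  q  r | q ∧ r | p ∨ (q ∧ r) | p ∨ q | (p ∨ q) ∧ r
--------+-------+-------------+-------+------------
0  0  0 |   0   |      0      |   0   |     0
0  0  1 |   0   |      0      |   0   |     0
0  1  0 |   0   |      0      |   1   |     0
0  1  1 |   1   |      1      |   1   |     1
1  0  0 |   0   |      1      |   1   |     0
1  0  1 |   0   |      1      |   1   |     1
1  1  0 |   0   |      1      |   1   |     0
1  1  1 |   1   |      1      |   1   |     1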

Because the truth values in columns 5 and 7 differ (in rows 5 and 7), we must avoid
writing a compound statement such as p ∨ q ∧ r. Without parentheses to indicate which of
the connectives ∨ and ∧ should be applied first, we have no idea whether we are dealing
with p ∨ (q ∧ r) or (p ∨ q) ∧ r.
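
The rows in which the two parenthesizations disagree can be found by direct enumeration. The
following Python sketch is our own check, with hypothetical variable names; rows are numbered
from 1 in the order used in Table 2.4.

```python
from itertools import product

differing_rows = []
for row_number, (p, q, r) in enumerate(product((0, 1), repeat=3), start=1):
    col5 = p | (q & r)      # p OR (q AND r)
    col7 = (p | q) & r      # (p OR q) AND r
    if col5 != col7:
        differing_rows.append(row_number)

print(differing_rows)
```

The enumeration reports rows 5 and 7, the assignments in which p is true and r is false.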

Our last example for this section illustrates two special types of statements.

EXAMPLE 2.6
The results in columns 4 and 7 of Table 2.5 reveal that the statement p → (p ∨ q) is true and
that the statement p ∧ (¬p ∧ q) is false for all truth value assignments for the component
statements p, q.

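Table 2.5

p  q | p ∨ q | p → (p ∨ q) | ¬p | ¬p ∧ q | p ∧ (¬p ∧ q)
-----+-------+-------------+----+--------+-------------
0  0 |   0   |      1      |  1 |   0    |      0
0  1 |   1   |      1      |  1 |   1    |      0
1  0 |   1   |      1      |  0 |   0    |      0
1  1 |   1   |      1      |  0 |   0    |      0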

Definition 2.1   A compound statement is called a tautology if it is true for all truth value assignments for
                 its component statements. If a compound statement is false for all such assignments, then
                 it is called a contradiction.
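
Definition 2.1 translates directly into an exhaustive check over all truth value assignments.
The Python sketch below is only an illustration: is_tautology and is_contradiction are
hypothetical helper names, and a compound statement is assumed to be given as a function of
its n primitive components.

```python
from itertools import product

def implies(a, b):
    # a -> b is false only when a is true and b is false.
    return (not a) or b

def is_tautology(statement, n):
    """True if statement is true for all 2**n assignments to its n components."""
    return all(statement(*values) for values in product((False, True), repeat=n))

def is_contradiction(statement, n):
    """True if statement is false for all 2**n assignments to its n components."""
    return not any(statement(*values) for values in product((False, True), repeat=n))

print(is_tautology(lambda p, q: implies(p, p or q), 2))          # column 4 of Table 2.5
print(is_contradiction(lambda p, q: p and ((not p) and q), 2))   # column 7 of Table 2.5
print(is_tautology(lambda p, q: p and q, 2))                     # neither kind
```

A statement such as p ∧ q is neither a tautology nor a contradiction, so both checks can
fail for the same statement.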

Throughout this chapter we shall use the symbol T₀ to denote any tautology and the
symbol F₀ to denote any contradiction.
We can use the ideas of tautology and implication to describe what we mean by a valid
argument. This will be of primary interest to us in Section 2.3, and it will help us develop
needed skills for proving mathematical theorems. In general, an argument starts with a list
of given statements called premises and a statement called the conclusion of the argument.
We examine these premises, say p₁, p₂, p₃, ..., pₙ, and try to show that the conclusion
q follows logically from these given statements; that is, we try to show that if each of
p₁, p₂, p₃, ..., pₙ is a true statement, then the statement q is also true. One way to do so
is to examine the implication

(p₁ ∧ p₂ ∧ p₃ ∧ ⋯ ∧ pₙ)† → q,

where the hypothesis is the conjunction of the n premises. If any one of p₁, p₂, p₃, ..., pₙ is
false, then no matter what truth value q has, the implication (p₁ ∧ p₂ ∧ p₃ ∧ ⋯ ∧ pₙ) → q
is true. Consequently, if we start with the premises p₁, p₂, p₃, ..., pₙ, each with truth
value 1, and find that under these circumstances q also has the value 1, then the implication

(p₁ ∧ p₂ ∧ p₃ ∧ ⋯ ∧ pₙ) → q

is a tautology and we have a valid argument.
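
This description of validity can be tested by brute force when the number of primitive
statements is small. The Python sketch below is our own illustration: is_valid is a
hypothetical helper, and the two sample arguments (premises p and p → q with conclusion q,
then premises q and p → q with conclusion p) are assumptions chosen for the demonstration,
not examples taken from this section.

```python
from itertools import product

def implies(a, b):
    # a -> b is false only when a is true and b is false.
    return (not a) or b

def is_valid(premises, conclusion, num_vars):
    """Check that (conjunction of the premises) -> conclusion is a tautology."""
    for values in product((False, True), repeat=num_vars):
        hypothesis = all(premise(*values) for premise in premises)
        if not implies(hypothesis, conclusion(*values)):
            return False      # some assignment makes the premises true, the conclusion false
    return True

# Premises p and p -> q, conclusion q.
print(is_valid([lambda p, q: p, lambda p, q: implies(p, q)], lambda p, q: q, 2))
# Premises q and p -> q, conclusion p.
print(is_valid([lambda p, q: q, lambda p, q: implies(p, q)], lambda p, q: p, 2))
```

The second argument fails because the assignment p = 0, q = 1 makes both premises true and
the conclusion false.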

†At this point we have dealt only with the conjunction of two statements, so we must point out that the
conjunction p₁ ∧ p₂ ∧ p₃ ∧ ⋯ ∧ pₙ of n statements is true if and only if each pᵢ, 1 ≤ i ≤ n, is true. We shall
deal with this generalized conjunction in detail in Example 4.16 of Section 4.2.

EXERCISES 2.1

1. Determine whether each of the following sentences is a statement.
     a) In 2003 George W. Bush was the president of the United States.
     b) x + 3 is a positive integer.
     c) Fifteen is an even number.
     d) If Jennifer is late for the party, then her cousin Zachary will be quite angry.
     e) What time is it?
     f) As of June 30, 2003, Christine Marie Evert had won the French Open a record
        seven times.

2. Identify the primitive statements in Exercise 1.

3. Let p, q be primitive statements for which the implication p → q is false. Determine
the truth values for each of the following.
     a) p ∧ q          b) ¬p ∨ q          c) q → p          d) ¬q → ¬p

4. Let p, q, r, s denote the following statements:
     p:   I finish writing my computer program before lunch.
     q:   I shall play tennis in the afternoon.
     r:   The sun is shining.
     s:   The humidity is low.
Write the following in symbolic form.
     a) If the sun is shining, I shall play tennis this afternoon.
     b) Finishing the writing of my computer program before lunch is necessary for my
        playing tennis this afternoon.
     c) Low humidity and sunshine are sufficient for me to play tennis this afternoon.

5. Let p, q, r denote the following statements about a particular triangle ABC.
     p:   Triangle ABC is isosceles.
     q:   Triangle ABC is equilateral.
     r:   Triangle ABC is equiangular.
Translate each of the following into an English sentence.
     a) q → p          b) ¬p → ¬q
     c) q ↔ r          d) p ∧ ¬q
     e) r → p

6. Determine the truth value of each of the following implications.
     a) If 3 + 4 = 12, then 3 + 2 = 6.
     b) If 3 + 3 = 6, then 3 + 4 = 9.
     c) If Thomas Jefferson was the third president of the United States, then 2 + 3 = 5.

7. Rewrite each of the following statements as an implication in the if-then form.
     a) Practicing her serve daily is a sufficient condition for Darci to have a good
        chance of winning the tennis tournament.
     b) Fix my air conditioner or I won’t pay the rent.
     c) Mary will be allowed on Larry’s motorcycle only if she wears her helmet.

8. Construct a truth table for each of the following compound statements, where p, q, r
denote primitive statements.
     a) ¬(p ∨ ¬q) → ¬p                    b) p → (q → r)
     c) (p → q) → r                       d) (p → q) → (q → p)
     e) [p ∧ (p → q)] → q                 f) (p ∧ q) → p
     g) q → (¬p ∨ ¬q)
     h) [(p → q) ∧ (q → r)] → (p → r)

9. Which of the compound statements in Exercise 8 are tautologies?

10. Verify that [p → (q → r)] → [(p → q) → (p → r)] is a tautology.

11. a) How many rows are needed for the truth table of the compound statement
        (p ∨ ¬q) ↔ [(¬r ∧ s) → t], where p, q, r, s, and t are primitive statements?
     b) Let p₁, p₂, ..., pₙ denote n primitive statements. Let p be a compound statement
        that contains at least one occurrence each of pᵢ, for 1 ≤ i ≤ n, and p contains no
        other primitive statement. How many rows are needed to construct the truth table
        for p?

12. Determine all truth value assignments, if any, for the primitive statements p, q, r, s, t
that make each of the following compound statements false.
     a) [(p ∧ q) ∧ r] → (s ∨ t)
     b) [p ∧ (q ∧ r)] → (s ⊻ t)

13. If statement q has the truth value 1, determine all truth value assignments for the
primitive statements p, r, and s for which the truth value of the statement

     (q → [(¬p ∨ r) ∧ ¬s]) ∧ [¬s → (¬r ∧ q)]

is 1.

14. At the start of a program (written in pseudocode) the integer variable n is assigned the
value 7. Determine the value of n after each of the following successive statements is
encountered during the execution of this program. [Here the value of n following the
execution of the statement in part (a) becomes the value of n for the statement in part (b),
and so on, through the statement in part (d). For positive integers a, b, ⌊a/b⌋ returns the
integer part of the quotient; for example, ⌊6/2⌋ = 3, ⌊7/2⌋ = 3, ⌊2/5⌋ = 0, and ⌊8/3⌋ = 2.]
     a) if n > 5 then n := n + 2
     b) if ((n + 2 = 8) or (n - 3 = 6)) then
           n := 2 * n + 1
     c) if ((n - 3 = 16) and (⌊n/6⌋ = 1)) then
           n := n + 3
     d) if ((n ≠ 21) and (n - 7 = 15)) then
           n := n - 4

15. The integer variables m and n are assigned the values 3 and 8, respectively, during the
execution of a program (written in pseudocode). Each of the following successive statements
is then encountered during program execution. [Here the values of m, n following the
execution of the statement in part (a) become the values of m, n for the statement in
part (b), and so on, through the statement in part (e).] What are the values of m, n after
each of these statements is encountered?
     a) if n - m = 5 then n := n - 2
     b) if ((2 * m = n) and (⌊n/4⌋ = 1)) then
           n := 4 * m - 3
     c) if ((n < 8) or (⌊m/2⌋ = 2)) then n := 2 * m
        else m := 2 * n
     d) if ((m < 20) and (⌊n/6⌋ = 1)) then
           m := m - n - 5
     e) if ((n = 2 * m) or (⌊n/2⌋ = 5)) then
           m := m + 2

16. In the following program segment i, j, m, and n are integer variables. The values of m
and n are supplied by the user earlier in the execution of the total program.

     for i := 1 to m do
        for j := 1 to n do
           if i ≠ j then
              print i + j

How many times is the print statement in the segment executed when (a) m = 10, n = 10;
(b) m = 20, n = 20; (c) m = 10, n = 20; (d) m = 20, n = 10?

17. After baking a pie for the two nieces and two nephews who are visiting her, Aunt Nellie
leaves the pie on her kitchen table to cool. Then she drives to the mall to close her boutique
for the day. Upon her return she finds that someone has eaten one-quarter of the pie. Since
no one was in her house that day (except for the four visitors), Aunt Nellie questions each
niece and nephew about who ate the piece of pie. The four “suspects” tell her the following:
     Charles:   Kelly ate the piece of pie.
     Dawn:      I did not eat the piece of pie.
     Kelly:     Tyler ate the pie.
     Tyler:     Kelly lied when she said I ate the pie.
If only one of these four statements is true and only one of the four committed this heinous
crime, who is the vile culprit that Aunt Nellie will have to punish severely?

2.2 Logical Equivalence: The Laws of Logic
In all areas of mathematics we need to know when the entities we are studying are equal or essentially the same. For example, in arithmetic and algebra we know that two nonzero real numbers are equal when they have the same magnitude and algebraic sign. Hence, for two nonzero real numbers x, y, we have x = y if |x| = |y| and xy > 0, and conversely (that is, if x = y, then |x| = |y| and xy > 0). When we deal with triangles in geometry, the notion of congruence arises. Here triangle ABC and triangle DEF are congruent if, for instance, they have equal corresponding sides; that is, the length of side AB = the length of side DE, the length of side BC = the length of side EF, and the length of side CA = the length of side FD.
    Our study of logic is often referred to as the algebra of propositions (as opposed to the algebra of real numbers). In this algebra we shall use the truth tables of the statements, or propositions, to develop an idea of when two such entities are essentially the same. We begin with an example.

EXAMPLE 2.7
For primitive statements p and q, Table 2.6 provides the truth tables for the compound statements ¬p ∨ q and p → q. Here we see that the corresponding truth tables for the two statements ¬p ∨ q and p → q are exactly the same.
56          Chapter 2 Fundamentals of Logic

Table 2.6

 p | q | ¬p | ¬p ∨ q | p → q
---+---+----+--------+-------
 0 | 0 | 1  |   1    |   1
 0 | 1 | 1  |   1    |   1
 1 | 0 | 0  |   0    |   0
 1 | 1 | 0  |   1    |   1

                                This situation leads us to the following idea.

Definition 2.2   Two statements s₁, s₂ are said to be logically equivalent, and we write s₁ ⇔ s₂, when the statement s₁ is true (respectively, false) if and only if the statement s₂ is true (respectively, false).

Note that when s₁ ⇔ s₂ the statements s₁ and s₂ provide the same truth tables because s₁, s₂ have the same truth values for all choices of truth values for their primitive components.
    As a result of this concept we see that we can express the connective for the implication (of primitive statements) in terms of negation and disjunction; that is, (p → q) ⇔ ¬p ∨ q. In the same manner, from the result in Table 2.7 we have (p ↔ q) ⇔ (p → q) ∧ (q → p), and this helps validate the use of the term biconditional. Using the logical equivalence from Table 2.6, we find that we can also write (p ↔ q) ⇔ (¬p ∨ q) ∧ (¬q ∨ p). Consequently, if we so choose, we can eliminate the connectives → and ↔ from compound statements.

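Table 2.7

 p | q | p → q | q → p | (p → q) ∧ (q → p) | p ↔ q
---+---+-------+-------+-------------------+-------
 0 | 0 |   1   |   1   |         1         |   1
 0 | 1 |   1   |   0   |         0         |   0
 1 | 0 |   0   |   1   |         0         |   0
 1 | 1 |   1   |   1   |         1         |   1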
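
The truth-table comparison behind Definition 2.2 is mechanical, so it can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the text: the names `IMPLIES`, `truth_table`, and `logically_equivalent` are assumptions introduced here, and statements are modeled as ordinary Python functions of Boolean arguments.

```python
from itertools import product

# Truth table of the implication a -> b, entered row by row from its definition.
IMPLIES = {
    (False, False): True,
    (False, True): True,
    (True, False): False,
    (True, True): True,
}

def truth_table(statement, num_vars):
    """List the truth values of `statement` over all 2**num_vars assignments."""
    return [statement(*row) for row in product([False, True], repeat=num_vars)]

def logically_equivalent(s1, s2, num_vars):
    """Two statements are logically equivalent when their truth tables agree."""
    return truth_table(s1, num_vars) == truth_table(s2, num_vars)

def p_implies_q(p, q):
    return IMPLIES[(p, q)]

def not_p_or_q(p, q):
    return (not p) or q

def p_iff_q(p, q):
    return p == q

def both_directions(p, q):
    # (not p or q) and (not q or p)
    return ((not p) or q) and ((not q) or p)

print(logically_equivalent(p_implies_q, not_p_or_q, 2))   # True
print(logically_equivalent(p_iff_q, both_directions, 2))  # True
```

Comparing whole columns, rather than spot-checking a few rows, mirrors the requirement that the two statements agree for all choices of truth values of their primitive components.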

Examining Table 2.8, we find that negation, along with the connectives ∧ and ∨, are all we need to replace the exclusive or connective, ⊻. In fact, we may even eliminate either ∧ or ∨. However, for the related applications we want to study later in the text, we shall need both ∧ and ∨ as well as negation.

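Table 2.8

 p | q | p ⊻ q | p ∨ q | p ∧ q | ¬(p ∧ q) | (p ∨ q) ∧ ¬(p ∧ q)
---+---+-------+-------+-------+----------+--------------------
 0 | 0 |   0   |   0   |   0   |    1     |         0
 0 | 1 |   1   |   1   |   0   |    1     |         1
 1 | 0 |   1   |   1   |   0   |    1     |         1
 1 | 1 |   0   |   1   |   1   |    0     |         0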

We now use the idea of logical equivalence to examine some of the important properties that hold for the algebra of propositions.
    For all real numbers a, b, we know that -(a + b) = (-a) + (-b). Is there a comparable result for primitive statements p, q?

EXAMPLE 2.8
In Table 2.9 we have constructed the truth tables for the statements ¬(p ∧ q), ¬p ∨ ¬q, ¬(p ∨ q), and ¬p ∧ ¬q, where p, q are primitive statements. Columns 4 and 7 reveal that ¬(p ∧ q) ⇔ ¬p ∨ ¬q; columns 9 and 10 reveal that ¬(p ∨ q) ⇔ ¬p ∧ ¬q. These results are known as DeMorgan’s Laws. They are similar to the familiar law for real numbers,

    -(a + b) = (-a) + (-b),

already noted, which shows the negative of a sum to be equal to the sum of the negatives. Here, however, a crucial difference emerges: The negation of the conjunction of two primitive statements p, q results in the disjunction of their negations ¬p, ¬q, whereas the negation of the disjunction of these same statements p, q is logically equivalent to the conjunction of their negations ¬p, ¬q.

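Table 2.9

 p | q | p ∧ q | ¬(p ∧ q) | ¬p | ¬q | ¬p ∨ ¬q | p ∨ q | ¬(p ∨ q) | ¬p ∧ ¬q
---+---+-------+----------+----+----+---------+-------+----------+---------
 0 | 0 |   0   |    1     | 1  | 1  |    1    |   0   |    1     |    1
 0 | 1 |   0   |    1     | 1  | 0  |    1    |   1   |    0     |    0
 1 | 0 |   0   |    1     | 0  | 1  |    1    |   1   |    0     |    0
 1 | 1 |   1   |    0     | 0  | 0  |    0    |   1   |    0     |    0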

Although p, q were primitive statements in the preceding example, we shall soon learn that DeMorgan’s Laws hold for any two arbitrary statements.

In the arithmetic of real numbers, the operations of addition and multiplication are both
                 involved in the principle called the Distributive Law of Multiplication over Addition: For
                 all real numbers a, b, c,

    a × (b + c) = (a × b) + (a × c).

The next example shows that there is a similar law for primitive statements. There is also
                 a second related law (for primitive statements) that has no counterpart in the arithmetic of
                 real numbers.

EXAMPLE 2.9
Table 2.10 contains the truth tables for the statements p ∧ (q ∨ r), (p ∧ q) ∨ (p ∧ r), p ∨ (q ∧ r), and (p ∨ q) ∧ (p ∨ r). From the table it follows that for all primitive statements p, q, and r,

    p ∧ (q ∨ r) ⇔ (p ∧ q) ∨ (p ∧ r)      The Distributive Law of ∧ over ∨
    p ∨ (q ∧ r) ⇔ (p ∨ q) ∧ (p ∨ r)      The Distributive Law of ∨ over ∧

The second distributive law has no counterpart in the arithmetic of real numbers. That is, it is not true for all real numbers a, b, and c that the following holds: a + (b × c) = (a + b) × (a + c). For a = 2, b = 3, and c = 5, for instance, a + (b × c) = 17 but (a + b) × (a + c) = 35.

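Table 2.10

 p | q | r | p ∧ (q ∨ r) | (p ∧ q) ∨ (p ∧ r) | p ∨ (q ∧ r) | (p ∨ q) ∧ (p ∨ r)
---+---+---+-------------+-------------------+-------------+-------------------
 0 | 0 | 0 |      0      |         0         |      0      |         0
 0 | 0 | 1 |      0      |         0         |      0      |         0
 0 | 1 | 0 |      0      |         0         |      0      |         0
 0 | 1 | 1 |      0      |         0         |      1      |         1
 1 | 0 | 0 |      0      |         0         |      1      |         1
 1 | 0 | 1 |      1      |         1         |      1      |         1
 1 | 1 | 0 |      1      |         1         |      1      |         1
 1 | 1 | 1 |      1      |         1         |      1      |         1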
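
The eight rows of Table 2.10, together with the arithmetic counterexample of Example 2.9, can be reproduced with a short Python sketch. The helper name `equivalent` is an assumption made for this illustration; it simply compares two statements on every assignment of truth values.

```python
from itertools import product

def equivalent(s1, s2, num_vars=3):
    """True when s1 and s2 agree on every assignment of truth values."""
    return all(s1(*row) == s2(*row)
               for row in product([False, True], repeat=num_vars))

# Distributive Law of AND over OR.
and_over_or = equivalent(lambda p, q, r: p and (q or r),
                         lambda p, q, r: (p and q) or (p and r))

# Distributive Law of OR over AND.
or_over_and = equivalent(lambda p, q, r: p or (q and r),
                         lambda p, q, r: (p or q) and (p or r))

# The arithmetic analogue of the second law fails for a = 2, b = 3, c = 5.
a, b, c = 2, 3, 5
left = a + (b * c)          # 2 + 15
right = (a + b) * (a + c)   # 5 * 7

print(and_over_or, or_over_and)  # True True
print(left, right)               # 17 35
```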

Before going any further, we note that, in general, if s₁, s₂ are statements and s₁ ↔ s₂ is a tautology, then s₁, s₂ must have the same corresponding truth values (that is, for each assignment of truth values to the primitive statements in s₁ and s₂, s₁ is true if and only if s₂ is true and s₁ is false if and only if s₂ is false) and s₁ ⇔ s₂. When s₁ and s₂ are logically equivalent statements (that is, s₁ ⇔ s₂), then the compound statement s₁ ↔ s₂ is a tautology. Under these circumstances it is also true that ¬s₁ ⇔ ¬s₂, and ¬s₁ ↔ ¬s₂ is a tautology.
    If s₁, s₂, and s₃ are statements where s₁ ⇔ s₂ and s₂ ⇔ s₃, then s₁ ⇔ s₃. When two statements s₁ and s₂ are not logically equivalent, we may write s₁ ⇎ s₂ to designate this situation.
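
The link between the relation s₁ ⇔ s₂ and the compound statement s₁ ↔ s₂ can be illustrated with the following Python sketch. The helper names (`is_tautology`, `equivalent`, `biconditional`, `negation`) and the three sample statements are assumptions chosen for this illustration only.

```python
from itertools import product

def is_tautology(statement, num_vars):
    """A tautology is true for every assignment of truth values."""
    return all(statement(*row) for row in product([False, True], repeat=num_vars))

def equivalent(s1, s2, num_vars):
    """Logical equivalence: identical truth values on every assignment."""
    return all(s1(*row) == s2(*row)
               for row in product([False, True], repeat=num_vars))

def biconditional(s1, s2):
    """Build the compound statement s1 <-> s2 from two statements."""
    return lambda *row: s1(*row) == s2(*row)

def negation(s):
    """Build the statement 'not s'."""
    return lambda *row: not s(*row)

s1 = lambda p, q: not (p and q)         # not (p and q)
s2 = lambda p, q: (not p) or (not q)    # (not p) or (not q)
s3 = lambda p, q: (not p) and (not q)   # not equivalent to s1

# Equivalent statements give a tautological biconditional, and so do their negations.
print(equivalent(s1, s2, 2), is_tautology(biconditional(s1, s2), 2))   # True True
print(is_tautology(biconditional(negation(s1), negation(s2)), 2))      # True

# Statements that are not equivalent give a biconditional that is not a tautology.
print(equivalent(s1, s3, 2), is_tautology(biconditional(s1, s3), 2))   # False False
```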

Using the concepts of logical equivalence, tautology, and contradiction, we state the
                      following list of laws for the algebra of propositions.

The Laws of Logic
For any primitive statements p, q, r, any tautology T₀, and any contradiction F₀,
     1) ¬¬p ⇔ p                                 Law of Double Negation
     2) ¬(p ∨ q) ⇔ ¬p ∧ ¬q                      DeMorgan’s Laws
        ¬(p ∧ q) ⇔ ¬p ∨ ¬q
     3) p ∨ q ⇔ q ∨ p                           Commutative Laws
        p ∧ q ⇔ q ∧ p
     4) p ∨ (q ∨ r) ⇔ (p ∨ q) ∨ r †             Associative Laws
        p ∧ (q ∧ r) ⇔ (p ∧ q) ∧ r
     5) p ∨ (q ∧ r) ⇔ (p ∨ q) ∧ (p ∨ r)         Distributive Laws
        p ∧ (q ∨ r) ⇔ (p ∧ q) ∨ (p ∧ r)
     6) p ∨ p ⇔ p                               Idempotent Laws
        p ∧ p ⇔ p
     7) p ∨ F₀ ⇔ p                              Identity Laws
        p ∧ T₀ ⇔ p
     8) p ∨ ¬p ⇔ T₀                             Inverse Laws
        p ∧ ¬p ⇔ F₀
     9) p ∨ T₀ ⇔ T₀                             Domination Laws
        p ∧ F₀ ⇔ F₀
    10) p ∨ (p ∧ q) ⇔ p                         Absorption Laws
        p ∧ (p ∨ q) ⇔ p

†We note that because of the Associative Laws, there is no ambiguity in statements of the form p ∨ q ∨ r or p ∧ q ∧ r.
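
Each of the laws listed above can be confirmed by brute force over the eight assignments of truth values to p, q, r. The Python sketch below is one way to organize such a check; the table `LAWS` and the helper `holds` are names assumed for this illustration, with T₀ modeled as the constant `True` and F₀ as the constant `False`.

```python
from itertools import product

T0, F0 = True, False  # a tautology is always true, a contradiction always false

# Each entry: (name, left side, right side), all as functions of p, q, r.
LAWS = [
    ("Double Negation",      lambda p, q, r: not (not p),     lambda p, q, r: p),
    ("DeMorgan (or)",        lambda p, q, r: not (p or q),    lambda p, q, r: (not p) and (not q)),
    ("DeMorgan (and)",       lambda p, q, r: not (p and q),   lambda p, q, r: (not p) or (not q)),
    ("Commutative (or)",     lambda p, q, r: p or q,          lambda p, q, r: q or p),
    ("Commutative (and)",    lambda p, q, r: p and q,         lambda p, q, r: q and p),
    ("Associative (or)",     lambda p, q, r: p or (q or r),   lambda p, q, r: (p or q) or r),
    ("Associative (and)",    lambda p, q, r: p and (q and r), lambda p, q, r: (p and q) and r),
    ("Distributive (or)",    lambda p, q, r: p or (q and r),  lambda p, q, r: (p or q) and (p or r)),
    ("Distributive (and)",   lambda p, q, r: p and (q or r),  lambda p, q, r: (p and q) or (p and r)),
    ("Idempotent (or)",      lambda p, q, r: p or p,          lambda p, q, r: p),
    ("Idempotent (and)",     lambda p, q, r: p and p,         lambda p, q, r: p),
    ("Identity (or)",        lambda p, q, r: p or F0,         lambda p, q, r: p),
    ("Identity (and)",       lambda p, q, r: p and T0,        lambda p, q, r: p),
    ("Inverse (or)",         lambda p, q, r: p or (not p),    lambda p, q, r: T0),
    ("Inverse (and)",        lambda p, q, r: p and (not p),   lambda p, q, r: F0),
    ("Domination (or)",      lambda p, q, r: p or T0,         lambda p, q, r: T0),
    ("Domination (and)",     lambda p, q, r: p and F0,        lambda p, q, r: F0),
    ("Absorption (or)",      lambda p, q, r: p or (p and q),  lambda p, q, r: p),
    ("Absorption (and)",     lambda p, q, r: p and (p or q),  lambda p, q, r: p),
]

def holds(left, right):
    """True when both sides agree on all eight assignments to p, q, r."""
    return all(left(p, q, r) == right(p, q, r)
               for p, q, r in product([False, True], repeat=3))

failures = [name for name, left, right in LAWS if not holds(left, right)]
print(len(LAWS), failures)  # 19 []
```

A check of this kind covers only primitive statements p, q, r; extending the laws to arbitrary statements relies on the substitution rules discussed later in the section.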

We now turn our attention to proving all of these properties. In so doing we realize that we could simply construct the truth tables and compare the results for the corresponding truth values in each case, as we did in Examples 2.8 and 2.9. However, before we start writing, let us take one more look at this list of 19 laws, which, aside from the Law of Double Negation, fall naturally into pairs. This pairing idea will help us after we examine the following concept.

Definition 2.3   Let s be a statement. If s contains no logical connectives other than ∧ and ∨, then the dual of s, denoted s^d, is the statement obtained from s by replacing each occurrence of ∧ and ∨ by ∨ and ∧, respectively, and each occurrence of T₀ and F₀ by F₀ and T₀, respectively.

If p is any primitive statement, then p^d is the same as p; that is, the dual of a primitive statement is simply the same primitive statement. And (¬p)^d is the same as ¬p. The statements p ∨ ¬p and p ∧ ¬p are duals of each other whenever p is primitive, and so are the statements p ∨ T₀ and p ∧ F₀.
    Given the primitive statements p, q, r and the compound statement

    s:     (p ∧ ¬q) ∨ (r ∧ T₀),

we find that the dual of s is

    s^d:   (p ∨ ¬q) ∧ (r ∨ F₀).

(Note that ¬q is unchanged as we go from s to s^d.)
    We now state and use a theorem without proving it. However, in Chapter 15 we shall justify the result that appears here.

THEOREM 2.1      The Principle of Duality. Let s and t be statements that contain no logical connectives other than ∧ and ∨. If s ⇔ t, then s^d ⇔ t^d.

As a result, laws 2 through 10 in our list can be established by proving one of the laws
                     in each pair and then invoking this principle.
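
Forming a dual is a purely syntactic operation, which the following Python sketch makes concrete. The nested-tuple representation of statements and the function names `dual`, `evaluate`, and `equivalent` are assumptions made for this illustration; they are not notation from the text.

```python
from itertools import product

def dual(s):
    """Swap and/or and T0/F0; primitive statements and negation signs stay put."""
    op = s[0]
    if op == "var":
        return s
    if op == "T0":
        return ("F0",)
    if op == "F0":
        return ("T0",)
    if op == "not":
        return ("not", dual(s[1]))
    if op == "and":
        return ("or", dual(s[1]), dual(s[2]))
    if op == "or":
        return ("and", dual(s[1]), dual(s[2]))
    raise ValueError(f"unsupported connective: {op}")

def evaluate(s, env):
    """Truth value of statement s under the assignment env (name -> bool)."""
    op = s[0]
    if op == "var":
        return env[s[1]]
    if op == "T0":
        return True
    if op == "F0":
        return False
    if op == "not":
        return not evaluate(s[1], env)
    if op == "and":
        return evaluate(s[1], env) and evaluate(s[2], env)
    if op == "or":
        return evaluate(s[1], env) or evaluate(s[2], env)
    raise ValueError(f"unsupported connective: {op}")

def equivalent(s, t, names):
    """Compare s and t on every assignment to the listed primitive statements."""
    rows = product([False, True], repeat=len(names))
    return all(evaluate(s, dict(zip(names, row))) == evaluate(t, dict(zip(names, row)))
               for row in rows)

p, q, r = ("var", "p"), ("var", "q"), ("var", "r")

# s: (p and not q) or (r and T0); its dual is (p or not q) and (r or F0).
s = ("or", ("and", p, ("not", q)), ("and", r, ("T0",)))
print(dual(s) == ("and", ("or", p, ("not", q)), ("or", r, ("F0",))))  # True

# Principle of Duality on one Absorption Law: p or (p and q) <=> p.
absorb = ("or", p, ("and", p, q))
print(equivalent(absorb, p, ["p", "q"]))              # True
print(equivalent(dual(absorb), dual(p), ["p", "q"]))  # True
```

Testing one pair of statements does not prove the theorem, of course; the sketch only shows the principle at work on a single law.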

We also find that it is possible to derive many other logical equivalences. For example, if q, r, s are primitive statements, the results in columns 5 and 7 of Table 2.11 show us that

    (r ∧ s) → q ⇔ ¬(r ∧ s) ∨ q

or that [(r ∧ s) → q] ↔ [¬(r ∧ s) ∨ q] is a tautology. However, instead of always constructing more (and, unfortunately, larger) truth tables it might be a good idea to recall from Example 2.7 that for primitive statements p, q, the compound statement

    (p → q) ↔ (¬p ∨ q)

is a tautology. If we were to replace each occurrence of this primitive statement p by the compound statement r ∧ s, then we would obtain the earlier tautology

    [(r ∧ s) → q] ↔ [¬(r ∧ s) ∨ q].

Table 2.11

 q | r | s | r ∧ s | (r ∧ s) → q | ¬(r ∧ s) | ¬(r ∧ s) ∨ q
---+---+---+-------+-------------+----------+--------------
 0 | 0 | 0 |   0   |      1      |    1     |      1
 0 | 0 | 1 |   0   |      1      |    1     |      1
 0 | 1 | 0 |   0   |      1      |    1     |      1
 0 | 1 | 1 |   1   |      0      |    0     |      0
 1 | 0 | 0 |   0   |      1      |    1     |      1
 1 | 0 | 1 |   0   |      1      |    1     |      1
 1 | 1 | 0 |   0   |      1      |    1     |      1
 1 | 1 | 1 |   1   |      1      |    0     |      1

What has happened here illustrates the first of the following two substitution rules:

    1) Suppose that the compound statement P is a tautology. If p is a primitive statement that appears in P and we replace each occurrence of p by the same statement q, then the resulting compound statement P₁ is also a tautology.
    2) Let P be a compound statement where p is an arbitrary statement that appears in P, and let q be a statement such that q ⇔ p. Suppose that in P we replace one or more occurrences of p by q. Then this replacement yields the compound statement P₁. Under these circumstances P₁ ⇔ P.

These rules are further illustrated in the following two examples.

EXAMPLE 2.10
    a) From the first of DeMorgan’s Laws we know that for all primitive statements p, q, the compound statement

           P:    ¬(p ∨ q) ↔ (¬p ∧ ¬q)

       is a tautology. When we replace each occurrence of p by r ∧ s, it follows from the first substitution rule that

           P₁:   ¬[(r ∧ s) ∨ q] ↔ [¬(r ∧ s) ∧ ¬q]

       is also a tautology. Extending this result one step further, we may replace each occurrence of q by t → u. The same substitution rule now yields the tautology

           P₂:   ¬[(r ∧ s) ∨ (t → u)] ↔ [¬(r ∧ s) ∧ ¬(t → u)],

       and hence, by the remarks following shortly after Example 2.9, the logical equivalence

           ¬[(r ∧ s) ∨ (t → u)] ⇔ [¬(r ∧ s) ∧ ¬(t → u)].

    b) For primitive statements p, q, we learn from the last column of Table 2.12 that the compound statement [p ∧ (p → q)] → q is a tautology. Consequently, if r, s, t, u are any statements, then by the first substitution rule we obtain the new tautology

           [(r → s) ∧ [(r → s) → (¬t ∨ u)]] → (¬t ∨ u)

       when we replace each occurrence of p by r → s and each occurrence of q by ¬t ∨ u.

Table 2.12

 p | q | p → q | p ∧ (p → q) | [p ∧ (p → q)] → q
---+---+-------+-------------+-------------------
 0 | 0 |   1   |      0      |         1
 0 | 1 |   1   |      0      |         1
 1 | 0 |   0   |      0      |         1
 1 | 1 |   1   |      1      |         1
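
The first substitution rule, as used in part (b) of Example 2.10, amounts to composing functions: the tautology P in p, q is evaluated on the truth values of the statements substituted for p and q. The Python sketch below illustrates this; the names `implies`, `is_tautology`, `P`, and `P_substituted` are assumptions made for the illustration.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b: false only when a is true and b is false."""
    return not (a and not b)

def is_tautology(statement, num_vars):
    """A tautology is true for every assignment of truth values."""
    return all(statement(*row) for row in product([False, True], repeat=num_vars))

def P(p, q):
    """[p and (p -> q)] -> q, the statement in the last column of Table 2.12."""
    return implies(p and implies(p, q), q)

def P_substituted(r, s, t, u):
    """P with every p replaced by r -> s and every q replaced by (not t) or u."""
    return P(implies(r, s), (not t) or u)

print(is_tautology(P, 2))              # True
print(is_tautology(P_substituted, 4))  # True
```

Because `P_substituted` only ever feeds truth values into `P`, and `P` is true for every pair of truth values, the substituted statement is true on all sixteen rows without a new truth table being needed.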

EXAMPLE 2.11
    a) For an application of the second substitution rule, let P denote the compound statement (p → q) → r. Because (p → q) ⇔ ¬p ∨ q (as shown in Example 2.7 and Table 2.6), if P₁ denotes the compound statement (¬p ∨ q) → r, then P₁ ⇔ P. (We also find that [(p → q) → r] ↔ [(¬p ∨ q) → r] is a tautology.)
    b) Now let P represent the compound statement (actually a tautology) p → (p ∨ q). Since ¬¬p ⇔ p, the compound statement P₁: p → (¬¬p ∨ q) is derived from P by replacing only the second occurrence (but not the first occurrence) of p by ¬¬p. The second substitution rule still implies that P₁ ⇔ P. [Note that P₂: ¬¬p → (¬¬p ∨ q), derived by replacing both occurrences of p by ¬¬p, is also logically equivalent to P.]

Our next example demonstrates how we can use the idea of logical equivalence together
                 with the laws of logic and the substitution rules.

EXAMPLE 2.12
Negate and simplify the compound statement (p ∨ q) → r.
    We organize our explanation as follows:
    1) (p ∨ q) → r ⇔ ¬(p ∨ q) ∨ r [by the first substitution rule because (s → t) ↔ (¬s ∨ t) is a tautology for primitive statements s, t].
    2) Negating the statements in step (1), we have ¬[(p ∨ q) → r] ⇔ ¬[¬(p ∨ q) ∨ r].
    3) From the first of DeMorgan’s Laws and the first substitution rule, ¬[¬(p ∨ q) ∨ r] ⇔ ¬¬(p ∨ q) ∧ ¬r.
    4) The Law of Double Negation and the second substitution rule now give us ¬¬(p ∨ q) ∧ ¬r ⇔ (p ∨ q) ∧ ¬r.
From steps (1) through (4) we have ¬[(p ∨ q) → r] ⇔ (p ∨ q) ∧ ¬r.
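
As a cross-check on steps (1) through (4), each intermediate statement can be compared with the original negation over all eight assignments to p, q, r. The Python sketch below does this; the list name `steps` and the helper `implies` are assumptions introduced for the illustration.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b: false only when a is true and b is false."""
    return not (a and not b)

# The statements that appear on the way from the negation to its simplified form.
steps = [
    lambda p, q, r: not implies(p or q, r),               # not [(p or q) -> r]
    lambda p, q, r: not ((not (p or q)) or r),            # after steps (1) and (2)
    lambda p, q, r: (not (not (p or q))) and (not r),     # after step (3), DeMorgan
    lambda p, q, r: (p or q) and (not r),                 # after step (4), Double Negation
]

rows = list(product([False, True], repeat=3))
tables = [[step(*row) for row in rows] for step in steps]

# Every step has the same truth table as the statement we started from.
print(all(table == tables[0] for table in tables))  # True
```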

When we wanted to write the negation of an implication, as in Example 2.12, we found
                 that the concept of logical equivalence played a key role — in conjunction with the laws of
                 logic and the substitution rules. This idea is important enough to warrant a second look.

EXAMPLE 2.13
Let p, q denote the primitive statements
    p:     Joan goes to Lake George.            q:     Mary pays for Joan’s shopping spree.

and consider the implication

    p → q:     If Joan goes to Lake George, then Mary will pay for Joan’s shopping spree.

Here we want to write the negation of p → q in a way other than simply ¬(p → q). We want to avoid writing the negation as “It is not the case that if Joan goes to Lake George, then Mary will pay for Joan’s shopping spree.”
    To accomplish this we consider the following. Since p → q ⇔ ¬p ∨ q, it follows that ¬(p → q) ⇔ ¬(¬p ∨ q). Then by DeMorgan’s Law we have ¬(¬p ∨ q) ⇔ ¬¬p ∧ ¬q, and from the Law of Double Negation and the second substitution rule it follows that ¬¬p ∧ ¬q ⇔ p ∧ ¬q. Consequently,

    ¬(p → q) ⇔ ¬(¬p ∨ q) ⇔ ¬¬p ∧ ¬q ⇔ p ∧ ¬q,

and we may write the negation of p → q in this case as
    ¬(p → q):     Joan goes to Lake George, but Mary does not pay for Joan’s shopping spree.

(Note: The negation of an if-then statement does not begin with the word if. It is not another implication.)

EXAMPLE 2.14
In Definition 2.3 the dual s^d of a statement s was defined only for statements involving negation and the basic connectives ∧ and ∨. How does one determine the dual of a statement such as s: p → q, where p, q are primitive?
    Because (p → q) ⇔ ¬p ∨ q, s^d is logically equivalent to the statement (¬p ∨ q)^d, which is ¬p ∧ q.

The implication p → q and certain statements related to it are now examined in the following example.

EXAMPLE 2.15
Table 2.13 gives the truth tables for the statements p → q, ¬q → ¬p, q → p, and ¬p → ¬q. The third and fourth columns of the table reveal that

    (p → q) ⇔ (¬q → ¬p).

Table 2.13

 p | q | p → q | ¬q → ¬p | q → p | ¬p → ¬q
---+---+-------+---------+-------+---------
 0 | 0 |   1   |    1    |   1   |    1
 0 | 1 |   1   |    1    |   0   |    0
 1 | 0 |   0   |    0    |   1   |    1
 1 | 1 |   1   |    1    |   1   |    1

The statement ¬q → ¬p is called the contrapositive of the implication p → q. Columns 5 and 6 of the table show that

    (q → p) ⇔ (¬p → ¬q).

The statement q → p is called the converse of p → q; ¬p → ¬q is called the inverse of p → q. We also see from Table 2.13 that

    (p → q) ⇎ (q → p)     and     (¬p → ¬q) ⇎ (¬q → ¬p).

Consequently, we must keep the implication and its converse straight. The fact that a certain implication p → q is true (in particular, as in row 2 of the table) does not require that the converse q → p also be true. However, it does necessitate the truth of the contrapositive ¬q → ¬p.
    Let us consider a specific example where p, q represent the statements

    p:   Jeff is concerned about his cholesterol (HDL and LDL) levels.
    q:   Jeff walks at least two miles three times a week.

Then we obtain

    • (The implication: p → q). If Jeff is concerned about his cholesterol levels, then he will walk at least two miles three times a week.
    • (The contrapositive: ¬q → ¬p). If Jeff does not walk at least two miles three times a week, then he is not concerned about his cholesterol levels.
    • (The converse: q → p). If Jeff walks at least two miles three times a week, then he is concerned about his cholesterol levels.
    • (The inverse: ¬p → ¬q). If Jeff is not concerned about his cholesterol levels, then he will not walk at least two miles three times a week.

If p is true and q is false, then the implication p → q and the contrapositive ¬q → ¬p are false, while the converse q → p and the inverse ¬p → ¬q are true. For the case where p is false and q is true, the implication p → q and the contrapositive ¬q → ¬p are now true, while the converse q → p and the inverse ¬p → ¬q are false. When p, q are both true or both false, then the implication is true, as are the contrapositive, converse, and inverse.
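
The four cases just described can be tabulated with a short Python sketch. The function names `implies` and `related_implications` are assumptions made for this illustration; the function returns the truth values of the implication, contrapositive, converse, and inverse, in that order.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b: false only when a is true and b is false."""
    return not (a and not b)

def related_implications(p, q):
    """Return (implication, contrapositive, converse, inverse) for given p, q."""
    implication = implies(p, q)
    contrapositive = implies(not q, not p)
    converse = implies(q, p)
    inverse = implies(not p, not q)
    return implication, contrapositive, converse, inverse

print("p q | p->q  ~q->~p  q->p  ~p->~q")
for p, q in product([False, True], repeat=2):
    imp, contra, conv, inv = related_implications(p, q)
    print(int(p), int(q), "|", int(imp), int(contra), int(conv), int(inv))

# In every row the implication agrees with its contrapositive,
# and the converse agrees with the inverse.
always_paired = all(
    related_implications(p, q)[0] == related_implications(p, q)[1]
    and related_implications(p, q)[2] == related_implications(p, q)[3]
    for p, q in product([False, True], repeat=2)
)
print(always_paired)  # True
```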

We turn now to two examples involving the simplification of compound statements. For
                    simplicity, we shall list the major laws of logic being used, but we shall not mention any
                    applications of our two substitution rules.

EXAMPLE 2.16
For primitive statements p, q, is there any simpler way to express the compound statement (p ∨ q) ∧ ¬(¬p ∧ q); that is, can we find a simpler statement that is logically equivalent to the one given?
    Here one finds that
         (p ∨ q) ∧ ¬(¬p ∧ q)               Reasons
      ⇔  (p ∨ q) ∧ (¬¬p ∨ ¬q)              DeMorgan’s Law
      ⇔  (p ∨ q) ∧ (p ∨ ¬q)                Law of Double Negation
      ⇔  p ∨ (q ∧ ¬q)                      Distributive Law of ∨ over ∧
      ⇔  p ∨ F₀                            Inverse Law
      ⇔  p                                 Identity Law
Consequently, we see that

    (p ∨ q) ∧ ¬(¬p ∧ q) ⇔ p,

so we can express the given compound statement by the simpler logically equivalent statement p.
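
A derivation such as the one in Example 2.16 can be audited line by line: each line should have the same truth table as the line above it. The Python sketch below performs that audit; the names `chain`, `table`, and `first_broken_step` are assumptions made for the illustration, and F₀ is modeled as the constant `False`.

```python
from itertools import product

F0 = False  # a contradiction is always false

# Each entry pairs the reason cited with the statement obtained at that line.
chain = [
    ("given",                           lambda p, q: (p or q) and not ((not p) and q)),
    ("DeMorgan's Law",                  lambda p, q: (p or q) and ((not (not p)) or (not q))),
    ("Law of Double Negation",          lambda p, q: (p or q) and (p or (not q))),
    ("Distributive Law of or over and", lambda p, q: p or (q and (not q))),
    ("Inverse Law",                     lambda p, q: p or F0),
    ("Identity Law",                    lambda p, q: p),
]

def table(statement):
    """Truth values of a two-variable statement over all four assignments."""
    return [statement(p, q) for p, q in product([False, True], repeat=2)]

def first_broken_step(lines):
    """Return the reason of the first line that disagrees with the line above it, or None."""
    for (_, previous), (reason, current) in zip(lines, lines[1:]):
        if table(previous) != table(current):
            return reason
    return None

print(first_broken_step(chain))  # None
```

A check like this confirms that no line changes the truth table; it does not confirm that the cited law is the one actually responsible for the step.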

EXAMPLE 2.17
Consider the compound statement

    ¬[¬[(p ∨ q) ∧ r] ∨ ¬q],

where p, q, r are primitive statements. This statement contains four occurrences of primitive statements, three negation symbols, and three connectives.
    From the laws of logic it follows that
         ¬[¬[(p ∨ q) ∧ r] ∨ ¬q]            Reasons
      ⇔  ¬¬[(p ∨ q) ∧ r] ∧ ¬¬q             DeMorgan’s Law
      ⇔  [(p ∨ q) ∧ r] ∧ q                 Law of Double Negation
      ⇔  (p ∨ q) ∧ (r ∧ q)                 Associative Law of ∧
      ⇔  (p ∨ q) ∧ (q ∧ r)                 Commutative Law of ∧
      ⇔  [(p ∨ q) ∧ q] ∧ r                 Associative Law of ∧
      ⇔  q ∧ r                             Absorption Law (as well as the Commutative Laws for ∧ and ∨)

Consequently, the original statement

    ¬[¬[(p ∨ q) ∧ r] ∨ ¬q]

is logically equivalent to the much simpler statement

    q ∧ r,

where we find only two primitive statements, no negation symbols, and only one connective.
    Note further that from Example 2.7 we have

    ¬[[(p ∨ q) ∧ r] → ¬q] ⇔ ¬[¬[(p ∨ q) ∧ r] ∨ ¬q],

so it follows that

    ¬[[(p ∨ q) ∧ r] → ¬q] ⇔ q ∧ r.

We close this section with an application on how the ideas in Examples 2.16 and 2.17 can
                         be used in simplifying switching networks.

EXAMPLE 2.18
A switching network is made up of wires and switches connecting two terminals T₁ and T₂. In such a network, each switch is either open (0), so that no current flows through it, or closed (1), so that current does flow through it.
    In Fig. 2.1(a) we have a network with one switch. Each of parts (b) and (c) contains two (independent) switches.

[Figure 2.1: switching networks between terminals T₁ and T₂. (a) A single switch p. (b) Two switches p, q on separate branches. (c) Two switches p, q on the same wire.]

For the network in part (b), current flows from T₁ to T₂ if either of the switches p, q is closed. We call this a parallel network and represent it by p ∨ q. The network in part (c) requires that each of the switches p, q be closed in order for current to flow from T₁ to T₂. Here the switches are in series; this network is represented by p ∧ q.
    The switches in a network need not act independently of each other. Consider the network shown in Fig. 2.2(a). Here the switches labeled t and ¬t are not independent. We have coupled these two switches so that t is open (closed) if and only if ¬t is simultaneously closed (open). The same is true for the switches at q, ¬q. (Also, for example, the three switches labeled p are not independent.)

[Figure 2.2: (a) the original switching network; (b) the simplified, equivalent network.]

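This network is represented by the statement $(p \lor q \lor r) \land (p \lor t \lor \neg q) \land (p \lor \neg t \lor r)$. Using the laws of logic, we may simplify this statement as follows.

$$\begin{array}{lll}
 & (p \lor q \lor r) \land (p \lor t \lor \neg q) \land (p \lor \neg t \lor r) & \textbf{Reasons} \\
\Leftrightarrow & p \lor [(q \lor r) \land (t \lor \neg q) \land (\neg t \lor r)] & \text{Distributive Law of } \lor \text{ over } \land \\
\Leftrightarrow & p \lor [(q \lor r) \land (\neg t \lor r) \land (t \lor \neg q)] & \text{Commutative Law of } \land \\
\Leftrightarrow & p \lor [((q \land \neg t) \lor r) \land (t \lor \neg q)] & \text{Distributive Law of } \lor \text{ over } \land \\
\Leftrightarrow & p \lor [((q \land \neg t) \lor r) \land (\neg\neg t \lor \neg q)] & \text{Law of Double Negation} \\
\Leftrightarrow & p \lor [((q \land \neg t) \lor r) \land \neg(\neg t \land q)] & \text{DeMorgan's Law} \\
\Leftrightarrow & p \lor [\neg(\neg t \land q) \land ((\neg t \land q) \lor r)] & \text{Commutative Law of } \land \text{ (twice)} \\
\Leftrightarrow & p \lor [(\neg(\neg t \land q) \land (\neg t \land q)) \lor (\neg(\neg t \land q) \land r)] & \text{Distributive Law of } \land \text{ over } \lor \\
\Leftrightarrow & p \lor [F_0 \lor (\neg(\neg t \land q) \land r)] & \neg s \land s \Leftrightarrow F_0 \text{, for any statement } s \\
\Leftrightarrow & p \lor [\neg(\neg t \land q) \land r] & F_0 \text{ is the identity for } \lor \\
\Leftrightarrow & p \lor [r \land \neg(\neg t \land q)] & \text{Commutative Law of } \land \\
\Leftrightarrow & p \lor [r \land (t \lor \neg q)] & \text{DeMorgan's Law and the Law of Double Negation}
\end{array}$$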

Hence $(p \lor q \lor r) \land (p \lor t \lor \neg q) \land (p \lor \neg t \lor r) \Leftrightarrow p \lor [r \land (t \lor \neg q)]$, and the network shown in Fig. 2.2(b) is equivalent to the original network in the sense that current flows from $T_1$ to $T_2$ in network (a) exactly when it does so in network (b). But network (b) has only four switches, five fewer than network (a).
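
The equivalence of the two networks in Example 2.18 can also be checked by brute force. The following is a minimal sketch in Python, not part of the text: the function names `original_network`, `simplified_network`, and `networks_agree` are illustrative assumptions, and each switch is modeled as a Boolean (closed = `True`, open = `False`).

```python
from itertools import product

def original_network(p, q, r, t):
    # (p v q v r) ^ (p v t v ~q) ^ (p v ~t v r): the statement for Fig. 2.2(a)
    return (p or q or r) and (p or t or not q) and (p or not t or r)

def simplified_network(p, q, r, t):
    # p v [r ^ (t v ~q)]: the statement for Fig. 2.2(b)
    return p or (r and (t or not q))

def networks_agree():
    # Compare the two statements on all 2**4 = 16 switch settings.
    return all(
        bool(original_network(p, q, r, t)) == bool(simplified_network(p, q, r, t))
        for p, q, r, t in product([False, True], repeat=4)
    )

print(networks_agree())  # True
```

Such a check confirms the result but gives none of the reasons; the step-by-step derivation with the laws of logic is what explains why the simplification works.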

EXERCISES 2.2

1. Let $p$, $q$, $r$ denote primitive statements.
     a) Use truth tables to verify the following logical equivalences.
           i) $p \to (q \land r) \Leftrightarrow (p \to q) \land (p \to r)$
          ii) $[(p \lor q) \to r] \Leftrightarrow [(p \to r) \land (q \to r)]$
         iii) $[p \to (q \lor r)] \Leftrightarrow [\neg r \to (p \to q)]$
     b) Use the substitution rules to show that
           $[p \to (q \lor r)] \Leftrightarrow [(p \land \neg q) \to r]$.

2. Verify the first Absorption Law by means of a truth table.

3. Use the substitution rules to verify that each of the following is a tautology. (Here $p$, $q$, and $r$ are primitive statements.)
     a) $[p \lor (q \land r)] \lor \neg[p \lor (q \land r)]$
     b) $[(p \lor q) \to r] \leftrightarrow [\neg r \to \neg(p \lor q)]$

4. For primitive statements $p$, $q$, $r$, and $s$, simplify the compound statement
           $[[[(p \land q) \land r] \lor [(p \land q) \land \neg r]] \lor \neg q] \to s$.

5. Negate and express each of the following statements in smooth English.
     a) Kelsey will get a good education if she puts her studies before her interest in cheerleading.
     b) Norma is doing her homework, and Karen is practicing her piano lessons.
     c) If Harold passes his C++ course and finishes his data structures project, then he will graduate at the end of the semester.

6. Negate each of the following and simplify the resulting statement.
     a) $p \land (q \lor r) \land (\neg p \lor \neg q \lor r)$
     b) $(p \land q) \to r$
     c) $p \to (\neg q \land r)$
     d) $p \lor q \lor (\neg p \land \neg q \land r)$

7. a) If $p$, $q$ are primitive statements, prove that
           $(\neg p \lor q) \land (p \land (p \land q)) \Leftrightarrow (p \land q)$.
     b) Write the dual of the logical equivalence in part (a).

8. Write the dual for (a) $q \to p$, (b) $p \to (q \land r)$, (c) $p \leftrightarrow q$, and (d) $p \veebar q$, where $p$, $q$, and $r$ are primitive statements.

9. Write the converse, inverse, and contrapositive of each of the following implications. For each implication, determine its truth value as well as the truth values of its corresponding converse, inverse, and contrapositive.
     a) If $0 + 0 = 0$, then $1 + 1 = 1$.
     b) If $-1 < 3$ and $3 + 7 = 10$, then $\sin\left(\frac{3\pi}{2}\right) = -1$.

10. Determine whether each of the following is true or false. Here $p$, $q$ are arbitrary statements.
     a) An equivalent way to express the converse of "$p$ is sufficient for $q$" is "$p$ is necessary for $q$."
     b) An equivalent way to express the inverse of "$p$ is necessary for $q$" is "$\neg q$ is sufficient for $\neg p$."
     c) An equivalent way to express the contrapositive of "$p$ is necessary for $q$" is "$\neg q$ is necessary for $\neg p$."

11. Let $p$, $q$, and $r$ denote primitive statements. Find a form of the contrapositive of $p \to (q \to r)$ with (a) only one occurrence of the connective $\to$; (b) no occurrences of the connective $\to$.

12. Show that for primitive statements $p$, $q$,
           $p \veebar q \Leftrightarrow [(p \land \neg q) \lor (\neg p \land q)] \Leftrightarrow \neg(p \leftrightarrow q)$.

13. Verify that $[(p \leftrightarrow q) \land (q \leftrightarrow r) \land (r \leftrightarrow p)] \Leftrightarrow [(p \to q) \land (q \to r) \land (r \to p)]$, for primitive statements $p$, $q$, and $r$.

14. For primitive statements $p$, $q$,
     a) verify that $p \to [q \to (p \land q)]$ is a tautology.
     b) verify that $(p \lor q) \to [q \to q]$ is a tautology by using the result from part (a) along with the substitution rules and the laws of logic.
     c) is $(p \lor q) \to [q \to (p \land q)]$ a tautology?

15. Define the connective "Nand" or "Not ... and ..." by $(p \uparrow q) \Leftrightarrow \neg(p \land q)$, for any statements $p$, $q$. Represent the following using only this connective.
     a) $\neg p$          b) $p \lor q$          c) $p \land q$
     d) $p \to q$         e) $p \leftrightarrow q$

16. The connective "Nor" or "Not ... or ..." is defined for any statements $p$, $q$ by $(p \downarrow q) \Leftrightarrow \neg(p \lor q)$. Represent the statements in parts (a) through (e) of Exercise 15, using only this connective.

17. For any statements $p$, $q$, prove that
     a) $\neg(p \downarrow q) \Leftrightarrow (\neg p \uparrow \neg q)$
     b) $\neg(p \uparrow q) \Leftrightarrow (\neg p \downarrow \neg q)$

18. Give the reasons for each step in the following simplifications of compound statements.
     a)   $[(p \lor q) \land (p \lor \neg q)] \lor q$                    Reasons
          $\Leftrightarrow [p \lor (q \land \neg q)] \lor q$
          $\Leftrightarrow (p \lor F_0) \lor q$
          $\Leftrightarrow p \lor q$
     b)   $(p \to q) \land [\neg q \land (r \lor \neg q)]$                Reasons
          $\Leftrightarrow (p \to q) \land \neg q$
          $\Leftrightarrow (\neg p \lor q) \land \neg q$
          $\Leftrightarrow \neg q \land (\neg p \lor q)$
          $\Leftrightarrow (\neg q \land \neg p) \lor (\neg q \land q)$
          $\Leftrightarrow (\neg q \land \neg p) \lor F_0$
          $\Leftrightarrow \neg q \land \neg p$
          $\Leftrightarrow \neg(q \lor p)$

19. Provide the steps and reasons, as in Exercise 18, to establish the following logical equivalences.
     a) $p \lor [p \land (p \lor q)] \Leftrightarrow p$
     b) $p \lor q \lor (\neg p \land \neg q \land r) \Leftrightarrow p \lor q \lor r$
     c) $[(\neg p \lor \neg q) \to (p \land q \land r)] \Leftrightarrow p \land q$

20. Simplify each of the networks shown in Fig. 2.3.

[Figure 2.3: switching networks for Exercise 20.]

2.3
Logical Implication: Rules of Inference
                                  At the end of Section 2.1 we mentioned the notion of a valid argument. Now we will begin
                                  a formal study of what we shall mean by an argument and when such an argument is valid.
                                  This in turn will help us when we investigate how to prove theorems throughout the text.
                                     We start by considering the general form of an argument, one we wish to show is valid.
                                  So let us consider the implication

$$(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q.$$

   Here $n$ is a positive integer, the statements $p_1, p_2, p_3, \ldots, p_n$ are called the premises of the argument, and the statement $q$ is the conclusion for the argument.
   The preceding argument is called valid if whenever each of the premises $p_1, p_2, p_3, \ldots, p_n$ is true, then the conclusion $q$ is likewise true. [Note that if any one of $p_1, p_2, p_3, \ldots, p_n$ is false, then the hypothesis $p_1 \land p_2 \land p_3 \land \cdots \land p_n$ is false and the implication $(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q$ is automatically true, regardless of the truth value of $q$.] Consequently, one way to establish the validity of a given argument is to show that the statement $(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q$ is a tautology.
                                      The following examples illustrate this particular approach.

EXAMPLE 2.19
Let $p$, $q$, $r$ denote the primitive statements given as

     $p$:   Roger studies.
     $q$:   Roger plays racketball.
     $r$:   Roger passes discrete mathematics.

Now let $p_1$, $p_2$, $p_3$ denote the premises

     $p_1$:   If Roger studies, then he will pass discrete mathematics.
     $p_2$:   If Roger doesn't play racketball, then he'll study.
     $p_3$:   Roger failed discrete mathematics.

We want to determine whether the argument

$$(p_1 \land p_2 \land p_3) \to q$$

is valid. To do so, we rewrite $p_1$, $p_2$, $p_3$ as

$$p_1\colon\ p \to r \qquad p_2\colon\ \neg q \to p \qquad p_3\colon\ \neg r$$

and examine the truth table for the implication

$$[(p \to r) \land (\neg q \to p) \land \neg r] \to q$$

given in Table 2.14. Because the final column in Table 2.14 contains all 1's, the implication is a tautology. Hence we can say that $(p_1 \land p_2 \land p_3) \to q$ is a valid argument.

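Table 2.14

                     $p_1$          $p_2$            $p_3$       $(p_1 \land p_2 \land p_3) \to q$
 $p$  $q$  $r$  |  $p \to r$  |  $\neg q \to p$  |  $\neg r$  |  $[(p \to r) \land (\neg q \to p) \land \neg r] \to q$
  0    0    0   |      1      |        0         |     1      |      1
  0    0    1   |      1      |        0         |     0      |      1
  0    1    0   |      1      |        1         |     1      |      1
  0    1    1   |      1      |        1         |     0      |      1
  1    0    0   |      0      |        1         |     1      |      1
  1    0    1   |      1      |        1         |     0      |      1
  1    1    0   |      0      |        1         |     1      |      1
  1    1    1   |      1      |        1         |     0      |      1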
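
The truth-table test for validity used in Example 2.19 is mechanical, so it can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `implies` and `is_valid` are assumptions introduced here, and each premise is written as a function of the primitive statements.

```python
from itertools import product

def implies(a, b):
    # Truth value of the implication a -> b.
    return (not a) or b

def is_valid(premises, conclusion, num_vars):
    # Valid means (p1 ^ ... ^ pn) -> q is true in every row of the table.
    for row in product([False, True], repeat=num_vars):
        hypothesis = all(premise(*row) for premise in premises)
        if not implies(hypothesis, conclusion(*row)):
            return False
    return True

# Example 2.19: p = Roger studies, q = Roger plays racketball,
# r = Roger passes discrete mathematics.
premises_2_19 = [
    lambda p, q, r: implies(p, r),      # p1: p -> r
    lambda p, q, r: implies(not q, p),  # p2: ~q -> p
    lambda p, q, r: not r,              # p3: ~r
]
conclusion_2_19 = lambda p, q, r: q

print(is_valid(premises_2_19, conclusion_2_19, 3))  # True
```

The loop visits all $2^n$ rows, which is exactly the cost that makes this approach unattractive once the number of primitive statements grows.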

EXAMPLE 2.20
Let us now consider the truth table in Table 2.15. The results in the last column of this table show that for any primitive statements $p$, $r$, and $s$, the implication

$$[p \land ((p \land r) \to s)] \to (r \to s)$$

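Table 2.15

 $p_1$                                   $p_2$                  $q$         $(p_1 \land p_2) \to q$
 $p$  $r$  $s$  |  $p \land r$  |  $(p \land r) \to s$  |  $r \to s$  |  $[p \land ((p \land r) \to s)] \to (r \to s)$
  0    0    0   |       0       |           1           |      1      |      1
  0    0    1   |       0       |           1           |      1      |      1
  0    1    0   |       0       |           1           |      0      |      1
  0    1    1   |       0       |           1           |      1      |      1
  1    0    0   |       0       |           1           |      1      |      1
  1    0    1   |       0       |           1           |      1      |      1
  1    1    0   |       1       |           0           |      0      |      1
  1    1    1   |       1       |           1           |      1      |      1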

is a tautology. Consequently, for premises

$$p_1\colon\ p \qquad p_2\colon\ (p \land r) \to s$$

and conclusion $q$: $(r \to s)$, we know that $(p_1 \land p_2) \to q$ is a valid argument, and we may say that the truth of the conclusion $q$ is deduced or inferred from the truth of the premises $p_1$, $p_2$.

The idea presented in the preceding two examples leads to the following.

Definition 2.4   If $p$, $q$ are arbitrary statements such that $p \to q$ is a tautology, then we say that $p$ logically implies $q$ and we write $p \Rightarrow q$ to denote this situation.

When $p$, $q$ are statements and $p \Rightarrow q$, the implication $p \to q$ is a tautology and we refer to $p \to q$ as a logical implication. Note that we can avoid dealing with the idea of a tautology here by saying that $p \Rightarrow q$ (that is, $p$ logically implies $q$) if $q$ is true whenever $p$ is true.
   In Example 2.6 we found that for primitive statements $p$, $q$, the implication $p \to (p \lor q)$ is a tautology. In this case, therefore, we can say that $p$ logically implies $p \lor q$ and write $p \Rightarrow (p \lor q)$. Furthermore, because of the first substitution rule, we also find that $p \Rightarrow (p \lor q)$ for any statements $p$, $q$; that is, $p \to (p \lor q)$ is a tautology for any statements $p$, $q$, whether or not they are primitive statements.
   Let $p$, $q$ be arbitrary statements.

   1) If $p \Leftrightarrow q$, then the statement $p \leftrightarrow q$ is a tautology, so the statements $p$, $q$ have the same (corresponding) truth values. Under these conditions the statements $p \to q$, $q \to p$ are tautologies, and we have $p \Rightarrow q$ and $q \Rightarrow p$.
   2) Conversely, suppose that $p \Rightarrow q$ and $q \Rightarrow p$. The logical implication $p \to q$ tells us that we never have statement $p$ with the truth value 1 and statement $q$ with the truth value 0. But could we have $q$ with the truth value 1 and $p$ with the truth value 0? If this occurred, we could not have the logical implication $q \to p$. Therefore, when $p \Rightarrow q$ and $q \Rightarrow p$, the statements $p$, $q$ have the same (corresponding) truth values and $p \Leftrightarrow q$.
Finally, the notation $p \not\Rightarrow q$ is used to indicate that $p \to q$ is not a tautology, so the given implication (namely, $p \to q$) is not a logical implication.

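EXAMPLE 2.21
From the results in Example 2.8 (Table 2.9) and the first substitution rule, we know that for statements $p$, $q$,

$$\neg(p \land q) \Leftrightarrow \neg p \lor \neg q.$$

Consequently,

$$\neg(p \land q) \Rightarrow (\neg p \lor \neg q) \qquad \text{and} \qquad (\neg p \lor \neg q) \Rightarrow \neg(p \land q)$$

for all statements $p$, $q$. Alternatively, because each of the implications

$$\neg(p \land q) \to (\neg p \lor \neg q) \qquad \text{and} \qquad (\neg p \lor \neg q) \to \neg(p \land q)$$

is a tautology, we may also write

$$[\neg(p \land q) \to (\neg p \lor \neg q)] \Leftrightarrow T_0 \qquad \text{and} \qquad [(\neg p \lor \neg q) \to \neg(p \land q)] \Leftrightarrow T_0.$$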
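
The relationship between logical implication and logical equivalence described above can be illustrated with a short Python sketch. The names `logically_implies` and `logically_equivalent` are assumptions made for this illustration; statements are modeled as Boolean functions of their primitive statements.

```python
from itertools import product

def logically_implies(f, g, num_vars):
    # f => g: g is true in every row where f is true.
    return all(
        g(*row)
        for row in product([False, True], repeat=num_vars)
        if f(*row)
    )

def logically_equivalent(f, g, num_vars):
    # f <=> g exactly when f => g and g => f.
    return logically_implies(f, g, num_vars) and logically_implies(g, f, num_vars)

not_both = lambda p, q: not (p and q)          # ~(p ^ q)
either_not = lambda p, q: (not p) or (not q)   # ~p v ~q

print(logically_equivalent(not_both, either_not, 2))                # True
print(logically_implies(lambda p, q: p, lambda p, q: p or q, 2))    # True
print(logically_implies(lambda p, q: p or q, lambda p, q: p, 2))    # False
```

The last two lines show that $p \Rightarrow (p \lor q)$ holds while the converse fails, so a logical implication need not be reversible.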

Returning now to our study of techniques for establishing the validity of an argument, we
                     must take a careful look at the size of Tables 2.14 and 2.15. Each table has eight rows. For
Table 2.14 we were able to express the three premises $p_1$, $p_2$, and $p_3$, and the conclusion $q$, in terms of the three primitive statements $p$, $q$, and $r$. A similar situation arose for the argument we analyzed in Table 2.15, where we had only two premises. But if we were confronted, for example, with establishing whether

$$[(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u] \to \neg p$$

is a logical implication (or presents a valid argument), the needed table would require $2^5 = 32$ rows. As the number of premises gets larger and our truth tables grow to 64, 128,
                     256, or more rows, this first technique for establishing the validity of an argument rapidly
                     loses its appeal.
                         Furthermore, looking at Table 2.14 once again, we realize that in order to establish
                     whether

$$[(p \to r) \land (\neg q \to p) \land \neg r] \to q$$

is a valid argument, we need to consider only those rows of the table where each of the three premises $p \to r$, $\neg q \to p$, and $\neg r$ has the truth value 1. (Remember that if the hypothesis, consisting of the conjunction of all of the premises, is false, then the implication is true
                     regardless of the truth value of the conclusion.) This happens only in the third row, so a
                     good deal of Table 2.14 is not really necessary. (It is not always the case that only one row
                     has all of the premises true. Note that in Table 2.15 we would be concerned with the results
                      in rows 5, 6, and 8.)
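
The row counts quoted above can be reproduced with a small Python sketch. It is only an illustration: `critical_rows` is a hypothetical helper name, and rows are numbered from 1 in the order 000, 001, ..., 111 used by Tables 2.14 and 2.15.

```python
from itertools import product

def implies(a, b):
    # Truth value of the implication a -> b.
    return (not a) or b

def critical_rows(premises, num_vars):
    # 1-based numbers of the rows in which every premise is true.
    rows = []
    for number, row in enumerate(product([0, 1], repeat=num_vars), start=1):
        if all(premise(*row) for premise in premises):
            rows.append(number)
    return rows

table_2_14 = [
    lambda p, q, r: implies(p, r),      # p -> r
    lambda p, q, r: implies(not q, p),  # ~q -> p
    lambda p, q, r: not r,              # ~r
]
table_2_15 = [
    lambda p, r, s: p,                    # p
    lambda p, r, s: implies(p and r, s),  # (p ^ r) -> s
]

print(critical_rows(table_2_14, 3))  # [3]
print(critical_rows(table_2_15, 3))  # [5, 6, 8]
```

Only these rows matter for validity; in every other row the hypothesis is false and the implication is true automatically.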
                          Consequently, what these observations are telling us is that we can possibly eliminate a
                      great deal of the effort put into constructing the truth tables in Table 2.14 and Table 2.15. And
                      since we want to avoid even larger tables, we are persuaded to develop a list of techniques
                      called rules of inference that will help us as follows:
   1) Using these techniques will enable us to consider only the cases wherein all the premises are true. Hence we consider the conclusion only for those rows of a truth table wherein each premise has the truth value 1, and we do not construct the truth table.
   2) The rules of inference are fundamental in the development of a step-by-step validation of how the conclusion $q$ logically follows from the premises $p_1, p_2, p_3, \ldots, p_n$ in an implication of the form

$$(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q.$$

Such a development will establish the validity of the given argument, for it will show
                            how the truth of the conclusion can be deduced from the truth of the premises.
                          Each rule of inference arises from a logical implication. In some cases, the logical
                      implication is stated without proof. (However, several of these proofs will be dealt with in
                      the Section Exercises.)
                          Many rules of inference arise in the study of logic. We concentrate on those that we need
                      to help us validate the arguments that arise in our study of logic. These rules will also help
                      us later when we turn to methods for proving theorems throughout the remainder of the
                      text. Table 2.19 (on p. 78) summarizes the rules we shall now start to investigate.

EXAMPLE 2.22
For a first example we consider the rule of inference called Modus Ponens, or the Rule of Detachment. (Modus Ponens comes from Latin and may be translated as "the method of affirming.") In symbolic form this rule is expressed by the logical implication

$$[p \land (p \to q)] \to q,$$

which is verified in Table 2.16, where we find that the fourth row is the only one where both of the premises $p$ and $p \to q$ (and the conclusion $q$) are true.

Table 2.16

 $p$  $q$  |  $p \to q$  |  $p \land (p \to q)$  |  $[p \land (p \to q)] \to q$
  0    0   |      1      |           0           |      1
  0    1   |      1      |           0           |      1
  1    0   |      0      |           0           |      1
  1    1   |      1      |           1           |      1

The actual rule will be written in the tabular form

$$\begin{array}{l} p \\ p \to q \\ \hline \therefore\ q \end{array}$$

where the three dots ($\therefore$) stand for the word "therefore," indicating that $q$ is the conclusion for the premises $p$ and $p \to q$, which appear above the horizontal line.
   This rule arises when we argue that if (1) $p$ is true, and (2) $p \to q$ is true (or $p \Rightarrow q$), then the conclusion $q$ must also be true. (After all, if $q$ were false and $p$ were true, then we could not have $p \to q$ true.)

The following valid arguments show us how to apply the Rule of Detachment.
     a) 1) Lydia wins a ten-million-dollar lottery.                                       $p$
        2) If Lydia wins a ten-million-dollar lottery, then Kay will quit her job.        $p \to q$
        3) Therefore Kay will quit her job.                                               $\therefore\ q$
     b) 1) If Allison vacations in Paris, then she will have to win a scholarship.        $p \to q$
        2) Allison is vacationing in Paris.                                               $p$
        3) Therefore Allison won a scholarship.                                           $\therefore\ q$

   Before closing the discussion on our first rule of inference let us make one final observation. The two examples in (a) and (b) might suggest that the valid argument $[p \land (p \to q)] \to q$ is appropriate only for primitive statements $p$, $q$. However, since $[p \land (p \to q)] \to q$ is a tautology for primitive statements $p$, $q$, it follows from the first substitution rule that (all occurrences of) $p$ or $q$ may be replaced by compound statements, and the resulting implication will also be a tautology. Consequently, if $r$, $s$, $t$, and $u$ are primitive statements, then

$$\begin{array}{l} r \lor s \\ (r \lor s) \to (\neg t \land u) \\ \hline \therefore\ \neg t \land u \end{array}$$

is a valid argument, by the Rule of Detachment, just as $[(r \lor s) \land [(r \lor s) \to (\neg t \land u)]] \to (\neg t \land u)$ is a tautology.

A similar situation — in which we can apply the first substitution rule — occurs for each
                            of the rules of inference we shall study. However, we shall not mention this so explicitly
                            with these other rules of inference.

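EXAMPLE 2.23
A second rule of inference is given by the logical implication

$$[(p \to q) \land (q \to r)] \to (p \to r),$$

where $p$, $q$, and $r$ are any statements. In tabular form it is written

$$\begin{array}{l} p \to q \\ q \to r \\ \hline \therefore\ p \to r \end{array}$$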

This rule, which is referred to as the Law of the Syllogism, arises in many arguments. For example, we may use it as follows:
   1) If the integer 35244 is divisible by 396, then the integer 35244 is divisible by 66.               $p \to q$
   2) If the integer 35244 is divisible by 66, then the integer 35244 is divisible by 3.                 $q \to r$
   3) Therefore, if the integer 35244 is divisible by 396, then the integer 35244 is divisible by 3.     $\therefore\ p \to r$
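
As a quick numerical illustration of this use of the Law of the Syllogism, the sketch below checks the two premises and the conclusion for a range of integers, including 35244. The helper names and the range bound 50000 are arbitrary choices made for this illustration, not part of the text.

```python
def implies(a, b):
    # Truth value of the implication a -> b.
    return (not a) or b

def premises_and_conclusion(n):
    p = n % 396 == 0   # n is divisible by 396
    q = n % 66 == 0    # n is divisible by 66
    r = n % 3 == 0     # n is divisible by 3
    return implies(p, q), implies(q, r), implies(p, r)

# Both premises hold for every n tested (396 = 6 * 66 and 66 = 22 * 3),
# so the conclusion p -> r holds as well.
print(all(all(premises_and_conclusion(n)) for n in range(1, 50001)))  # True

# For n = 35244 the statements p, q, r happen to be true themselves.
print(35244 % 396, 35244 % 66, 35244 % 3)  # 0 0 0
```

A finite check like this only illustrates the rule; the rule itself rests on the tautology $[(p \to q) \land (q \to r)] \to (p \to r)$.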

The next example involves a slightly longer argument that uses the rules of inference
                            developed in Examples 2.22 and 2.23. In fact, we find here that there may be more than one
                            way to establish the validity of an argument.

EXAMPLE 2.24
Consider the following argument.
   1) Rita is baking a cake.
   2) If Rita is baking a cake, then she is not practicing her flute.
   3) If Rita is not practicing her flute, then her father will not buy her a car.
   4) Therefore Rita's father will not buy her a car.

Concentrating on the forms of the statements in the preceding argument, we may write the argument as

$$\begin{array}{l} p \\ p \to \neg q \\ \neg q \to \neg r \\ \hline \therefore\ \neg r \end{array} \qquad (*)$$

Now we need no longer worry about what the statements actually stand for. Our objective is to use the two rules of inference that we have studied so far in order to deduce the truth of the statement $\neg r$ from the truth of the three premises $p$, $p \to \neg q$, and $\neg q \to \neg r$.

We establish the validity of the argument as follows:
   Steps                          Reasons
   1) $p \to \neg q$              Premise
   2) $\neg q \to \neg r$         Premise
   3) $p \to \neg r$              This follows from steps (1) and (2) and the Law of the Syllogism
   4) $p$                         Premise
   5) $\therefore\ \neg r$        This follows from steps (4) and (3) and the Rule of Detachment
   Before continuing with a third rule of inference we shall show that the argument presented at (*) can be validated in a second way. Here our "reasons" will be shortened to the form we shall use for the rest of the section. However, we shall always list whatever is needed to demonstrate how each step in an argument comes about, or follows, from prior steps.
   A second way to validate the argument follows.
   Steps                          Reasons
   1) $p$                         Premise
   2) $p \to \neg q$              Premise
   3) $\neg q$                    Steps (1) and (2) and the Rule of Detachment
   4) $\neg q \to \neg r$         Premise
   5) $\therefore\ \neg r$        Steps (3) and (4) and the Rule of Detachment
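
The second validation applies the Rule of Detachment twice in succession. A minimal Python sketch of that idea follows; it is an illustration under stated assumptions, not a general proof checker. Statements are plain strings such as `"not q"`, implications are (antecedent, consequent) pairs, and the function name `detach` is hypothetical.

```python
def detach(facts, implications):
    """Apply the Rule of Detachment repeatedly until nothing new follows."""
    known = set(facts)
    steps = []
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in implications:
            # From antecedent and (antecedent -> consequent), infer consequent.
            if antecedent in known and consequent not in known:
                known.add(consequent)
                steps.append((antecedent, consequent))
                changed = True
    return known, steps

facts = {"p"}                                        # premise p
implications = [("p", "not q"), ("not q", "not r")]  # p -> ~q and ~q -> ~r

known, steps = detach(facts, implications)
for antecedent, consequent in steps:
    print(f"{antecedent}, {antecedent} -> {consequent}, therefore {consequent}")
print("not r" in known)  # True
```

The two recorded steps correspond to steps (3) and (5) of the second validation. The sketch handles only detachment; the Law of the Syllogism used in the first validation is not modeled.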

EXAMPLE 2.25
The rule of inference called Modus Tollens is given by

$$\begin{array}{l} p \to q \\ \neg q \\ \hline \therefore\ \neg p \end{array}$$

This follows from the logical implication $[(p \to q) \land \neg q] \to \neg p$. Modus Tollens comes from Latin and can be translated as "method of denying." This is appropriate because we deny the conclusion, $q$, so as to prove $\neg p$. (Note that we can also obtain this rule from the one for Modus Ponens by using the fact that $p \to q \Leftrightarrow \neg q \to \neg p$.)
   The following exemplifies the use of Modus Tollens in making a valid inference:
   1)   If Connie is elected president of Phi Delta sorority, then Helen will pledge that sorority.      $p \to q$
   2)   Helen did not pledge Phi Delta sorority.                                                         $\neg q$
   3)   Therefore Connie was not elected president of Phi Delta sorority.                                $\therefore\ \neg p$
   And now we shall use Modus Tollens to show that the following argument is valid (for primitive statements $p$, $r$, $s$, $t$, and $u$).

$$\begin{array}{l} p \to r \\ r \to s \\ t \lor \neg s \\ \neg t \lor u \\ \neg u \\ \hline \therefore\ \neg p \end{array}$$

Both Modus Tollens and the Law of the Syllogism come into play, along with the logical equivalence we developed in Example 2.7.
74   Chapter 2 Fundamentals of Logic

                           Steps                              Reasons
                           1) $p \to r$, $r \to s$            Premises
                           2) $p \to s$                       Step (1) and the Law of the Syllogism
                           3) $t \lor \neg s$                 Premise
                           4) $\neg s \lor t$                 Step (3) and the Commutative Law of $\lor$
                           5) $s \to t$                       Step (4) and the fact that $\neg s \lor t \Leftrightarrow s \to t$
                           6) $p \to t$                       Steps (2) and (5) and the Law of the Syllogism
                           7) $\neg t \lor u$                 Premise
                           8) $t \to u$                       Step (7) and the fact that $\neg t \lor u \Leftrightarrow t \to u$
                           9) $p \to u$                       Steps (6) and (8) and the Law of the Syllogism
                          10) $\neg u$                        Premise
                          11) $\therefore \neg p$             Steps (9) and (10) and Modus Tollens

Before continuing with another rule of inference let us summarize what we have just
                     accomplished (and not accomplished). The preceding argument shows that

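                   $[(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u] \Rightarrow \neg p.$

We have not used the laws of logic, as in Section 2.2, to express the statement

                   $(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u$

as a simpler logically equivalent statement. Note that

                   $[(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u] \not\Leftrightarrow \neg p.$

For when $p$ has the truth value 0 and $u$ has the truth value 1, the truth value of $\neg p$ is 1 while
                     that of $\neg u$ and $(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u$ is 0.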
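
The difference between the logical implication and the (failed) logical equivalence for the argument of Example 2.25 can also be checked by brute force over all 32 truth value assignments. The following Python sketch is only an illustration; the helper names `implies` and `premises` are our own assumptions and do not come from the text.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b

def premises(p, r, s, t, u):
    """Conjunction of the five premises of the argument in Example 2.25."""
    return (implies(p, r) and implies(r, s) and (t or not s)
            and ((not t) or u) and (not u))

rows = list(product([False, True], repeat=5))

# The implication (premises -> not p) is true in every row: a tautology.
implication_is_tautology = all(
    implies(premises(p, r, s, t, u), not p) for p, r, s, t, u in rows)

# Rows where the conjunction of the premises and (not p) have different
# truth values; any such row shows the two are not logically equivalent.
disagreements = [
    (p, r, s, t, u) for p, r, s, t, u in rows
    if premises(p, r, s, t, u) != (not p)]

print(implication_is_tautology)   # True
print(len(disagreements))         # 15
```

Among the 15 disagreeing rows is every assignment with $p$: 0 and $u$: 1, which is the case singled out above.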

Let us once more examine a tabular form for each of the two related rules of inference,
                     Modus Ponens and Modus Tollens.

Modus Ponens:     $p \to q$               Modus Tollens:     $p \to q$
                  $p$                                        $\neg q$
                  $\therefore q$                             $\therefore \neg p$

The reason we wish to do this is that there are other tabular forms that may arise   — and
                     these are similar in appearance but present invalid arguments — where each of the premises
                     is true but the conclusion is false.
                         a) Consider the following argument:
                            1) If Margaret Thatcher is the president of the United States, then
                               she is at least 35 years old.                                            $p \to q$
                            2) Margaret Thatcher is at least 35 years old.                              $q$
                            3) Therefore Margaret Thatcher is the president of the United States.       $\therefore p$
                               Here we find that $[(p \to q) \land q] \to p$ is not a tautology. For if we consider the truth
                               value assignments $p$: 0 and $q$: 1, then each of the premises $p \to q$ and $q$ is true
                               while the conclusion $p$ is false. This invalid argument results from the fallacy
                               (error in reasoning) where we try to argue by the converse; that is, while
                               $[(p \to q) \land p] \Rightarrow q$, it is not the case that $[(p \to q) \land q] \Rightarrow p$.
                                                              2.3 Logical implication: Rules of Inference             75

                         b) A second argument where the conclusion doesn't necessarily follow from the premises
                            may be given by:
                            1) If 2 + 3 = 6, then 2 + 4 = 6.                                            $p \to q$
                            2) 2 + 3 ≠ 6.                                                               $\neg p$
                            3) Therefore 2 + 4 ≠ 6.                                                     $\therefore \neg q$
                               In this case we find that $[(p \to q) \land \neg p] \to \neg q$ is not a tautology. Once again
                               the truth value assignments $p$: 0 and $q$: 1 show us that the premises $p \to q$ and $\neg p$
                               can both be true while the conclusion $\neg q$ is false. The fallacy behind this invalid
                               argument arises from our attempt to argue by the inverse, for although
                               $[(p \to q) \land \neg q] \Rightarrow \neg p$, it does not follow that $[(p \to q) \land \neg p] \Rightarrow \neg q$.
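
The contrast between the two valid forms and the two fallacies can be made concrete by searching the four rows of a truth table for assignments in which every premise is true and the conclusion is false. The Python sketch below is a minimal illustration; the names `counterexamples` and `forms` are assumptions introduced here, not notation from the text.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b

def counterexamples(premises, conclusion):
    """Return the (p, q) rows where all premises are true and the conclusion is false."""
    return [(p, q) for p, q in product([False, True], repeat=2)
            if all(f(p, q) for f in premises) and not conclusion(p, q)]

cond = lambda p, q: implies(p, q)   # the shared premise p -> q

# Each form: (list of premises, conclusion), all as functions of (p, q).
forms = {
    "Modus Ponens":  ([cond, lambda p, q: p],     lambda p, q: q),
    "Modus Tollens": ([cond, lambda p, q: not q], lambda p, q: not p),
    "converse":      ([cond, lambda p, q: q],     lambda p, q: p),
    "inverse":       ([cond, lambda p, q: not p], lambda p, q: not q),
}

for name, (prem, concl) in forms.items():
    print(name, counterexamples(prem, concl))
# Modus Ponens []
# Modus Tollens []
# converse [(False, True)]
# inverse [(False, True)]
```

An empty list means the form is valid; for both fallacies the single counterexample is the assignment $p$: 0, $q$: 1 used in parts (a) and (b).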

Before proceeding further we now mention a rather simple but important rule of infer-
               ence.

EXAMPLE 2.26
The following rule of inference arises from the observation that if $p$, $q$ are true statements,
               then $p \land q$ is a true statement.
                   Now suppose that statements $p$, $q$ occur in the development of an argument. These
               statements may be (given) premises or results that are derived from premises and/or from
               results developed earlier in the argument. Then under these circumstances the two statements
               $p$, $q$ can be combined into their conjunction $p \land q$, and this new statement can be used in
               later steps as the argument continues.
                   We call this rule the Rule of Conjunction and write it in tabular form as

                   $p$
                   $q$
                   $\therefore p \land q$

As we proceed further with our study of rules of inference, we find another fairly simple
               but important rule.

EXAMPLE 2.27
The following rule of inference (one we may feel just illustrates good old common sense)
               is called the Rule of Disjunctive Syllogism. This rule comes about from the logical implication

                   $[(p \lor q) \land \neg p] \to q,$

               which we can derive from Modus Ponens by observing that $p \lor q \Leftrightarrow \neg p \to q$.
                  In tabular form we write

                   $p \lor q$
                   $\neg p$
                   $\therefore q$
This rule of inference arises when there are exactly two possibilities to consider and we are
               able to eliminate one of them as being true. Then the other possibility has to be true. The
               following illustrates one such application of this rule.
                  1)   Bart's wallet is in his back pocket or it is on his desk.       $p \lor q$
                  2)   Bart's wallet is not in his back pocket.                        $\neg p$
                  3)   Therefore Bart's wallet is on his desk.                         $\therefore q$

At this point we have examined five rules of inference. But before we try to validate any
                            more arguments like the one (with 11 steps) in Example 2.25, we shall look at one more
                            of these rules. This one underlies a method of proof that is sometimes confused with the
                            contrapositive method (or proof) given in Modus Tollens. The confusion arises because
                            both methods involve the negation of a statement. However, we will soon realize that these
                            are two distinct methods. (Toward the end of Section 2.5 we shall compare and contrast
                            these two methods once again.)

EXAMPLE 2.28
Let $p$ denote an arbitrary statement, and $F_0$ a contradiction. The results in column 5 of Table
                            2.17 show that the implication $(\neg p \to F_0) \to p$ is a tautology, and this provides us with
                            the rule of inference called the Rule of Contradiction. In tabular form this rule is written as

                   $\neg p \to F_0$
                   $\therefore p$

Table 2.17

 $p$ | $\neg p$ | $F_0$ | $\neg p \to F_0$ | $(\neg p \to F_0) \to p$
  1  |    0     |   0   |        1         |            1
  0  |    1     |   0   |        0         |            1

This rule tells us that if $p$ is a statement and $\neg p \to F_0$ is true, then $\neg p$ must be false because
                            $F_0$ is false. So then we have $p$ true.
                                The Rule of Contradiction is the basis of a method for establishing the validity of an
                            argument — namely, the method of Proof by Contradiction, or Reductio ad Absurdum. The
                            idea behind the method of Proof by Contradiction is to establish a statement (namely, the
                            conclusion of an argument) by showing that, if this statement were false, then we would
                            be able to deduce an impossible consequence. The use of this method arises in certain
                            arguments which we shall now describe.
                                In general, when we want to establish the validity of the argument

                   $(p_1 \land p_2 \land \cdots \land p_n) \to q,$

                            we can establish the validity of the logically equivalent argument

                   $(p_1 \land p_2 \land \cdots \land p_n \land \neg q) \to F_0.$

[This follows from the tautology in column 7 of Table 2.18 and the first substitution rule,
                            where we replace the primitive statement $p$ by the statement $(p_1 \land p_2 \land \cdots \land p_n)$†.]

Table 2.18

 $p$ | $q$ | $F_0$ | $p \land \neg q$ | $(p \land \neg q) \to F_0$ | $p \to q$ | $(p \to q) \leftrightarrow [(p \land \neg q) \to F_0]$
  0  |  0  |   0   |        0         |             1              |     1     |     1
  0  |  1  |   0   |        0         |             1              |     1     |     1
  1  |  0  |   0   |        1         |             0              |     0     |     1
  1  |  1  |   0   |        0         |             1              |     1     |     1

†In Section 4.2 we shall provide the reason why we know that for any statements $p_1, p_2, \ldots, p_n$, and $q$, it
                            follows that $(p_1 \land p_2 \land \cdots \land p_n) \land \neg q \Leftrightarrow p_1 \land p_2 \land \cdots \land p_n \land \neg q$.
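
Both tautologies behind the method of Proof by Contradiction are small enough to confirm mechanically. The Python sketch below is offered only as an illustration; modeling the contradiction $F_0$ as the constant `False`, and the names `implies`, `rule_holds`, and `columns_agree`, are assumptions made here.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b

F0 = False  # a contradiction has truth value 0 in every row

# Rule of Contradiction: (not p -> F0) -> p is true for both values of p.
rule_holds = all(implies(implies(not p, F0), p) for p in (False, True))

# Column 7 of Table 2.18: p -> q and (p and not q) -> F0 have the same
# truth value in each of the four rows.
columns_agree = all(
    implies(p, q) == implies(p and not q, F0)
    for p, q in product([False, True], repeat=2))

print(rule_holds, columns_agree)   # True True
```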

When we apply the method of Proof by Contradiction, we first assume that what we are
               trying to validate (or prove) is actually false. Then we use this assumption as an additional
               premise in order to produce a contradiction (or impossible situation) of the form $s \land \neg s$, for
               some statement s. Once we have derived this contradiction we may then conclude that the
               statement we were given was in fact true — and this validates the argument (or completes
               the proof).
                   We shall turn to the method of Proof by Contradiction when it is (or appears to be) easier
               to use $\neg q$ in conjunction with the premises $p_1, p_2, \ldots, p_n$ in order to deduce a contradiction
               than it is to deduce the conclusion $q$ directly from the premises $p_1, p_2, \ldots, p_n$. The method
               of Proof by Contradiction will be used in some of the later examples for this section—
               namely, Examples 2.32 and 2.35. We shall also find it frequently reappearing in other
               chapters in the text.

Now that we have examined six rules of inference, we summarize these rules and intro-
               duce several others in Table 2.19 (on the following page).

The next five examples will present valid arguments. In so doing, these examples will
               show us how to apply the rules listed in Table 2.19 in conjunction with other results, such
               as the laws of logic.

EXAMPLE 2.29
Our first example demonstrates the validity of the argument

                   $p \to r$
                   $\neg p \to q$
                   $q \to s$
                   $\therefore \neg r \to s$

                  Steps                            Reasons
                  1) $p \to r$                     Premise
                  2) $\neg r \to \neg p$           Step (1) and $p \to r \Leftrightarrow \neg r \to \neg p$
                  3) $\neg p \to q$                Premise
                  4) $\neg r \to q$                Steps (2) and (3) and the Law of the Syllogism
                  5) $q \to s$                     Premise
                  6) $\therefore \neg r \to s$     Steps (4) and (5) and the Law of the Syllogism
               A second way to validate the given argument proceeds as follows.
                  Steps                            Reasons
                  1) $p \to r$                     Premise
                  2) $q \to s$                     Premise
                  3) $\neg p \to q$                Premise
                  4) $p \lor q$                    Step (3) and $(\neg p \to q) \Leftrightarrow (\neg\neg p \lor q) \Leftrightarrow (p \lor q)$, where the
                                                   second logical equivalence follows by the Law of Double Negation
                  5) $r \lor s$                    Steps (1), (2), and (4) and the Rule of the Constructive Dilemma
                  6) $\therefore \neg r \to s$     Step (5) and $(r \lor s) \Leftrightarrow (\neg\neg r \lor s) \Leftrightarrow (\neg r \to s)$, where the Law of
                                                   Double Negation is used in the first logical equivalence

The next example is somewhat more involved.

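Table 2.19

      Rule of Inference                Related Logical Implication                                                          Name of Rule

  1)  $p$                              $[p \land (p \to q)] \to q$                                                          Rule of Detachment
      $p \to q$                                                                                                             (Modus Ponens)
      $\therefore q$

  2)  $p \to q$                        $[(p \to q) \land (q \to r)] \to (p \to r)$                                          Law of the Syllogism
      $q \to r$
      $\therefore p \to r$

  3)  $p \to q$                        $[(p \to q) \land \neg q] \to \neg p$                                                Modus Tollens
      $\neg q$
      $\therefore \neg p$

  4)  $p$                                                                                                                   Rule of Conjunction
      $q$
      $\therefore p \land q$

  5)  $p \lor q$                       $[(p \lor q) \land \neg p] \to q$                                                    Rule of Disjunctive
      $\neg p$                                                                                                              Syllogism
      $\therefore q$

  6)  $\neg p \to F_0$                 $(\neg p \to F_0) \to p$                                                             Rule of
      $\therefore p$                                                                                                        Contradiction

  7)  $p \land q$                      $(p \land q) \to p$                                                                  Rule of Conjunctive
      $\therefore p$                                                                                                        Simplification

  8)  $p$                              $p \to p \lor q$                                                                     Rule of Disjunctive
      $\therefore p \lor q$                                                                                                 Amplification

  9)  $p \land q$                      $[(p \land q) \land [p \to (q \to r)]] \to r$                                        Rule of Conditional
      $p \to (q \to r)$                                                                                                     Proof
      $\therefore r$

 10)  $p \to r$                        $[(p \to r) \land (q \to r)] \to [(p \lor q) \to r]$                                 Rule for Proof
      $q \to r$                                                                                                             by Cases
      $\therefore (p \lor q) \to r$

 11)  $p \to q$                        $[(p \to q) \land (r \to s) \land (p \lor r)] \to (q \lor s)$                        Rule of the
      $r \to s$                                                                                                             Constructive
      $p \lor r$                                                                                                            Dilemma
      $\therefore q \lor s$

 12)  $p \to q$                        $[(p \to q) \land (r \to s) \land (\neg q \lor \neg s)] \to (\neg p \lor \neg r)$    Rule of the
      $r \to s$                                                                                                             Destructive
      $\neg q \lor \neg s$                                                                                                  Dilemma
      $\therefore \neg p \lor \neg r$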
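
Each related logical implication listed in Table 2.19 is a tautology, and this can be confirmed by enumerating the 16 truth value assignments for $p$, $q$, $r$, $s$. The Python sketch below is only illustrative: the dictionary `rules`, its shortened rule names, and the choice to model $F_0$ as `False` are assumptions made here. Rule 4 is omitted because the table lists no related implication for it.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b

F0 = False  # a contradiction has truth value 0

# Rule name -> related logical implication as a function of (p, q, r, s).
rules = {
    "Detachment": lambda p, q, r, s: implies(p and implies(p, q), q),
    "Syllogism": lambda p, q, r, s: implies(
        implies(p, q) and implies(q, r), implies(p, r)),
    "Modus Tollens": lambda p, q, r, s: implies(implies(p, q) and not q, not p),
    "Disjunctive Syllogism": lambda p, q, r, s: implies((p or q) and not p, q),
    "Contradiction": lambda p, q, r, s: implies(implies(not p, F0), p),
    "Conjunctive Simplification": lambda p, q, r, s: implies(p and q, p),
    "Disjunctive Amplification": lambda p, q, r, s: implies(p, p or q),
    "Conditional Proof": lambda p, q, r, s: implies(
        (p and q) and implies(p, implies(q, r)), r),
    "Proof by Cases": lambda p, q, r, s: implies(
        implies(p, r) and implies(q, r), implies(p or q, r)),
    "Constructive Dilemma": lambda p, q, r, s: implies(
        implies(p, q) and implies(r, s) and (p or r), q or s),
    "Destructive Dilemma": lambda p, q, r, s: implies(
        implies(p, q) and implies(r, s) and ((not q) or (not s)),
        (not p) or (not r)),
}

rows = list(product([False, True], repeat=4))

# Names of any implications that are false in at least one row.
failures = [name for name, f in rules.items()
            if not all(f(*row) for row in rows)]

print(failures)   # []
```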

EXAMPLE 2.30
Establish the validity of the argument

                   $p \to q$
                   $q \to (r \land s)$
                   $\neg r \lor (\neg t \lor u)$
                   $p \land t$
                   $\therefore u$

                    Steps                                  Reasons
                    1) $p \to q$                           Premise
                    2) $q \to (r \land s)$                 Premise
                    3) $p \to (r \land s)$                 Steps (1) and (2) and the Law of the Syllogism
                    4) $p \land t$                         Premise
                    5) $p$                                 Step (4) and the Rule of Conjunctive Simplification
                    6) $r \land s$                         Steps (5) and (3) and the Rule of Detachment
                    7) $r$                                 Step (6) and the Rule of Conjunctive Simplification
                    8) $\neg r \lor (\neg t \lor u)$       Premise
                    9) $\neg (r \land t) \lor u$           Step (8), the Associative Law of $\lor$, and DeMorgan's Laws
                   10) $t$                                 Step (4) and the Rule of Conjunctive Simplification
                   11) $r \land t$                         Steps (7) and (10) and the Rule of Conjunction
                   12) $\therefore u$                      Steps (9) and (11), the Law of Double Negation, and the
                                                           Rule of Disjunctive Syllogism

EXAMPLE 2.31
This example will provide a way to show that the following argument is valid.
                           If the band could not play rock music or the refreshments were not delivered
                        on time, then the New Year’s party would have been canceled and Alicia would
                        have been angry. If the party were canceled, then refunds would have had to be
                        made. No refunds were made.
                           Therefore the band could play rock music.
                  First we convert the given argument into symbolic form by using the following statement
               assignments:

                   $p$:  The band could play rock music.
                   $q$:  The refreshments were delivered on time.
                   $r$:  The New Year's party was canceled.
                   $s$:  Alicia was angry.
                   $t$:  Refunds had to be made.

The argument above now becomes

                   $(\neg p \lor \neg q) \to (r \land s)$
                   $r \to t$
                   $\neg t$
                   $\therefore p$
               We can establish the validity of this argument as follows.
                  Steps                                            Reasons
                  1) $r \to t$                                     Premise
                  2) $\neg t$                                      Premise
                  3) $\neg r$                                      Steps (1) and (2) and Modus Tollens
                  4) $\neg r \lor \neg s$                          Step (3) and the Rule of Disjunctive Amplification
                  5) $\neg (r \land s)$                            Step (4) and DeMorgan's Laws
                  6) $(\neg p \lor \neg q) \to (r \land s)$        Premise
                  7) $\neg (\neg p \lor \neg q)$                   Steps (6) and (5) and Modus Tollens
                  8) $p \land q$                                   Step (7), DeMorgan's Laws, and the Law of Double
                                                                   Negation
                  9) $\therefore p$                                Step (8) and the Rule of Conjunctive Simplification

EXAMPLE 2.32
In this instance we shall use the method of Proof by Contradiction. Consider the argument

                   $\neg p \leftrightarrow q$
                   $q \to r$
                   $\neg r$
                   $\therefore p$

To establish the validity for this argument, we assume the negation $\neg p$ of the conclusion $p$
                       as another premise. The objective now is to use these four premises to derive a contradiction
                        $F_0$. Our derivation follows.

                             Steps                                                Reasons
                             1) $\neg p \leftrightarrow q$                        Premise
                             2) $(\neg p \to q) \land (q \to \neg p)$             Step (1) and $(\neg p \leftrightarrow q) \Leftrightarrow [(\neg p \to q) \land (q \to \neg p)]$
                             3) $\neg p \to q$                                    Step (2) and the Rule of Conjunctive Simplification
                             4) $q \to r$                                         Premise
                             5) $\neg p \to r$                                    Steps (3) and (4) and the Law of the Syllogism
                             6) $\neg p$                                          Premise (the one assumed)
                             7) $r$                                               Steps (5) and (6) and the Rule of Detachment
                             8) $\neg r$                                          Premise
                             9) $r \land \neg r$ $(\Leftrightarrow F_0)$          Steps (7) and (8) and the Rule of Conjunction
                            10) $\therefore p$                                    Steps (6) and (9) and the method of Proof by
                                                                                  Contradiction
                          If we examine further what has happened here, we find that

                   $[(\neg p \leftrightarrow q) \land (q \to r) \land \neg r \land \neg p] \Rightarrow F_0.$

                       This requires the truth value of $[(\neg p \leftrightarrow q) \land (q \to r) \land \neg r \land \neg p]$ to be 0. Because
                        $\neg p \leftrightarrow q$, $q \to r$, and $\neg r$ are the given premises, each of these statements has the truth
                        value 1. Consequently, for $[(\neg p \leftrightarrow q) \land (q \to r) \land \neg r \land \neg p]$ to have the truth value 0, the
                        statement $\neg p$ must have the truth value 0. Therefore $p$ has the truth value 1, and the
                        conclusion $p$ of the argument is true.
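
The reasoning above says that the three given premises together with the assumed premise $\neg p$ can never all be true at once. A brute-force check over the eight assignments for $p$, $q$, $r$ is sketched below in Python; the function name `premises` and the variable names are our own assumptions, and the biconditional is modeled as equality of truth values.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b

def premises(p, q, r):
    """The three given premises of Example 2.32: (not p <-> q), q -> r, not r."""
    return ((not p) == q) and implies(q, r) and (not r)

rows = list(product([False, True], repeat=3))

# Rows satisfying the given premises together with the assumed premise not p.
with_assumption = [(p, q, r) for p, q, r in rows if premises(p, q, r) and not p]

# Rows satisfying the given premises alone.
models = [(p, q, r) for p, q, r in rows if premises(p, q, r)]

print(with_assumption)   # []
print(models)            # [(True, False, False)]
```

The empty first list reflects the derived contradiction; the second list shows that the only assignment making all three premises true has $p$: 1.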

Before we consider our next example, we need to examine columns 5 and 7 of Table
                        2.20. These identical columns tell us that for primitive statements $p$, $q$, and $r$,

                   $[p \to (q \to r)] \Leftrightarrow [(p \land q) \to r].$

Using the first substitution rule, let us replace each occurrence of $p$ by the compound
                        statement $(p_1 \land p_2 \land \cdots \land p_n)$. Then we obtain the new result

                   $[(p_1 \land p_2 \land \cdots \land p_n) \to (q \to r)] \Leftrightarrow [(p_1 \land p_2 \land \cdots \land p_n \land q) \to r].$*

*In Section 4.2 we shall present a formal proof of why

                   $(p_1 \land p_2 \land \cdots \land p_n) \land q \Leftrightarrow p_1 \land p_2 \land \cdots \land p_n \land q.$

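Table 2.20

 $p$ | $q$ | $r$ | $p \land q$ | $(p \land q) \to r$ | $q \to r$ | $p \to (q \to r)$
  0  |  0  |  0  |      0      |          1          |     1     |         1
  0  |  0  |  1  |      0      |          1          |     1     |         1
  0  |  1  |  0  |      0      |          1          |     0     |         1
  0  |  1  |  1  |      0      |          1          |     1     |         1
  1  |  0  |  0  |      0      |          1          |     1     |         1
  1  |  0  |  1  |      0      |          1          |     1     |         1
  1  |  1  |  0  |      1      |          0          |     0     |         0
  1  |  1  |  1  |      1      |          1          |     1     |         1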

This result tells us that if we wish to establish the validity of the argument (*) we may be
               able to do so by establishing the validity of the corresponding argument (**).

                   (*)   $p_1$                         (**)   $p_1$
                         $p_2$                                $p_2$
                         $\vdots$                             $\vdots$
                         $p_n$                                $p_n$
                         $\therefore q \to r$                 $q$
                                                              $\therefore r$

After all, suppose we want to show that $q \to r$ has the truth value 1, when each of
               $p_1, p_2, \ldots, p_n$ does. If the truth value for $q$ is 0, then there is nothing left to do, since
               the truth value for $q \to r$ is 1. Hence the real problem is to show that $q \to r$ has truth
               value 1, when each of $p_1, p_2, \ldots, p_n$, and $q$ does; that is, we need to show that when
               $p_1, p_2, \ldots, p_n, q$ each have truth value 1, then the truth value of $r$ is 1.
                      We demonstrate this principle in the next example.

EXAMPLE 2.33
In order to establish the validity of the argument

               (*)                 $u \to r$
                                   $(r \land s) \to (p \lor t)$
                                   $q \to (u \land s)$
                                   $\neg t$
                                   $\therefore q \to p$

we consider the corresponding argument

               (**)                $u \to r$
                                   $(r \land s) \to (p \lor t)$
                                   $q \to (u \land s)$
                                   $\neg t$
                                   $q$
                                   $\therefore p$

[Note that $q$ is the hypothesis of the conclusion $q \to p$ for argument (*) and that it becomes
               another premise for argument (**) where the conclusion is $p$.]

To validate the argument (**) we proceed as follows.
                               Steps                                    Reasons
                                1) $q$                                  Premise
                                2) $q \to (u \land s)$                  Premise
                                3) $u \land s$                          Steps (1) and (2) and the Rule of Detachment
                                4) $u$                                  Step (3) and the Rule of Conjunctive Simplification
                                5) $u \to r$                            Premise
                                6) $r$                                  Steps (4) and (5) and the Rule of Detachment
                                7) $s$                                  Step (3) and the Rule of Conjunctive Simplification
                                8) $r \land s$                          Steps (6) and (7) and the Rule of Conjunction
                                9) $(r \land s) \to (p \lor t)$         Premise
                               10) $p \lor t$                           Steps (8) and (9) and the Rule of Detachment
                               11) $\neg t$                             Premise
                               12) $\therefore p$                       Steps (10) and (11) and the Rule of Disjunctive Syllogism
                               We now know that for argument (**)

                   $[(u \to r) \land [(r \land s) \to (p \lor t)] \land [q \to (u \land s)] \land \neg t \land q] \Rightarrow p,$

                            and for argument (*) it follows that

                   $[(u \to r) \land [(r \land s) \to (p \lor t)] \land [q \to (u \land s)] \land \neg t] \Rightarrow (q \to p).$
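
As a cross-check on the derivation, both forms of the argument in Example 2.33 can be tested over all 64 truth value assignments. The Python sketch below is illustrative only; the names `shared_premises`, `star_valid`, and `double_star_valid` are assumptions introduced here.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b

def shared_premises(p, q, r, s, t, u):
    """The four premises common to arguments (*) and (**)."""
    return (implies(u, r) and implies(r and s, p or t)
            and implies(q, u and s) and (not t))

rows = list(product([False, True], repeat=6))

# Argument (*): the four premises imply q -> p in every row.
star_valid = all(
    implies(shared_premises(p, q, r, s, t, u), implies(q, p))
    for p, q, r, s, t, u in rows)

# Argument (**): the four premises together with q imply p in every row.
double_star_valid = all(
    implies(shared_premises(p, q, r, s, t, u) and q, p)
    for p, q, r, s, t, u in rows)

print(star_valid, double_star_valid)   # True True
```

The two checks agree row by row, which is what the equivalence $[p \to (q \to r)] \Leftrightarrow [(p \land q) \to r]$ leads us to expect.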

Examples 2.29 through 2.33 have given us some idea of how to establish the validity
                            of an argument. Following Example 2.25 we discussed two situations indicating when an
                            argument is invalid — namely, when we try to argue by the converse or the inverse. So now
                            it is time for us to learn a little more about how to determine when an argument is invalid.
                                 Given an argument

                   $p_1$
                   $p_2$
                   $p_3$
                   $\vdots$
                   $p_n$
                   $\therefore q$

                            we say that the argument is invalid if it is possible for each of the premises $p_1, p_2, p_3, \ldots, p_n$
                            to be true (with truth value 1), while the conclusion $q$ is false (with truth value 0).
                                The next example illustrates an indirect method whereby we may be able to show that
                            an argument we feel is invalid (perhaps because we cannot find a way to show that it is
                            valid) actually is invalid.

EXAMPLE 2.34
Consider the primitive statements $p$, $q$, $r$, $s$, and $t$ and the argument

                   $p$
                   $p \lor q$
                   $q \to (r \to s)$
                   $t \to r$
                   $\therefore \neg s \to \neg t$

To show that this is an invalid argument, we need one assignment of truth values for each
                            of the statements $p$, $q$, $r$, $s$, and $t$ such that the conclusion $\neg s \to \neg t$ is false (has the truth
                            value 0) while the four premises are all true (have the truth value 1). The only time the

conclusion —s — —f is false is when —s is true and —t is false. This implies that the truth
               value for s is 0 and that the truth value for ¢ is 1.
                   Because p is one of the premises, its truth value must be 1. For the premise p Vv q to
               have the truth value 1, g may be either true (1) or false (0). So let us consider the premise
               t —> r where we know that ¢ is true. If f > r is to be true, then r must be true (have the
               truth value |). Now with rs true (1) and s false (0), it follows that r — s is false (0), and that
               the truth value of the premise g — (r > s) will be 1 only when g is false (0).
    Consequently, under the truth value assignments

          p: 1     q: 0     r: 1     s: 0     t: 1,

the four premises

          p     p ∨ q     q → (r → s)     t → r

all have the truth value 1, while the conclusion

          ¬s → ¬t

has the truth value 0. In this case we have shown the given argument to be invalid.
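
The search carried out by hand in Example 2.34 can also be done mechanically. The following is a minimal sketch in Python (the helper name `implies` and the list `counterexamples` are our own illustrative choices, not part of the text): it tries all 32 truth value assignments for p, q, r, s, t and records those for which the four premises are true while the conclusion is false.

```python
from itertools import product

def implies(a, b):
    """Material implication a -> b for Boolean values."""
    return (not a) or b

counterexamples = []
for p, q, r, s, t in product([True, False], repeat=5):
    premises = [p, p or q, implies(q, implies(r, s)), implies(t, r)]
    conclusion = implies(not s, not t)
    # A counterexample makes every premise true and the conclusion false.
    if all(premises) and not conclusion:
        counterexamples.append((int(p), int(q), int(r), int(s), int(t)))

print(counterexamples)  # [(1, 0, 1, 0, 1)]
```

The single assignment reported agrees with the one found above, and the exhaustive search shows in addition that it is the only counterexample for this argument.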

The truth value assignments p: 1, q: 0, r: 1, s: 0, and t: 1 of Example 2.34 provide one
case that disproves what we thought might have been a valid argument. We should now
start to realize that in trying to show that an implication of the form

          (p₁ ∧ p₂ ∧ p₃ ∧ ··· ∧ pₙ) → q

presents a valid argument, we need to consider all cases where the premises p₁, p₂, p₃, ...,
pₙ are true. [Each such case is an assignment of truth values for the primitive statements
(that make up the premises) where p₁, p₂, p₃, ..., pₙ are true.] In order to do so (namely,
to cover the cases without writing out the truth table) we have been using the rules of
inference together with the laws of logic and other logical equivalences. To cover all the
necessary cases, we cannot use one specific example (or case) as a means of establishing
the validity of the argument (for all possible cases). However, whenever we wish to show
that an implication (of the preceding form) is not a tautology, all we need to find is one
case for which the implication is false; that is, one case in which all the premises are true
but the conclusion is false. This one case provides a counterexample for the argument and
shows it to be invalid.

Let us consider a second example wherein we try the indirect approach of Example 2.34.

EXAMPLE 2.35
What can we say about the validity or invalidity of the following argument? (Here p, q, r,
and s denote primitive statements.)

          p → q
          q → s
          r → ¬s
          ¬p ⊻ r
          ∴ ¬p

Can the conclusion ¬p be false while the four premises are all true? The conclusion ¬p
is false when p has the truth value 1. So for the premise p → q to be true, the truth value
of q must be 1. From the truth of the premise q → s, the truth of q forces the truth of
s. Consequently, at this point we have statements p, q, and s all with the truth value 1.
Continuing with the premise r → ¬s, we find that because s has the truth value 1, the truth
value of r must be 0. Hence r is false. But with ¬p false and the premise ¬p ⊻ r true, we
also have r true. Therefore we find that p ⇒ (¬r ∧ r).
    We have failed in our attempt to find a counterexample to the validity of the given
argument. However, this failure has shown us that the given argument is valid, and the
validity follows by using the method of Proof by Contradiction.
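
The failed search of Example 2.35 can be confirmed by exhausting all cases. The sketch below assumes a small helper, `find_counterexample` (a hypothetical name of our own), that returns the first assignment making every premise true and the conclusion false, or `None` when no such assignment exists. The connective in the fourth premise is read here as the exclusive or; the inclusive reading leads to the same outcome.

```python
from itertools import product

def implies(a, b):
    """Material implication a -> b for Boolean values."""
    return (not a) or b

def find_counterexample(names, premises, conclusion):
    """Return an assignment (dict) with all premises true and the
    conclusion false, or None if the argument is valid."""
    for values in product([True, False], repeat=len(names)):
        env = dict(zip(names, values))
        if all(premise(env) for premise in premises) and not conclusion(env):
            return env
    return None

premises = [
    lambda e: implies(e["p"], e["q"]),      # p -> q
    lambda e: implies(e["q"], e["s"]),      # q -> s
    lambda e: implies(e["r"], not e["s"]),  # r -> not s
    lambda e: (not e["p"]) != e["r"],       # (not p) exclusive-or r
]
conclusion = lambda e: not e["p"]           # not p

result = find_counterexample(["p", "q", "r", "s"], premises, conclusion)
print(result)  # None
```

A result of `None` means that none of the 16 assignments is a counterexample, so the argument is valid. This is a truth table check in disguise; it does not replace the proof by contradiction given above, which explains why no counterexample can exist.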

This introduction to the rules of inference has been far from exhaustive. Several of the
                               books cited among the references listed near the end of this chapter offer additional material
                               for the reader who wishes to pursue this topic further. In Section 2.5 we shall apply the ideas
                               developed in this section to statements of a more mathematical nature. For we shall want to
                               learn how to develop a proof for a theorem. And then in Chapter 4 another very important
                               proof technique called mathematical induction will be added to our arsenal of weapons for
                               proving mathematical theorems. First, however, the reader should carefully complete the
                               exercises for this section.

 1. The following are three valid arguments. Establish the validity of each by means of a
truth table. In each case, determine which rows of the table are crucial for assessing the
validity of the argument and which rows can be ignored.
     a) [p ∧ (p → q) ∧ r] → [(p ∨ q) → r]
     b) [[(p ∧ q) → r] ∧ ¬q ∧ (p → ¬r)] → (¬p ∨ ¬q)
     c) [[p ∨ (q ∨ r)] ∧ ¬q] → (p ∨ r)
 2. Use truth tables to verify that each of the following is a logical implication.
     a) [(p → q) ∧ (q → r)] → (p → r)
     b) [(p → q) ∧ ¬q] → ¬p
     c) [(p ∨ q) ∧ ¬p] → q
     d) [(p → r) ∧ (q → r)] → [(p ∨ q) → r]
 3. Verify that each of the following is a logical implication by showing that it is
impossible for the conclusion to have the truth value 0 while the hypothesis has the truth
value 1.
     a) (p ∧ q) → p
     b) p → (p ∨ q)
     c) [(p ∨ q) ∧ ¬p] → q
     d) [(p → q) ∧ (r → s) ∧ (p ∨ r)] → (q ∨ s)
     e) [(p → q) ∧ (r → s) ∧ (¬q ∨ ¬s)] → (¬p ∨ ¬r)
 4. For each of the following pairs of statements, use Modus Ponens or Modus Tollens to
fill in the blank line so that a valid argument is presented.
     a) If Janice has trouble starting her car, then her daughter Angela will check
     Janice’s spark plugs.
     Janice had trouble starting her car.
     ∴ ______
     b) If Brady solved the first problem correctly, then the answer he obtained is 137.
     Brady’s answer to the first problem is not 137.
     ∴ ______
     c) If this is a repeat-until loop, then the body of this loop is executed at least once.
     ______
     ∴ The body of the loop is executed at least once.
     d) If Tim plays basketball in the afternoon, then he will not watch television in the
     evening.
     ______
     ∴ Tim didn’t play basketball in the afternoon.
 5. Consider each of the following arguments. If the argument is valid, identify the rule of
inference that establishes its validity. If not, indicate whether the error is due to an attempt
to argue by the converse or by the inverse.
     a) Andrea can program in C++, and she can program in Java.
     Therefore Andrea can program in C++.
     b) A sufficient condition for Bubbles to win the golf tournament is that her opponent
     Meg not sink a birdie on the last hole.
     Bubbles won the golf tournament.
     Therefore Bubbles’ opponent Meg did not sink a birdie on the last hole.
     c) If Ron’s computer program is correct, then he’ll be able to complete his computer
     science assignment in at most two hours.
     It takes Ron over two hours to complete his computer science assignment.
     Therefore Ron’s computer program is not correct.
     d) Eileen’s car keys are in her purse, or they are on the kitchen table.
     Eileen’s car keys are not on the kitchen table.
     Therefore Eileen’s car keys are in her purse.
     e) If interest rates fall, then the stock market will rise.
     Interest rates are not falling.
     Therefore the stock market will not rise.
 6. For primitive statements p, q, and r, let P denote the statement

          [p ∧ (q ∧ r)] ∨ ¬[p ∨ (q ∧ r)],

while P₁ denotes the statement

          [p ∧ (q ∨ r)] ∨ ¬[p ∨ (q ∨ r)].

     a) Use the rules of inference to show that q ∧ r ⇒ q ∨ r.
     b) Is it true that P ⇒ P₁?
 7. Give the reason(s) for each step needed to show that the following argument is valid.

          [p ∧ (p → q) ∧ (s ∨ r) ∧ (r → ¬q)] → (s ∨ t)

     Steps               Reasons
     1)  p
     2)  p → q
     3)  q
     4)  r → ¬q
     5)  q → ¬r
     6)  ¬r
     7)  s ∨ r
     8)  s
     9)  ∴ s ∨ t
 8. Give the reasons for the steps verifying the following argument.

          (¬p ∨ q) → r
          r → (s ∨ t)
          ¬s ∧ ¬u
          ¬u → ¬t
          ∴ p

     Steps                       Reasons
      1)  ¬s ∧ ¬u
      2)  ¬u
      3)  ¬u → ¬t
      4)  ¬t
      5)  ¬s
      6)  ¬s ∧ ¬t
      7)  r → (s ∨ t)
      8)  ¬(s ∨ t) → ¬r
      9)  (¬s ∧ ¬t) → ¬r
     10)  ¬r
     11)  (¬p ∨ q) → r
     12)  ¬r → ¬(¬p ∨ q)
     13)  ¬r → (p ∧ ¬q)
     14)  p ∧ ¬q
     15)  ∴ p
 9. a) Give the reasons for the steps given to validate the argument

          [(p → q) ∧ (¬r ∨ s) ∧ (p ∨ r)] → (¬q → s).

     Steps                       Reasons
      1)  ¬(¬q → s)
      2)  ¬q ∧ ¬s
      3)  ¬s
      4)  ¬r ∨ s
      5)  ¬r
      6)  p → q
      7)  ¬q
      8)  ¬p
      9)  p ∨ r
     10)  r
     11)  ¬r ∧ r
     12)  ∴ ¬q → s
     b) Give a direct proof for the result in part (a).
     c) Give a direct proof for the result in Example 2.32.
10. Establish the validity of the following arguments.
     a) [(p ∧ ¬q) ∧ r] → [(p ∧ r) ∨ q]
     b) [p ∧ (p → q) ∧ (¬q ∨ r)] → r
     c)   p → q                      d)   p → q
          ¬q                              r → ¬q
          ¬r                              r
          ∴ ¬(p ∨ r)                      ∴ ¬p
     e)   p → (q → r)                f)   p ∧ q
          ¬q → ¬p                         p → (r ∧ q)
          p                               r → (s ∨ t)
          ∴ r                             ¬s
                                          ∴ t
     g)   p → (q → r)                h)   p ∨ q
          p ∨ s                           ¬p ∨ r
          t → q                           ¬r
          ¬s                              ∴ q
          ∴ ¬r → ¬t
11. Show that each of the following arguments is invalid by providing a counterexample;
that is, an assignment of truth values for the given primitive statements p, q, r, and s such
that all premises are true (have the truth value 1) while the conclusion is false (has the
truth value 0).
     a) [(p ∧ ¬q) ∧ [p → (q → r)]] → ¬r
     b) [[(p ∧ q) → r] ∧ (¬q ∨ r)] → p
     c)   p ↔ q                      d)   p
          q → r                           p → r
          r ∨ ¬s                          p → (q ∨ ¬r)
          ¬s → q                          ¬q ∨ ¬s
          ∴ s                             ∴ s
12. Write each of the following arguments in symbolic form. Then establish the validity of
the argument or give a counterexample to show that it is invalid.
     a) If Rochelle gets the supervisor’s position and works hard, then she’ll get a raise.
     If she gets the raise, then she’ll buy a new car. She has not purchased a new car.
     Therefore either Rochelle did not get the supervisor’s position or she did not work
     hard.
     b) If Dominic goes to the racetrack, then Helen will be mad. If Ralph plays cards all
     night, then Carmela will be mad. If either Helen or Carmela gets mad, then Veronica
     (their attorney) will be notified. Veronica has not heard from either of these two
     clients. Consequently, Dominic didn’t make it to the racetrack and Ralph didn’t play
     cards all night.
     c) If there is a chance of rain or her red headband is missing, then Lois will not mow
     her lawn. Whenever the temperature is over 80°F, there is no chance for rain. Today
     the temperature is 85°F and Lois is wearing her red headband. Therefore (sometime
     today) Lois will mow her lawn.
13. a) Given primitive statements p, q, r, show that the implication

          [(p ∨ q) ∧ (¬p ∨ r)] → (q ∨ r)

     is a tautology.
     b) The tautology in part (a) provides the rule of inference known as resolution, where
     the conclusion (q ∨ r) is called the resolvent. This rule was proposed in 1965 by
     J. A. Robinson and is the basis of many computer programs designed to automate a
     reasoning system.
         In applying resolution each premise (in the hypothesis) and the conclusion are
     written as clauses. A clause is a primitive statement or its negation, or it is the
     disjunction of terms each of which is a primitive statement or the negation of such a
     statement. Hence the given rule has the clauses (p ∨ q) and (¬p ∨ r) as premises and
     the clause (q ∨ r) as its conclusion (or, resolvent). Should we have the premise
     ¬(p ∧ q), we replace this by the logically equivalent clause ¬p ∨ ¬q, by the first of
     DeMorgan’s Laws. The premise ¬(p ∨ q) can be replaced by the two clauses ¬p, ¬q.
     This is due to the second DeMorgan Law and the Rule of Conjunctive Simplification.
     For the premise p ∨ (q ∧ r), we apply the Distributive Law of ∨ over ∧ and the Rule
     of Conjunctive Simplification to arrive at either of the two clauses p ∨ q, p ∨ r.
     Finally, the premise p → q becomes the clause ¬p ∨ q.
         Establish the validity of the following arguments, using resolution (along with
     the rules of inference and the laws of logic).
       (i)   p ∨ (q ∧ r)            (ii)   p
             p → s                         p ↔ q
             ∴ r ∨ s                       ∴ q
     (iii)   p ∨ q                  (iv)   ¬p ∨ q ∨ r
             p → r                         ¬q
             r → s                         ¬r
             ∴ q ∨ s                       ∴ ¬p
       (v)   ¬p ∨ s
             ¬t ∨ (s ∧ r)
             ¬q ∨ r
             p ∨ q ∨ t
             ∴ r ∨ s
     c) Write the following argument in symbolic form, then use resolution (along with
     the rules of inference and the laws of logic) to establish its validity.
         Jonathan does not have his driver’s license or his new car is out of gas. Jonathan
     has his driver’s license or he does not like to drive his new car. Jonathan’s new car is
     not out of gas or he does not like to drive his new car. Therefore, Jonathan does not
     like to drive his new car.
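
The resolution rule described in Exercise 13 lends itself to a short program. The following is only a sketch under our own assumptions: a clause is represented as a `frozenset` of literals, each literal is a pair (name, polarity), and `resolve` is a hypothetical helper name. The last lines check, by exhausting the eight cases, that the rule itself is a tautology.

```python
from itertools import product

def resolve(c1, c2):
    """Return the resolvents of two clauses.

    A clause is a frozenset of literals; a literal is (name, polarity),
    so ("p", True) stands for p and ("p", False) stands for its negation.
    """
    resolvents = []
    for name, positive in c1:
        if (name, not positive) in c2:
            # Drop the complementary pair and merge what remains.
            merged = (c1 - {(name, positive)}) | (c2 - {(name, not positive)})
            resolvents.append(frozenset(merged))
    return resolvents

c1 = frozenset({("p", True), ("q", True)})    # p or q
c2 = frozenset({("p", False), ("r", True)})   # (not p) or r
resolvents = resolve(c1, c2)                  # one resolvent: q or r

def rule_holds(p, q, r):
    """[(p or q) and ((not p) or r)] -> (q or r) for one assignment."""
    hypothesis = (p or q) and ((not p) or r)
    return (not hypothesis) or (q or r)

is_tautology = all(rule_holds(*v) for v in product([True, False], repeat=3))
print(is_tautology)  # True
```

Automated reasoning systems based on resolution apply a step like `resolve` repeatedly to a growing set of clauses; the sketch shows a single step only.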

2.4
               The Use of Quantifiers
                                In Section 2.1, we mentioned how sentences that involve a variable, such as x, need not
                                be statements. For example, the sentence “The number x + 2 is an even integer” is not
                                necessarily true or false unless we know what value is substituted for x. If we restrict our
                                choices to integers, then when x is replaced by −5, −1, or 3, for instance, the resulting
                                statement is false. In fact, it is false whenever x is replaced by an odd integer. When an
                                even integer is substituted for x, however, the resulting statement is true.
                                   We refer to the sentence “The number x + 2 is an even integer” as an open statement,
                                which we formally define as follows.

Definition 2.5            A declarative sentence is an open statement if

      1) it contains one or more variables, and
      2) it is not a statement, but
      3) it becomes a statement when the variables in it are replaced by certain allowable
         choices.

When we examine the sentence “The number x + 2 is an even integer” in light of
this definition, we find it is an open statement that contains the single variable x. With
regard to the third element of the definition, in our earlier discussion we restricted the
“certain allowable choices” to integers. These allowable choices constitute what is called
the universe or universe of discourse for the open statement. The universe comprises the
choices we wish to consider or allow for the variable(s) in the open statement. (The universe
is an example of a set, a concept we shall examine in some detail in the next chapter.)
    In dealing with open statements, we use the following notation:
    The open statement “The number x + 2 is an even integer” is denoted by p(x) [or q(x),
r(x), etc.]. Then ¬p(x) may be read “The number x + 2 is not an even integer.”
    We shall use q(x, y) to represent an open statement that contains two variables. For
example, consider

          q(x, y):   The numbers y + 2, x − y, and x + 2y are even integers.

In the case of q(x, y), there is more than one occurrence of each of the variables x, y. It is
understood that when we replace one of the x’s by a choice from our universe, we replace
the other x by the same choice. Likewise, when a substitution (from the universe) is made
for one occurrence of y, that same substitution is made for all other occurrences of the
variable y.
    With p(x) and q(x, y) as above, and the universe still stipulating the integers as our only
allowable choices, we get the following results when we make some replacements for the
variables x, y.

          p(5):      The number 7 (= 5 + 2) is an even integer. (FALSE)
          ¬p(7):     The number 9 is not an even integer. (TRUE)
          q(4, 2):   The numbers 4, 2, and 8 are even integers. (TRUE)

We also note, for example, that q(5, 2) and q(4, 7) are both false statements, whereas
¬q(5, 2) and ¬q(4, 7) are true.
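
An open statement behaves much like a Boolean-valued function: it has no truth value until its variables are replaced by members of the universe. As a minimal sketch (the function names `is_even`, `p`, and `q` are our own choices), the replacements above can be checked in Python:

```python
def is_even(n):
    """True when the integer n is even (this also works for negative n)."""
    return n % 2 == 0

def p(x):
    """p(x): The number x + 2 is an even integer."""
    return is_even(x + 2)

def q(x, y):
    """q(x, y): The numbers y + 2, x - y, and x + 2y are even integers."""
    return is_even(y + 2) and is_even(x - y) and is_even(x + 2 * y)

print(p(5))              # False
print(not p(7))          # True
print(q(4, 2))           # True
print(q(5, 2), q(4, 7))  # False False
```

Each call replaces every occurrence of a variable by the same value, which mirrors the substitution convention described above.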
    Consequently, we see that for both p(x) and q(x, y), as already given, some substitutions
result in true statements and others in false statements. Therefore we can make the following
true statements.

1)          For some x, p(x).

2)          For some x, y, q(x, y).

Note that in this situation, the statements “For some x, ¬p(x)” and “For some x, y,
¬q(x, y)” are also true. [Since the statements “For some x, p(x)” and “For some x, ¬p(x)”
are both true, we realize that the second statement is not the negation of the first, even
though the open statement ¬p(x) is the negation of the open statement p(x). And a similar
result is true for the statements involving q(x, y) and ¬q(x, y).]
    The phrases “For some x” and “For some x, y” are said to quantify the open statements
p(x) and q(x, y), respectively. Many postulates, definitions, and theorems in mathematics
involve statements that are quantified open statements. These result from the two types of
quantifiers, which are called the existential and the universal quantifiers.
    Statement (1) uses the existential quantifier “For some x,” which can also be expressed
as “For at least one x” or “There exists an x such that.” This quantifier is written in symbolic
form as ∃x. Hence the statement “For some x, p(x)” becomes ∃x p(x), in symbolic form.
    Statement (2) becomes ∃x ∃y q(x, y) in symbolic form. The notation ∃x, y can be used
to abbreviate ∃x ∃y q(x, y) to ∃x, y q(x, y).
    The universal quantifier is denoted by ∀x and is read “For all x,” “For any x,” “For each
x,” or “For every x.” “For all x, y,” “For any x, y,” “For every x, y,” or “For all x and y”
is denoted by ∀x ∀y, which can be abbreviated to ∀x, y.
    Taking p(x) as defined earlier and using the universal quantifier, we can change the open
statement p(x) into the (quantified) statement ∀x p(x), a false statement.
    If we consider the open statement r(x): “2x is an even integer” with the same universe
(of all integers), then the (quantified) statement ∀x r(x) is a true statement. When we say
that ∀x r(x) is true, we mean that no matter which integer (from our universe) is substituted
for x in r(x), the resulting statement is true. Also note that the statement ∃x r(x) is a true
statement, whereas ∀x ¬r(x) and ∃x ¬r(x) are both false.
    The variable x in each of the open statements p(x) and r(x) is called a free variable (of
the open statement). As x varies over the universe for an open statement, the truth value
of the statement (that results upon the replacement of each occurrence of x) may vary.
For instance, in the case of p(x), we found p(5) to be false, while p(6) turns out to be
a true statement. The open statement r(x), however, becomes a true statement for every
replacement (for x) taken from the universe of all integers. In contrast to the open statement
p(x), the statement ∃x p(x) has a fixed truth value, namely, true. And in the symbolic
representation ∃x p(x) the variable x is said to be a bound variable: it is bound by the
existential quantifier ∃. This is also the case for the statements ∀x r(x) and ∀x ¬r(x), where
in each case the variable x is bound by the universal quantifier ∀.
    For the open statement q(x, y) we have two free variables, each of which is bound by
the quantifier ∃ in either of the statements ∃x ∃y q(x, y) or ∃x, y q(x, y).
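
Over a finite collection, the two quantifiers correspond to Python's built-in functions `any` (existential) and `all` (universal). The universe of all integers is infinite, so the sketch below assumes a finite sample, `range(-100, 101)`, chosen purely for illustration. A witness found in the sample settles an existential statement, and a counterexample found in the sample settles a universal one; a universal statement that merely holds on the sample is supported by the evidence but not proved.

```python
def p(x):
    """p(x): The number x + 2 is an even integer."""
    return (x + 2) % 2 == 0

def r(x):
    """r(x): 2x is an even integer."""
    return (2 * x) % 2 == 0

sample = range(-100, 101)  # a finite sample of the universe of integers

exists_p = any(p(x) for x in sample)          # True: a witness exists, so "for some x, p(x)" is true
forall_p = all(p(x) for x in sample)          # False: x = 5 is a counterexample, so "for all x, p(x)" is false
forall_r = all(r(x) for x in sample)          # True on the sample only; this is not a proof
exists_not_r = any(not r(x) for x in sample)  # False on the sample only

print(exists_p, forall_p, forall_r, exists_not_r)  # True False True False
```

Note that inside `any(p(x) for x in sample)` the name x is bound by the generator expression, just as x is bound by the quantifier in the symbolic form; the expression as a whole has a fixed truth value.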

The following example shows how these new ideas about quantifiers can be used in
                       conjunction with the logical connectives.

EXAMPLE 2.36
Here the universe comprises all real numbers. The open statements p(x), q(x), r(x), and
s(x) are given by

          p(x):   x ≥ 0            r(x):   x² − 3x − 4 = 0
          q(x):   x² ≥ 0           s(x):   x² − 3 > 0.
                           Then the following statements are true.

1)          ∃x [p(x) ∧ r(x)]

This follows because the real number 4, for example, is a member of the universe and is
such that both of the statements p(4) and r(4) are true.

2)          ∀x [p(x) → q(x)]

    If we replace x in p(x) by a negative real number a, then p(a) is false, but p(a) → q(a)
is true regardless of the truth value of q(a). Replacing x in p(x) by a nonnegative real
number b, we find that p(b) and q(b) are both true, as is p(b) → q(b). Consequently,
p(x) → q(x) is true for all replacements x taken from the universe of all real numbers, and
the (quantified) statement ∀x [p(x) → q(x)] is true.
    This statement may be translated into any of the following:

   a) For every real number x, if x ≥ 0, then x² ≥ 0.
   b) Every nonnegative real number has a nonnegative square.
   c) The square of any nonnegative real number is a nonnegative real number.
   d) All nonnegative real numbers have nonnegative squares.

Also, the statement ∃x [p(x) → q(x)] is true.

The next statements we examine are false.

1′)          ∀x [q(x) → s(x)]

We want to show that the statement is false, so we need exhibit only one counterexample
(that is, one value of x for which q(x) → s(x) is false) rather than prove something
for all x as we did for statement (2). Replacing x by 1, we find that q(1) is true and
s(1) is false. Therefore q(1) → s(1) is false, and consequently the (quantified) statement
∀x [q(x) → s(x)] is false. [Note that x = 1 does not produce the only counterexample:
Every real number a between −√3 and √3 will make q(a) true and s(a) false.]

2′)          ∀x [r(x) ∨ s(x)]

Here there are many values for x, such as 1, 1/2, −3/2, and 0, that produce counterexamples.
Upon changing quantifiers, however, we find that the statement ∃x [r(x) ∨ s(x)] is true.

3′)          ∀x [r(x) → p(x)]

The real number −1 is a solution of the equation x² − 3x − 4 = 0, so r(−1) is true while
p(−1) is false. Therefore the choice of −1 provides the unique counterexample we need
to show that this (quantified) statement is false.
   Statement (3′) may be translated into either of the following:
   a) For every real number x, if x² − 3x − 4 = 0, then x ≥ 0.
   b) For every real number x, if x is a solution of the equation x² − 3x − 4 = 0, then
      x ≥ 0.
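
The claims of Example 2.36 can be explored numerically. The sketch below assumes a finite sample of the universe (the multiples of 1/2 from −10 to 10, stored as exact `Fraction` values); the sample and the list names are our own illustrative choices. Witnesses and counterexamples found in the sample are conclusive, whereas a universal statement that holds on the sample is only supported, not proved.

```python
from fractions import Fraction

# Finite sample of the real numbers: -10, -19/2, ..., 19/2, 10 (exact arithmetic).
sample = [Fraction(k, 2) for k in range(-20, 21)]

def p(x): return x >= 0
def q(x): return x * x >= 0
def r(x): return x * x - 3 * x - 4 == 0
def s(x): return x * x - 3 > 0

def implies(a, b):
    """Material implication a -> b for Boolean values."""
    return (not a) or b

# Statement (1): witnesses for "there exists x with p(x) and r(x)".
witnesses_1 = [x for x in sample if p(x) and r(x)]

# Statement (2): p(x) -> q(x) holds at every sampled point.
holds_2 = all(implies(p(x), q(x)) for x in sample)

# Counterexamples to the three false universal statements.
cex_1 = [x for x in sample if not implies(q(x), s(x))]  # q(x) -> s(x) fails
cex_2 = [x for x in sample if not (r(x) or s(x))]       # r(x) or s(x) fails
cex_3 = [x for x in sample if not implies(r(x), p(x))]  # r(x) -> p(x) fails

print(witnesses_1)  # [Fraction(4, 1)]
print(cex_3)        # [Fraction(-1, 1)]
```

In this sample `cex_1` consists of the seven multiples of 1/2 lying between −√3 and √3, and `cex_2` is the same list with −1 removed, because r(−1) is true.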

Now we make the following observations. Let p(x) denote any open statement (in the
variable x) with a prescribed nonempty universe (that is, the universe contains at least one
member). Then if ∀x p(x) is true, so is ∃x p(x), or

          ∀x p(x) ⇒ ∃x p(x).

When we write ∀x p(x) ⇒ ∃x p(x) we are saying that the implication ∀x p(x) →
∃x p(x) is a logical implication; that is, ∃x p(x) is true whenever ∀x p(x) is true. Also,
we realize that the hypothesis of this implication is the quantified statement ∀x p(x), and
the conclusion is ∃x p(x), another quantified statement. On the other hand, it does not
follow that if ∃x p(x) is true, then ∀x p(x) must be true. Hence ∃x p(x) does not logically
imply ∀x p(x), in general.
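
The requirement that the universe be nonempty is essential. A small sketch with Python's `all` and `any` (the open statement "x > 0" and the two lists are our own illustrative choices) shows what goes wrong for an empty universe: the universal statement is vacuously true, while the existential statement is false because there is no witness.

```python
def p(x):
    """An illustrative open statement: x is positive."""
    return x > 0

nonempty_universe = [1, 2, 3]
empty_universe = []

# Nonempty universe: the universal statement holds, and so does the existential one.
forall_nonempty = all(p(x) for x in nonempty_universe)  # True
exists_nonempty = any(p(x) for x in nonempty_universe)  # True

# Empty universe: "for all" is vacuously true, but "there exists" is false.
forall_empty = all(p(x) for x in empty_universe)        # True
exists_empty = any(p(x) for x in empty_universe)        # False

print(forall_nonempty, exists_nonempty, forall_empty, exists_empty)
# True True True False
```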

Our next example brings out the fact that the quantification of an open statement may
               not be as explicit as we might prefer.

EXAMPLE 2.37
a) Let us consider the universe of all real numbers and examine the sentences:
     1) If a number is rational, then it is a real number.
     2) If x is rational, then x is real.
   We should agree that these sentences convey the same information. But we should
   also question whether the sentences are statements or open statements. In the case
   of sentence (2) we at least have the presence of the variable x. But neither sentence
   contains an expression such as “For all,” or “For every,” or “For each.” Our one and
   only clue to indicate that we are dealing with universally quantified statements here is
   the presence of the indefinite article “a” in the first sentence. In situations like these
   the use of the universal quantifier is implicit as opposed to explicit.
       If we let p(x), q(x) be the open statements

p(x):   x 1s arational number           q(x):      x is areal number,

then we must recognize the fact that both of the given sentences are somewhat informal
                           ways of expressing the quantified statement

Vx [p(x) > q(x)].
                        b) For the universe of all triangles in the plane, the sentence
                                  “An equilateral triangle has three angles of 60°, and conversely.”

provides another instance of implicit quantification. Here the indefinite article “An” is
                           the only indication that we might be able to express this sentence as a statement with
                           a universal quantifier. If the open statements

e(t):   Triangle t is equilateral.

a(t):   Triangle t has three angles of 60°.

are defined for this universe, then the given sentence can be written in the explicit
                           quantified form

∀t [e(t) ↔ a(t)].

c) In the typical trigonometry textbook one often comes across the trigonometric identity

sin² x + cos² x = 1.

This identity contains no explicit quantification, and the reader must understand or be
                           told that it is defined for all real numbers x. When the universe of all real numbers is
                           specified (or at least understood), then the identity can be expressed by the (explicitly)
                           quantified statement

∀x [sin² x + cos² x = 1].

d) Finally, consider the universe of all positive integers and the sentence

“The integer 41 is equal to the sum of two perfect squares.”

Here we have one more example where the quantification is implicit, but this
                           time the quantification is existential. We may express the result here in a more formal
                           (and symbolic) manner as

∃m ∃n [41 = m² + n²].
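
An existential statement such as the one in part (d) is established by producing a witness. The following Python sketch searches for witnesses among the positive integers; the function name two_square_witnesses is an assumption made for this illustration.

```python
def two_square_witnesses(target):
    """Return all pairs (m, n) of positive integers with m <= n and m*m + n*n == target."""
    witnesses = []
    m = 1
    while 2 * m * m <= target:            # m <= n forces 2*m*m <= target
        n = m
        while m * m + n * n < target:     # advance n until the sum reaches the target
            n += 1
        if m * m + n * n == target:
            witnesses.append((m, n))
        m += 1
    return witnesses


print(two_square_witnesses(41))   # [(4, 5)]
```

A single pair is enough to make the existential statement true; here 41 = 4² + 5².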

The next example demonstrates that the truth value of a quantified statement may depend
                      on the universe prescribed.

EXAMPLE 2.38
Consider the open statement p(x):   x² ≥ 1.

1) If the universe consists of all positive integers, then the quantified statement ∀x p(x)
                       is true.
                    2) For the universe of all positive real numbers, however, the same quantified statement
                       ∀x p(x) is false. The positive real number 1/2 provides one of many possible
                       counterexamples.

Yet for either universe, the quantified statement ∃x p(x) is true.
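
The dependence on the universe can be illustrated with a short Python sketch. Both universes here are infinite, so the code works with finite samples chosen only for illustration (an assumption of this sketch): a sample can exhibit a counterexample to a universal statement, but it cannot prove one.

```python
from fractions import Fraction


def p(x):
    """Open statement p(x): the square of x is at least 1."""
    return x * x >= 1


# Hypothetical finite samples standing in for the two infinite universes.
positive_integers = range(1, 1001)
positive_reals = [Fraction(k, 4) for k in range(1, 41)]   # 1/4, 1/2, ..., 10

# No counterexample appears among the sampled positive integers.
print(all(p(x) for x in positive_integers))               # True

# Among the sampled positive reals, counterexamples such as 1/2 appear.
counterexamples = [x for x in positive_reals if not p(x)]
print(Fraction(1, 2) in counterexamples)                  # True

# In either sample some member satisfies p, so the existential statement holds.
print(any(p(x) for x in positive_integers), any(p(x) for x in positive_reals))   # True True
```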

One use of quantifiers in a computer science setting is illustrated in the following
                 example.

EXAMPLE 2.39
In the following program segment, n is an integer variable and the variable A is an array
                 A[1], A[2], ..., A[20] of 20 integer values.

                 for n := 1 to 20 do
                     A[n] := n * n - n

The following statements about the array A can be represented in quantified form, where
                 the universe consists of all integers from 1 to 20, inclusive.
                    1) Every entry in the array is nonnegative:

∀n (A[n] ≥ 0).

2) There exist two consecutive entries in A where the larger entry is twice the smaller:

∃n (A[n + 1] = 2A[n]).

3) The entries in the array are sorted in (strictly) ascending order:

∀n [(1 ≤ n ≤ 19) → (A[n] < A[n + 1])].

Our last statement requires the use of two integer variables m, n.
                    4) The entries in the array are distinct:

∀m ∀n [(m ≠ n) → (A[m] ≠ A[n])],   or
                                             ∀m, n [(m < n) → (A[m] ≠ A[n])].
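
Because the universe here is finite, each of the four quantified statements can be evaluated directly. The sketch below rebuilds the array in Python; the padding entry at index 0 and the variable names are assumptions made so that the 1-based indexing of the text carries over.

```python
# Build A[1..20] as in the program segment. Index 0 is unused padding so that
# Python's 0-based list lines up with the 1-based array of the text.
A = [None] + [n * n - n for n in range(1, 21)]

universe = range(1, 21)

# 1) Every entry in the array is nonnegative.
all_nonnegative = all(A[n] >= 0 for n in universe)

# 2) Some consecutive pair has the larger entry equal to twice the smaller.
#    n stops at 19 so that A[n + 1] is defined.
doubling_witnesses = [n for n in range(1, 20) if A[n + 1] == 2 * A[n]]

# 3) The entries are in strictly ascending order.
strictly_ascending = all(A[n] < A[n + 1] for n in range(1, 20))

# 4) The entries are distinct (the m < n form of the statement).
distinct = all(A[m] != A[n] for m in universe for n in universe if m < n)

print(all_nonnegative, doubling_witnesses, strictly_ascending, distinct)
# True [3] True True
```

The witness n = 3 corresponds to A[3] = 6 and A[4] = 12.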

Before continuing, we summarize and somewhat extend, in Table 2.21, what we have
                 learned about quantifiers.
                    The results in Table 2.21 may appear to involve only one open statement. However, we
                 should realize that the open statement p(x) in the table may stand for a conjunction of open
                 statements, such as q(x) ∧ r(x), or an implication of open statements, such as s(x) → t(x).
                 If, for example, we want to know when the statement ∃x [s(x) → t(x)] is true, then we
                 look at the table for ∃x p(x) and use the information provided there. The table tells us that
                 ∃x [s(x) → t(x)] is true when s(a) → t(a) is true for some (at least one) a in the prescribed
                 universe.
                    We will look further into quantified statements involving more than one open statement.
                 Before doing so, however, we need to examine the following definition. This definition is
                 comparable to Definitions 2.2 and 2.4 where we defined the ideas of logically equivalent
                 statements and logical implication. It settles the same types of questions for open statements.

Table 2.21

Statement     When Is It True?                         When Is It False?

∃x p(x)       For some (at least one) a in the         For every a in the universe,
              universe, p(a) is true.                  p(a) is false.

∀x p(x)       For every replacement a from the         There is at least one replacement a
              universe, p(a) is true.                  from the universe for which p(a)
                                                       is false.

∃x ¬p(x)      For at least one choice a in the         For every replacement a in the
              universe, p(a) is false, so its          universe, p(a) is true.
              negation ¬p(a) is true.

∀x ¬p(x)      For every replacement a from the         There is at least one replacement a
              universe, p(a) is false and its          from the universe for which ¬p(a)
              negation ¬p(a) is true.                  is false and p(a) is true.

Definition 2.6         Let p(x), q(x) be open statements defined for a given universe.
                               The open statements p(x) and q(x) are called (logically) equivalent, and we write
                            ∀x [p(x) ⇔ q(x)] when the biconditional p(a) ↔ q(a) is true for each replacement a
                            from the universe (that is, p(a) ⇔ q(a) for each a in the universe). If the implication
                            p(a) → q(a) is true for each a in the universe (that is, p(a) ⇒ q(a) for each a in the
                            universe), then we write ∀x [p(x) ⇒ q(x)] and say that p(x) logically implies q(x).

For the universe of all triangles in the plane, let p(x), q(x) denote the open statements

p(x):   x is equiangular           q(x):   x is equilateral.

Then for every particular triangle a (a replacement for x) we know that p(a) ↔ q(a) is true
                            (that is, p(a) ⇔ q(a), for every triangle in the plane). Consequently, ∀x [p(x) ⇔ q(x)].
                                Observe that here and, in general, ∀x [p(x) ⇔ q(x)] if and only if ∀x [p(x) ⇒ q(x)]
                            and ∀x [q(x) ⇒ p(x)].
                                We also realize that a definition similar to Definition 2.6 can be given for two open
                            statements that involve two or more variables.

Now we take another look at the logical equivalence of statements (not open statements)
                            as we examine the converse, inverse, and contrapositive of a statement of the form
                            ∀x [p(x) → q(x)].

Definition 2.7         For open statements p(x), q(x), defined for a prescribed universe, and the universally
                            quantified statement ∀x [p(x) → q(x)], we define:

1) The contrapositive of ∀x [p(x) → q(x)] to be ∀x [¬q(x) → ¬p(x)].
                                2) The converse of ∀x [p(x) → q(x)] to be ∀x [q(x) → p(x)].
                                3) The inverse of ∀x [p(x) → q(x)] to be ∀x [¬p(x) → ¬q(x)].

The following two examples illustrate Definition 2.7.

EXAMPLE 2.40
For the universe of all quadrilaterals in the plane let s(x) and e(x) denote the open statements

                                     s(x):   x is a square           e(x):   x is equilateral.

a) The statement

∀x [s(x) → e(x)]

is a true statement and is logically equivalent to its contrapositive

∀x [¬e(x) → ¬s(x)]

                     because [s(a) → e(a)] ⇔ [¬e(a) → ¬s(a)] for each replacement a. Hence

∀x [s(x) → e(x)] ⇔ ∀x [¬e(x) → ¬s(x)].

b) The statement

∀x [e(x) → s(x)]

is a false statement and is the converse of the true statement

∀x [s(x) → e(x)].

The false statement

∀x [¬s(x) → ¬e(x)]

is the inverse of the given statement ∀x [s(x) → e(x)].
                         Since [e(a) → s(a)] ⇔ [¬s(a) → ¬e(a)] for each specific quadrilateral a, we
                     find that the converse and inverse are logically equivalent; that is,

∀x [e(x) → s(x)] ⇔ ∀x [¬s(x) → ¬e(x)].

EXAMPLE 2.41
Here p(x) and q(x) are the open statements

                                               p(x):   |x| > 3             q(x):   x > 3

and the universe consists of all real numbers.
                  a) The statement ∀x [p(x) → q(x)] is a false statement. For example, if x = −5, then
                     p(−5) is true while q(−5) is false. Consequently, p(−5) → q(−5) is false, and so
                     is ∀x [p(x) → q(x)].
                  b) We can express the converse of the given statement [in part (a)] as follows:

                                  Every real number greater than 3 has magnitude
                                  (or, absolute value) greater than 3.

                     In symbolic form this true statement is written ∀x [q(x) → p(x)].
                  c) The inverse of the given statement is also a true statement. In symbolic form we have
                     ∀x [¬p(x) → ¬q(x)], which can be expressed in words by

                                  If the magnitude of a real number is less than or equal to 3,
                                  then the number itself is less than or equal to 3.

                     And this is logically equivalent to the (converse) statement given in part (b).
                  d) Here the contrapositive of the statement in part (a) is given by ∀x [¬q(x) → ¬p(x)].
                     This false statement is logically equivalent to ∀x [p(x) → q(x)] and can be expressed

as follows:

If a real number is less than or equal to 3, then so is its magnitude.

e) Together with p(x) and q(x) as above, consider the open statement

r(x):   x < −3,

which is also defined for the universe of all real numbers. The following four statements
                                  are all true:

                                                     Statement:          ∀x [p(x) → (r(x) ∨ q(x))]
                                                     Contrapositive:     ∀x [¬(r(x) ∨ q(x)) → ¬p(x)]
                                                     Converse:           ∀x [(r(x) ∨ q(x)) → p(x)]
                                                     Inverse:            ∀x [¬p(x) → ¬(r(x) ∨ q(x))]

In this case (because the statement and its converse are both true) we find that the
                                  statement ∀x [p(x) ↔ (r(x) ∨ q(x))] is true.
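
The truth values found in this example can be explored numerically. The sketch below evaluates the statement, its converse, inverse, and contrapositive over a finite sample of real numbers; the sample and the helper names are assumptions of this illustration. A result of False is conclusive, since a counterexample was found, while a result of True only says that no counterexample occurs in the sample.

```python
def implies(a, b):
    """Truth value of the conditional a -> b."""
    return (not a) or b


def p(x):
    return abs(x) > 3


def q(x):
    return x > 3


def r(x):
    return x < -3


# Hypothetical finite sample of real numbers: -10.0, -9.5, ..., 10.0
sample = [k / 2 for k in range(-20, 21)]


def holds_on_sample(open_statement):
    """True when the open statement is true for every sampled value."""
    return all(open_statement(x) for x in sample)


statement = holds_on_sample(lambda x: implies(p(x), q(x)))
converse = holds_on_sample(lambda x: implies(q(x), p(x)))
inverse = holds_on_sample(lambda x: implies(not p(x), not q(x)))
contrapositive = holds_on_sample(lambda x: implies(not q(x), not p(x)))
print(statement, converse, inverse, contrapositive)   # False True True False

# Part (e): p(x) and r(x) or q(x) agree at every sampled value.
biconditional = holds_on_sample(lambda x: p(x) == (r(x) or q(x)))
print(biconditional)                                  # True
```

The pattern matches the example: the statement and its contrapositive share one truth value, and the converse and inverse share the other.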

Now we use the results of Table 2.21 once again as we examine the next example.

EXAMPLE 2.42
Here the universe consists of all the integers, and the open statements r(x), s(x) are
                           given by

r(x):   2x + 1 = 5            s(x):   x² = 9.

We see that the statement ∃x [r(x) ∧ s(x)] is false because there is no one integer a such
                            that 2a + 1 = 5 and a² = 9. However, there is an integer b (= 2) such that 2b + 1 = 5,
                            and there is a second integer c (= 3 or −3) such that c² = 9. Therefore the statement
                            ∃x r(x) ∧ ∃x s(x) is true. Consequently, the existential quantifier ∃x does not distribute
                            over the logical connective ∧. This one counterexample is enough to show that

∃x [r(x) ∧ s(x)] ⇎ [∃x r(x) ∧ ∃x s(x)],
                            where ⇎ is read “is not logically equivalent to.” It also demonstrates that

[∃x r(x) ∧ ∃x s(x)] ⇏ ∃x [r(x) ∧ s(x)],

where ⇏ is read “does not logically imply.” So the statement

[∃x r(x) ∧ ∃x s(x)] → ∃x [r(x) ∧ s(x)]

is not a tautology.
                               What, however, can we say about the converse of a quantified statement of this form?
                           At this point we present a general argument for any (arbitrary) open statements p(x), q(x)
                           and any (arbitrary) prescribed universe.
                               Examining the statement

∃x [p(x) ∧ q(x)] → [∃x p(x) ∧ ∃x q(x)],
                            we find that when the hypothesis ∃x [p(x) ∧ q(x)] is true, there is at least one element c
                            in the universe for which the statement p(c) ∧ q(c) is true. By the Rule of Conjunctive
                            Simplification (see Section 2.3), [p(c) ∧ q(c)] ⇒ p(c). From the truth of p(c) we have the
                            true statement ∃x p(x). Similarly we obtain ∃x q(x), another true statement. So ∃x p(x) ∧

∃x q(x) is a true statement. Since ∃x p(x) ∧ ∃x q(x) is true whenever ∃x [p(x) ∧ q(x)]
               is true, it follows that

∃x [p(x) ∧ q(x)] ⇒ [∃x p(x) ∧ ∃x q(x)].
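
The counterexample of Example 2.42 can be reproduced in a few lines of Python. The window of integers used below is an assumption of this sketch; it is large enough to contain every integer solution of 2x + 1 = 5 and of x² = 9, so the truth values agree with those over all the integers.

```python
def r(x):
    """Open statement r(x): 2x + 1 = 5."""
    return 2 * x + 1 == 5


def s(x):
    """Open statement s(x): x squared equals 9."""
    return x * x == 9


# Hypothetical finite window standing in for the universe of all integers.
window = range(-100, 101)

print([x for x in window if r(x)])   # [2]
print([x for x in window if s(x)])   # [-3, 3]

exists_both = any(r(x) and s(x) for x in window)                    # one x for both
exists_each = any(r(x) for x in window) and any(s(x) for x in window)
print(exists_both, exists_each)      # False True
```

The two existential statements are satisfied by different integers, which is exactly why the existential quantifier does not distribute over the conjunction.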

Arguments similar to the one for Example 2.42 provide the logical equivalences and
               logical implications listed in Table 2.22. In addition to those listed in Table 2.22 many other
               logical equivalences and logical implications can be derived.

Table 2.22   Logical Equivalences and Logical Implications for Quantified Statements in One
                   Variable

For a prescribed universe and any open statements p(x), q(x) in the variable x:
                                            ∃x [p(x) ∧ q(x)] ⇒ [∃x p(x) ∧ ∃x q(x)]
                                            ∃x [p(x) ∨ q(x)] ⇔ [∃x p(x) ∨ ∃x q(x)]
                                            ∀x [p(x) ∧ q(x)] ⇔ [∀x p(x) ∧ ∀x q(x)]
                                            [∀x p(x) ∨ ∀x q(x)] ⇒ ∀x [p(x) ∨ q(x)]
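
On a small finite universe the four entries of Table 2.22 can be checked exhaustively, since there are only finitely many open statements to try. The sketch below does this for a three-element universe; the universe, the function names, and the representation of an open statement as a dictionary of truth values are all assumptions of this illustration. Such a check does not replace the general argument, but it does show that the two one-way entries cannot be strengthened to equivalences.

```python
from itertools import product

universe = (0, 1, 2)   # hypothetical three-element universe


def open_statements(universe):
    """Return every open statement on the universe as a dict: element -> truth value."""
    return [dict(zip(universe, values))
            for values in product((False, True), repeat=len(universe))]


def implies(a, b):
    """Truth value of the conditional a -> b."""
    return (not a) or b


def check_table(universe):
    """Check the four rows for every pair p, q and count failures of the two converses."""
    rows_hold = True
    converse_failures_row1 = 0
    converse_failures_row4 = 0
    statements = open_statements(universe)
    for p, q in product(statements, repeat=2):
        ex_and = any(p[x] and q[x] for x in universe)
        ex_or = any(p[x] or q[x] for x in universe)
        all_and = all(p[x] and q[x] for x in universe)
        all_or = all(p[x] or q[x] for x in universe)
        ex_p, ex_q = any(p.values()), any(q.values())
        all_p, all_q = all(p.values()), all(q.values())

        rows_hold = rows_hold and implies(ex_and, ex_p and ex_q)     # row 1
        rows_hold = rows_hold and (ex_or == (ex_p or ex_q))          # row 2
        rows_hold = rows_hold and (all_and == (all_p and all_q))     # row 3
        rows_hold = rows_hold and implies(all_p or all_q, all_or)    # row 4

        # Count the pairs p, q for which the converse of row 1 or row 4 fails.
        converse_failures_row1 += not implies(ex_p and ex_q, ex_and)
        converse_failures_row4 += not implies(all_or, all_p or all_q)
    return rows_hold, converse_failures_row1, converse_failures_row4


print(check_table(universe))
```

A nonzero failure count for a converse means that the corresponding row is a logical implication only, not a logical equivalence.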

Our next example lists several of these and demonstrates how two of them are verified.

EXAMPLE 2.43
Let p(x), q(x), and r(x) denote open statements for a given universe. We find the following
               logical equivalences. (Many more are also possible.)

1) ∀x [p(x) ∧ (q(x) ∧ r(x))] ⇔ ∀x [(p(x) ∧ q(x)) ∧ r(x)]
                      To show that this statement is a logical equivalence we proceed as follows:
                      For each a in the universe, consider the statements p(a) ∧ (q(a) ∧ r(a)) and
                      (p(a) ∧ q(a)) ∧ r(a). By the Associative Law for ∧, we have

p(a) ∧ (q(a) ∧ r(a)) ⇔ (p(a) ∧ q(a)) ∧ r(a).
                           Consequently, for the open statements p(x) ∧ (q(x) ∧ r(x)) and
                      (p(x) ∧ q(x)) ∧ r(x), it follows that

∀x [p(x) ∧ (q(x) ∧ r(x))] ⇔ ∀x [(p(x) ∧ q(x)) ∧ r(x)].

2) ∃x [p(x) → q(x)] ⇔ ∃x [¬p(x) ∨ q(x)]
                      For each c in the universe, it follows from Example 2.7 that

[p(c) → q(c)] ⇔ [¬p(c) ∨ q(c)].
                      Therefore the statement ∃x [p(x) → q(x)] is true (respectively, false) if and only if
                      the statement ∃x [¬p(x) ∨ q(x)] is true (respectively, false), so

∃x [p(x) → q(x)] ⇔ ∃x [¬p(x) ∨ q(x)].

3) Other logical equivalences that we shall often find useful include the following.
                      a) ∀x ¬¬p(x) ⇔ ∀x p(x)
                      b) ∀x ¬[p(x) ∧ q(x)] ⇔ ∀x [¬p(x) ∨ ¬q(x)]
                      c) ∀x ¬[p(x) ∨ q(x)] ⇔ ∀x [¬p(x) ∧ ¬q(x)]

4) The results for the logical equivalences in 3(a), (b), and (c) remain valid when all of
                                  the universal quantifiers are replaced by existential quantifiers.

The results of Tables 2.21 and 2.22 and Examples 2.42 and 2.43 will now help us with
                           a very important concept. How do we negate quantified statements that involve a single
                           variable?
                              Consider the statement ∀x p(x). Its negation, namely ¬[∀x p(x)], can be stated
                           as “It is not the case that for all x, p(x) holds.” This is not a very useful remark, so we
                           consider ¬[∀x p(x)] further. When ¬[∀x p(x)] is true, then ∀x p(x) is false, and so for
                           some replacement a from the universe ¬p(a) is true and ∃x ¬p(x) is true. Conversely,
                           whenever the statement ∃x ¬p(x) is true we know that ¬p(b) is true for some member b of
                           the universe. Hence ∀x p(x) is false and ¬[∀x p(x)] is true. So the statement ¬[∀x p(x)]
                           is true if and only if the statement ∃x ¬p(x) is true. (Similar considerations also tell us that
                           ¬[∀x p(x)] is false if and only if ∃x ¬p(x) is false.)
                               These observations lead to the following rule for negating the statement ∀x p(x):

¬[∀x p(x)] ⇔ ∃x ¬p(x).
                           In a similar way, Table 2.21 shows us that the statement ∃x p(x) is true (false) precisely
                           when the statement ∀x ¬p(x) is false (true). This observation then motivates a rule for
                           negating the statement ∃x p(x):

¬[∃x p(x)] ⇔ ∀x ¬p(x).
                           These two rules for negation, and two others that follow from them, are given in Table 2.23
                           for convenient reference.

Table 2.23 Rules for Negating Statements with One Quantifier

                                                    ¬[∀x p(x)] ⇔ ∃x ¬p(x)
                                                    ¬[∃x p(x)] ⇔ ∀x ¬p(x)
                                                    ¬[∀x ¬p(x)] ⇔ ∃x ¬¬p(x) ⇔ ∃x p(x)
                                                    ¬[∃x ¬p(x)] ⇔ ∀x ¬¬p(x) ⇔ ∀x p(x)
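
The four rules of Table 2.23 can be confirmed by brute force on any finite universe, because each open statement is then just an assignment of truth values to the members. The following Python sketch does so; the function name negation_rules_hold and the sample universes are assumptions made for this illustration.

```python
from itertools import product


def negation_rules_hold(universe):
    """Check the four negation rules for every open statement p on a finite universe."""
    for values in product((False, True), repeat=len(universe)):
        p = dict(zip(universe, values))
        forall_p = all(p[a] for a in universe)
        exists_p = any(p[a] for a in universe)
        forall_not_p = all(not p[a] for a in universe)
        exists_not_p = any(not p[a] for a in universe)
        rules = [
            (not forall_p) == exists_not_p,      # negation of "for all x, p(x)"
            (not exists_p) == forall_not_p,      # negation of "for some x, p(x)"
            (not forall_not_p) == exists_p,      # negation of "for all x, not p(x)"
            (not exists_not_p) == forall_p,      # negation of "for some x, not p(x)"
        ]
        if not all(rules):
            return False
    return True


print(negation_rules_hold(("a", "b", "c")))   # True
```

With Python's conventions for all and any on an empty collection, the same check also succeeds for an empty universe, so these rules do not depend on the nonemptiness assumption.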

We use the rules for negating quantified statements in the following example.

EXAMPLE 2.44
Here we find the negation of two statements, where the universe comprises all of the integers.
                               1) Let p(x) and q(x) be given by
                                                      p(x):   x is odd          q(x):   x² − 1 is even.
                                     The statement “If x is odd, then x² − 1 is even” can be symbolized as
                                  ∀x [p(x) → q(x)]. (This is a true statement.)
                                     The negation of this statement is determined as follows:

¬[∀x (p(x) → q(x))] ⇔ ∃x [¬(p(x) → q(x))]
                                                 ⇔ ∃x [¬(¬p(x) ∨ q(x))] ⇔ ∃x [¬¬p(x) ∧ ¬q(x)]
                                                 ⇔ ∃x [p(x) ∧ ¬q(x)]
                                      In words, the negation says, “There exists an integer x such that x is odd and
                                  x² − 1 is odd (that is, not even).” (This statement is false.)

2) As in Example 2.42, let r(x) and s(x) be the open statements

r(x):   2x + 1 = 5           s(x):   x² = 9.

The quantified statement ∃x [r(x) ∧ s(x)] is false because it asserts the existence
                     of at least one integer a such that 2a + 1 = 5 (a = 2) and a² = 9 (a = 3 or −3).
                     Consequently, its negation

¬[∃x (r(x) ∧ s(x))] ⇔ ∀x [¬(r(x) ∧ s(x))] ⇔ ∀x [¬r(x) ∨ ¬s(x)]
                     is true. This negation may be given in words as “For every integer x, 2x + 1 ≠ 5 or
                     x² ≠ 9.”

Because a mathematical statement may involve more than one quantifier, we continue
               this section by offering some examples and making some observations on these types of
               statements.

EXAMPLE 2.45
Here we have two real variables x, y, so the universe consists of all real numbers. The
               commutative law for the addition of real numbers may be expressed by

∀x ∀y (x + y = y + x).

This statement may also be given as

∀y ∀x (x + y = y + x).
                  Likewise, in the case of the multiplication of real numbers, we may write

∀x ∀y (xy = yx)     or     ∀y ∀x (xy = yx).

These two examples suggest the following general result. If p(x, y) is an open statement
               in the two variables x, y (with either a prescribed universe for both x and y or one prescribed
               universe for x and a second for y), then the statements ∀x ∀y p(x, y) and ∀y ∀x p(x, y)
               are logically equivalent; that is, the statement ∀x ∀y p(x, y) is true (respectively, false)
               if and only if the statement ∀y ∀x p(x, y) is true (respectively, false). Hence

∀x ∀y p(x, y) ⇔ ∀y ∀x p(x, y).

EXAMPLE 2.46
When dealing with the associative law for the addition of real numbers, we find that for all
               real numbers x, y, and z,

x + (y + z) = (x + y) + z.
               Using universal quantifiers (with the universe of all real numbers), we may express this by

∀x ∀y ∀z [x + (y + z) = (x + y) + z]     or     ∀y ∀x ∀z [x + (y + z) = (x + y) + z].
               In fact, there are 3! = 6 ways to order these three universal quantifiers, and all six of these
               quantified statements are logically equivalent to one another.
                  This is actually true for all open statements p(x, y, z), and to shorten the notation, one
               may write, for example,

∀x, y, z p(x, y, z) ⇔ ∀y, x, z p(x, y, z) ⇔ ∀x, z, y p(x, y, z),

describing the logical equivalence for three of the six statements.

In Examples 2.45 and 2.46 we encountered quantified statements with two and three
                           bound variables — each such variable bound by a universal quantifier. Our next example
                           examines a situation in which there are two bound variables — and this time each of these
                           variables is bound by an existential quantifier.

EXAMPLE 2.47
For the universe of all integers, consider the true statement “There exist integers x, y such
                           that x + y = 6.” We may represent this in symbolic form by

∃x ∃y (x + y = 6).
                           If we let p(x, y) denote the open statement “x + y = 6,” then an equivalent statement can
                           be given by ∃y ∃x p(x, y).
                               In general, for any open statement p(x, y) and universe(s) prescribed for the variables
                           x, y,

∃x ∃y p(x, y) ⇔ ∃y ∃x p(x, y).
                               Similar results follow for statements involving three or more such quantifiers.

When a    statement involves both existential and universal quantifiers, however, we must
                           be careful about the order in which the quantifiers are written. Example 2.48 illustrates this
                           case.

EXAMPLE 2.48
We restrict ourselves here to the universe of all integers and let p(x, y) denote the open
                           statement “x + y = 17.”

1) The statement

∀x ∃y p(x, y)
                                  says that “For every integer x, there exists an integer y such that x + y = 17.” (We
                                  read the quantifiers from left to right.)
                                      This statement is true; once we select any x, the integer y = 17 − x does exist
                                  and x + y = x + (17 − x) = 17. But we realize that each value of x gives rise to a
                                  different value of y.
                               2) Now consider the statement

∃y ∀x p(x, y).
                                      This statement is read “There exists an integer y so that for all integers x, x + y =
                                  17.” This statement is false. Once an integer y is selected, the only value that x can
                                  have (and still satisfy x + y = 17) is 17 − y.
                                      If the statement ∃y ∀x p(x, y) were true, then every integer (x) would equal
                                  17 − y (for some one fixed y). This says, in effect, that all integers are equal!
                                      Consequently, the statements ∀x ∃y p(x, y) and ∃y ∀x p(x, y) are generally not
                                  logically equivalent.
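
One way to see the difference between the two quantifier orders is to phrase each argument as a function. In the sketch below, witness_y returns the y that works for a given x, and counterexample_x returns an x that defeats a proposed y; both function names and the finite sample of integers are assumptions of this illustration.

```python
def p(x, y):
    """Open statement p(x, y): x + y = 17."""
    return x + y == 17


def witness_y(x):
    """For a given x, return a y that makes p(x, y) true. The choice depends on x."""
    return 17 - x


def counterexample_x(y):
    """For a proposed y, return an x that makes p(x, y) false."""
    return 18 - y


# Hypothetical finite sample of integers used to exercise the two functions.
sample = range(-1000, 1001)

# "For every x there exists y": every sampled x has a witness.
forall_exists_ok = all(p(x, witness_y(x)) for x in sample)

# "There exists y for every x": every sampled candidate y is defeated by some x.
exists_forall_refuted = all(not p(counterexample_x(y), y) for y in sample)

# Different values of x lead to different witnesses y.
witnesses_differ = len({witness_y(x) for x in sample}) == len(sample)

print(forall_exists_ok, exists_forall_refuted, witnesses_differ)   # True True True
```

In the first statement the witness may change with x; the second statement demands a single y that serves every x, and no such y exists.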

Translating mathematical statements  — be they postulates, definitions, or theorems —
                           into symbolic form can be helpful for two important reasons.

1) Doing so forces us to be very careful and precise about the meanings of statements,
                                  the meanings of phrases such as “For all x” and “There exists an x,” and the order in
                                  which such phrases appear.

2) After we translate a mathematical statement into symbolic form, the rules we have
                      learned should then apply when we want to determine such related statements as the
                      negation or, if appropriate, the contrapositive, converse, or inverse.

Our last two examples illustrate this, and in so doing, extend the results in Table 2.23.

EXAMPLE 2.49
Let p(x, y), q(x, y), and r(x, y) represent three open statements, with replacements for
               the variables x, y chosen from some prescribed universe(s). What is the negation of the
               following statement?

∀x ∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]
               We find that

¬[∀x ∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]]
                                         ⇔ ∃x [¬∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]]
                                         ⇔ ∃x ∀y ¬[(p(x, y) ∧ q(x, y)) → r(x, y)]
                                         ⇔ ∃x ∀y ¬[¬[p(x, y) ∧ q(x, y)] ∨ r(x, y)]
                                         ⇔ ∃x ∀y [¬¬[p(x, y) ∧ q(x, y)] ∧ ¬r(x, y)]
                                         ⇔ ∃x ∀y [(p(x, y) ∧ q(x, y)) ∧ ¬r(x, y)].
                  Now suppose that we are trying to establish the validity of an argument (or a mathematical
               theorem) for which

∀x ∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]

is the conclusion. Should we want to try to prove the result by the method of Proof by
               Contradiction, we would assume as an additional premise the negation of this conclusion.
               Consequently, our additional premise would be the statement

∃x ∀y [(p(x, y) ∧ q(x, y)) ∧ ¬r(x, y)].
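
The negation derived in this example can be checked exhaustively on a small universe. The Python sketch below uses a two-element universe for both variables and tries every possible choice of the three open statements; the universe and the names are assumptions made for this illustration.

```python
from itertools import product

U = (0, 1)                                   # hypothetical universe for x and for y
pairs = [(x, y) for x in U for y in U]


def truth_tables():
    """Return every open statement in two variables as a dict: (x, y) -> truth value."""
    return [dict(zip(pairs, values))
            for values in product((False, True), repeat=len(pairs))]


def negation_matches():
    """True when the derived negation agrees with the original's negation in every case."""
    tables = truth_tables()
    for p, q, r in product(tables, repeat=3):
        # Original: for all x there exists y with (p and q) -> r
        original = all(
            any((not (p[x, y] and q[x, y])) or r[x, y] for y in U) for x in U
        )
        # Derived negation: there exists x such that for all y, (p and q) and not r
        negated = any(
            all(p[x, y] and q[x, y] and not r[x, y] for y in U) for x in U
        )
        if (not original) != negated:
            return False
    return True


print(negation_matches())   # True
```

The check covers 16 × 16 × 16 = 4096 combinations of open statements on this universe.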

Finally, we consider how to negate the definition of limit, a fundamental concept in
               calculus.

EXAMPLE 2.50
In calculus, one studies the properties of real-valued functions of a real variable. (Functions
               will be examined in Chapter 5 of this text.) Among these properties is the existence of limits,
               and one finds the following definition: Let I be an open interval† containing the real number
               a and suppose the function f is defined throughout I, except possibly at a. We say that f
               has the limit L as x approaches a, and write lim_{x→a} f(x) = L, if (and only if) for every
               ε > 0 there exists a δ > 0 so that, for all x in I, (0 < |x − a| < δ) → (|f(x) − L| < ε). This
               can be expressed in symbolic form as

lim_{x→a} f(x) = L ⇔ ∀ε > 0 ∃δ > 0 ∀x [(0 < |x − a| < δ) → (|f(x) − L| < ε)].

†The concept of an open interval is defined at the end of Section 3.1.

[Here the universe comprises the real numbers in the open interval I, except possibly a.
                                          Also, the quantifiers ∀ε > 0 and ∃δ > 0 now contain some restrictive information.] Then,
                                          to negate this definition, we do the following (in which certain steps have been combined):

lim_{x→a} f(x) ≠ L
                                          ⇔ ¬[∀ε > 0 ∃δ > 0 ∀x [(0 < |x − a| < δ) → (|f(x) − L| < ε)]]
                                          ⇔ ∃ε > 0 ∀δ > 0 ∃x ¬[(0 < |x − a| < δ) → (|f(x) − L| < ε)]
                                          ⇔ ∃ε > 0 ∀δ > 0 ∃x ¬[¬(0 < |x − a| < δ) ∨ (|f(x) − L| < ε)]
                                          ⇔ ∃ε > 0 ∀δ > 0 ∃x [¬¬(0 < |x − a| < δ) ∧ ¬(|f(x) − L| < ε)]
                                          ⇔ ∃ε > 0 ∀δ > 0 ∃x [(0 < |x − a| < δ) ∧ (|f(x) − L| ≥ ε)]
                                              Translating into words, we find that lim_{x→a} f(x) ≠ L if (and only if) there exists a
                                          positive (real) number ε such that for every positive (real) number δ, there is an x in I such
                                          that 0 < |x − a| < δ (that is, x ≠ a and its distance from a is less than δ) but |f(x) − L| ≥ ε
                                          [that is, the value of f(x) differs from L by at least ε].
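
The negated definition asks for one fixed ε and, for each δ, an x that violates the bound. The Python sketch below illustrates this for a hypothetical step function on the interval I = (−1, 1) with a = 0 and proposed limit L = 0; the function f, the interval, the value of ε, and the helper bad_x are all assumptions chosen for this illustration, and only finitely many values of δ are tested.

```python
def f(x):
    """Hypothetical step function: 0 for x <= 0 and 1 for x > 0."""
    return 1.0 if x > 0 else 0.0


a, L = 0.0, 0.0      # point approached and proposed limit (assumed for this sketch)
epsilon = 0.5        # the single epsilon that works against every delta


def bad_x(delta):
    """Return an x in (-1, 1) with 0 < |x - a| < delta but |f(x) - L| >= epsilon."""
    return a + min(delta, 1.0) / 2


deltas = [10.0 ** (-k) for k in range(0, 12)]   # 1, 0.1, ..., 1e-11
checks = [
    (0 < abs(bad_x(d) - a) < d) and (abs(f(bad_x(d)) - L) >= epsilon)
    for d in deltas
]
print(all(checks))   # True
```

The choice x = a + min(δ, 1)/2 works for every positive δ, not only the sampled ones, so for this f the statement that the limit at 0 equals 0 is false.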

EXERCISES 2.4

1. Let p(x), q(x) denote the following open statements.
         p(x):  $x < 3$          q(x):  $x + 1$ is odd
If the universe consists of all integers, what are the truth values
of the following statements?
      a) $q(1)$                    b) $\neg p(3)$                  c) $p(7) \lor q(7)$
      d) $p(3) \land q(4)$         e) $\neg(p(-4) \lor q(-3))$     f) $\neg p(-4) \land \neg q(-3)$

2. Let p(x), q(x) be defined as in Exercise 1. Let r(x) be the
open statement “x > 0.” Once again the universe comprises all
integers.
      a) Determine the truth values of the following statements.
             i) $p(3) \lor [q(3) \lor \neg r(3)]$
            ii) $p(2) \to [q(2) \to r(2)]$
           iii) $[p(2) \land q(2)] \to r(2)$
            iv) $p(0) \to [\neg q(-1) \leftrightarrow r(1)]$
      b) Determine all values of x for which
      $[p(x) \land q(x)] \land r(x)$ results in a true statement.

3. Let p(x) be the open statement “$x^2 = 2x$,” where the
universe comprises all integers. Determine whether each of
the following statements is true or false.
      a) $p(0)$                    b) $p(1)$                       c) $p(2)$
      d) $p(-2)$                   e) $\exists x \, p(x)$          f) $\forall x \, p(x)$

4. Consider the universe of all polygons with three or four
sides, and define the following open statements for this universe.
         a(x):  all interior angles of x are equal
         e(x):  x is an equilateral triangle
         h(x):  all sides of x are equal
         i(x):  x is an isosceles triangle
         p(x):  x has an interior angle that exceeds 180°
         q(x):  x is a quadrilateral
         r(x):  x is a rectangle
         s(x):  x is a square
         t(x):  x is a triangle
Translate each of the following statements into an English sentence,
and determine whether the statement is true or false.
      a) $\forall x \, [q(x) \veebar t(x)]$                    b) $\forall x \, [i(x) \to e(x)]$
      c) $\exists x \, [t(x) \land p(x)]$                      d) $\forall x \, [(a(x) \land t(x)) \leftrightarrow e(x)]$
      e) $\exists x \, [q(x) \land \neg r(x)]$                 f) $\exists x \, [r(x) \land \neg s(x)]$
      g) $\forall x \, [h(x) \to e(x)]$                        h) $\forall x \, [t(x) \to \neg p(x)]$
      i) $\forall x \, [s(x) \leftrightarrow (a(x) \land h(x))]$
      j) $\forall x \, [t(x) \to (a(x) \leftrightarrow h(x))]$

5. Professor Carlson’s class in mechanics is comprised of 29
students of which exactly
      1) three physics majors are juniors;
      2) two electrical engineering majors are juniors;
      3) four mathematics majors are juniors;
      4) twelve physics majors are seniors;
      5) four electrical engineering majors are seniors;
      6) two electrical engineering majors are graduate students;
      7) two mathematics majors are graduate students.
Consider the following open statements.
         c(x):  Student x is in the class (that is,
                Professor Carlson’s mechanics class
                as already described).

         j(x):  Student x is a junior.
         s(x):  Student x is a senior.
         g(x):  Student x is a graduate student.
         p(x):  Student x is a physics major.
         e(x):  Student x is an electrical engineering major.
         m(x):  Student x is a mathematics major.
Write each of the following statements in terms of quantifiers
and the open statements c(x), j(x), s(x), g(x), p(x), e(x), and
m(x), and determine whether the given statement is true or false.
Here the universe comprises all of the 12,500 students enrolled
at the university where Professor Carlson teaches. Furthermore,
at this university each student has only one major.
      a) There is a mathematics major in the class who is a junior.
      b) There is a senior in the class who is not a mathematics
      major.
      c) Every student in the class is majoring in mathematics or
      physics.
      d) No graduate student in the class is a physics major.
      e) Every senior in the class is majoring in either physics or
      electrical engineering.

6. Let p(x, y), q(x, y) denote the following open statements.
         p(x, y):  $x^2 > y$          q(x, y):  $x + 2 < y$
If the universe for each of x, y consists of all real numbers,
determine the truth value for each of the following statements.
      a) $p(2, 4)$                            b) $q(1, \pi)$

      e) $p(2, 2) \to q(1, 1)$                f) $p(1, 2) \leftrightarrow \neg q(1, 2)$

7. For the universe of all integers, let p(x), q(x), r(x), s(x),
and t(x) be the following open statements.
         p(x):  $x > 0$
         q(x):  x is even
         r(x):  x is a perfect square
         s(x):  x is (exactly) divisible by 4
         t(x):  x is (exactly) divisible by 5
      a) Write the following statements in symbolic form.
             i) At least one integer is even.
            ii) There exists a positive integer that is even.
           iii) If x is even, then x is not divisible by 5.
            iv) No even integer is divisible by 5.
             v) There exists an even integer divisible by 5.
            vi) If x is even and x is a perfect square, then x is
                divisible by 4.
      b) Determine whether each of the six statements in
      part (a) is true or false. For each false statement, provide a
      counterexample.
      c) Express each of the following symbolic representations
      in words.
             i) $\forall x \, [r(x) \to p(x)]$            ii) $\forall x \, [s(x) \to q(x)]$
           iii) $\forall x \, [s(x) \to \neg t(x)]$       iv) $\exists x \, [s(x) \land \neg r(x)]$
      d) Provide a counterexample for each false statement in
      part (c).

8. Let p(x), q(x), and r(x) denote the following open
statements.
         p(x):  $x^2 - 8x + 15 = 0$
         q(x):  x is odd
         r(x):  $x > 0$
For the universe of all integers, determine the truth or falsity of
each of the following statements. If a statement is false, give a
counterexample.
      a) $\forall x \, [p(x) \to q(x)]$                  b) $\forall x \, [q(x) \to p(x)]$
      c) $\exists x \, [p(x) \to q(x)]$                  d) $\exists x \, [q(x) \to p(x)]$
      e) $\exists x \, [r(x) \to p(x)]$                  f) $\forall x \, [\neg q(x) \to \neg p(x)]$
      g) $\exists x \, [p(x) \to (q(x) \land r(x))]$
      h) $\forall x \, [(p(x) \lor q(x)) \to r(x)]$

9. Let p(x), q(x), and r(x) be the following open statements.
         p(x):  $x^2 - 7x + 10 = 0$
         q(x):  $x^2 - 2x - 3 = 0$
         r(x):  $x < 0$
      a) Determine the truth or falsity of the following statements,
      where the universe is all integers. If a statement is
      false, provide a counterexample or explanation.
             i) $\forall x \, [p(x) \to \neg r(x)]$       ii) $\forall x \, [q(x) \to r(x)]$
           iii) $\exists x \, [q(x) \to r(x)]$            iv) $\exists x \, [p(x) \to r(x)]$
      b) Find the answers to part (a) when the universe consists
      of all positive integers.
      c) Find the answers to part (a) when the universe contains
      only the integers 2 and 5.

10. For the following program segment, m and n are integer
variables. The variable A is a two-dimensional array A[1, 1],
A[1, 2], ..., A[1, 20], ..., A[10, 1], ..., A[10, 20], with 10
rows (indexed from 1 to 10) and 20 columns (indexed from 1
to 20).

         for m := 1 to 10 do
             for n := 1 to 20 do
                 A[m, n] := m + 3 * n

Write the following statements in symbolic form. (The universe
for the variable m contains only the integers from 1 to 10 inclusive;
for n the universe consists of the integers from 1 to 20
inclusive.)
      a) All entries of A are positive.
      b) All entries of A are positive and less than or equal to 70.
      c) Some of the entries of A exceed 60.

      d) The entries in each row of A are sorted into (strictly)
      ascending order.
      e) The entries in each column of A are sorted into (strictly)
      ascending order.
      f) The entries in the first three rows of A are distinct.

11. Identify the bound variables and the free variables in each
of the following expressions (or statements). In both cases the
universe comprises all real numbers.
      a) $\forall y \, \exists z \, [\cos(x + y) = \sin(z - x)]$
      b) $\exists x \, \exists y \, [x^2 - y^2 = 2]$

12. a) Let p(x, y) denote the open statement “x divides y,”
      where the universe for each of the variables x, y comprises
      all integers. (In this context “divides” means “exactly divides”
      or “divides evenly.”) Determine the truth value of
      each of the following statements; if a quantified statement
      is false, provide an explanation or a counterexample.
             i) $p(3, 7)$                        ii) $p(3, 27)$
           iii) $\forall y \, p(1, y)$           iv) $\forall x \, p(x, 9)$
             v) $\forall x \, p(x, x)$           vi) $\forall y \, \exists x \, p(x, y)$
           vii) $\exists y \, \forall x \, p(x, y)$
          viii) $\forall x \, \forall y \, [(p(x, y) \land p(y, x)) \to (x = y)]$
      b) Determine which of the eight statements in part (a) will
      change in truth value if the universe for each of the variables
      x, y were restricted to just the positive integers.
      c) Determine the truth value of each of the following statements.
      If the statement is false, provide an explanation or
      a counterexample. [The universe for each of x, y is as in
      part (b).]
             i) $\forall x \, \exists y \, p(x, y)$       ii) $\forall y \, \exists x \, p(x, y)$
           iii) $\exists x \, \forall y \, p(x, y)$       iv) $\exists y \, \forall x \, p(x, y)$

13. Suppose that p(x, y) is an open statement where the universe
for each of x, y consists of only three integers: 2, 3, 5.
Then the quantified statement $\exists y \, p(2, y)$ is logically equivalent
to $p(2, 2) \lor p(2, 3) \lor p(2, 5)$. The quantified statement
$\exists x \, \forall y \, p(x, y)$ is logically equivalent to
$[p(2, 2) \land p(2, 3) \land p(2, 5)] \lor [p(3, 2) \land p(3, 3) \land p(3, 5)] \lor [p(5, 2) \land p(5, 3) \land p(5, 5)]$.
Use conjunctions and/or disjunctions to express the
following statements without quantifiers.
      a) $\forall x \, p(x, 3)$        b) $\exists x \, \exists y \, p(x, y)$        c) $\forall y \, \exists x \, p(x, y)$

14. Let p(n), q(n) represent the open statements
         p(n):  n is odd          q(n):  $n^2$ is odd
for the universe of all integers. Which of the following statements
are logically equivalent to each other?
      a) If the square of an integer is odd, then the integer is odd.
      b) $\forall n$ [p(n) is necessary for q(n)]
      c) The square of an odd integer is odd.
      d) There are some integers whose squares are odd.
      e) Given an integer whose square is odd, that integer is
      likewise odd.
      f) $\forall n \, [\neg p(n) \to \neg q(n)]$
      g) $\forall n$ [p(n) is sufficient for q(n)]

15. For each of the following pairs of statements determine
whether the proposed negation is correct. If correct, determine
which is true: the original statement or the proposed negation.
If the proposed negation is wrong, write a correct version of the
negation and then determine whether the original statement or
your corrected version of the negation is true.
      a) Statement: For all real numbers x, y, if $x^2 > y^2$, then
      $x > y$.
      Proposed negation: There exist real numbers x, y such that
      $x^2 > y^2$ but $x < y$.
      b) Statement: There exist real numbers x, y such that x and
      y are rational but x + y is irrational.
      Proposed negation: For all real numbers x, y, if x + y is
      rational, then each of x, y is rational.
      c) Statement: For all real numbers x, if x is not 0, then x
      has a multiplicative inverse.
      Proposed negation: There exists a nonzero real number that
      does not have a multiplicative inverse.
      d) Statement: There exist odd integers whose product is
      odd.
      Proposed negation: The product of any two odd integers is
      odd.

16. Write the negation of each of the following statements as
an English sentence, without symbolic notation. (Here the
universe consists of all the students at the university where
Professor Lenhart teaches.)
      a) Every student in Professor Lenhart’s C++ class is
      majoring in computer science or mathematics.
      b) At least one student in Professor Lenhart’s C++ class is
      a history major.

17. Write the negation of each of the following true statements.
For parts (a) and (b) the universe consists of all integers; for
parts (c) and (d) the universe comprises all real numbers.
      a) For all integers n, if n is not (exactly) divisible by 2,
      then n is odd.
      b) If k, m, n are any integers where k − m and m − n are
      odd, then k − n is even.
      c) If x is a real number where $x^2 > 16$, then $x < -4$ or
      $x > 4$.
      d) For all real numbers x, if $|x - 3| < 7$, then $-4 < x < 10$.

18. Negate and simplify each of the following.
      a) $\exists x \, [p(x) \lor q(x)]$          b) $\forall x \, [p(x) \land \neg q(x)]$
      c) $\forall x \, [p(x) \to q(x)]$
      d) $\exists x \, [(p(x) \lor q(x)) \to r(x)]$

19. For each of the following statements state the converse,
inverse, and contrapositive. Also determine the truth value for
each given statement, as well as the truth values for its converse,

inverse, and contrapositive. (Here “divides” means “exactly
divides.”)
      a) [The universe comprises all positive integers.]
      If $m > n$, then $m^2 > n^2$.
      b) [The universe comprises all integers.]
      If $a > b$, then $a^2 > b^2$.
      c) [The universe comprises all integers.]
      If m divides n and n divides p, then m divides p.
      d) [The universe consists of all real numbers.]
      $\forall x \, [(x > 3) \to (x^2 > 9)]$
      e) [The universe consists of all real numbers.]
      For all real numbers x, if $x^2 + 4x - 21 > 0$, then $x > 3$ or
      $x < -7$.

20. Rewrite each of the following statements in the if-then form.
Then write the converse, inverse, and contrapositive of your implication.
For each result in parts (a) and (c) give the truth value
for the implication and the truth values for its converse, inverse,
and contrapositive. [In part (a) “divisibility” requires a remainder
of 0.]
      a) [The universe comprises all positive integers.]
      Divisibility by 21 is a sufficient condition for divisibility
      by 7.
      b) [The universe comprises all snakes presently slithering
      about the jungles of Asia.]
      Being a cobra is a sufficient condition for a snake to be
      dangerous.
      c) [The universe consists of all complex numbers.]
      For every complex number z, z being real is necessary for
      $z^2$ to be real.

21. For the following statements the universe comprises all
nonzero integers. Determine the truth value of each statement.
      a) $\exists x \, \exists y \, [xy = 1]$              b) $\exists x \, \forall y \, [xy = 1]$
      c) $\forall x \, \exists y \, [xy = 1]$
      d) $\exists x \, \exists y \, [(2x + y = 5) \land (x - 3y = -8)]$
      e) $\exists x \, \exists y \, [(3x - y = 7) \land (2x + 4y = 3)]$

22. Answer Exercise 21 for the universe of all nonzero real
numbers.

23. In the arithmetic of real numbers, there is a real number,
namely 0, called the identity of addition because a + 0 =
0 + a = a for every real number a. This may be expressed in
symbolic form by
         $\exists z \, \forall a \, [a + z = z + a = a]$.
(We agree that the universe comprises all real numbers.)
      a) In conjunction with the existence of an additive identity
      is the existence of additive inverses. Write a quantified
      statement that expresses “Every real number has an additive
      inverse.” (Do not use the minus sign anywhere in your
      statement.)
      b) Write a quantified statement dealing with the existence
      of a multiplicative identity for the arithmetic of real numbers.
      c) Write a quantified statement covering the existence of
      multiplicative inverses for the nonzero real numbers. (Do
      not use the exponent −1 anywhere in your statement.)
      d) Do the results in parts (b) and (c) change in any way
      when the universe is restricted to the integers?

24. Consider the quantified statement $\forall x \, \exists y \, [x + y = 17]$.
Determine whether this statement is true or false for each of the
following universes: (a) the integers; (b) the positive integers;
(c) the integers for x, the positive integers for y; (d) the positive
integers for x, the integers for y.

25. Let the universe for the variables in the following statements
consist of all real numbers. In each case negate and simplify
the given statement.
      a) $\forall x \, \forall y \, [(x > y) \to (x - y > 0)]$
      b) $\forall x \, \forall y \, [(x < y) \to \exists z \, (x < z < y)]$
      c) $\forall x \, \forall y \, [(|x| = |y|) \to (y = \pm x)]$

26. In calculus the definition of the limit L of a sequence of
real numbers $r_1, r_2, r_3, \ldots$ can be given as
         $\lim_{n \to \infty} r_n = L$
if (and only if) for every $\epsilon > 0$ there exists a positive integer k
so that for all integers n, if $n > k$ then $|r_n - L| < \epsilon$.
    In symbolic form this can be expressed as
         $\lim_{n \to \infty} r_n = L \iff \forall \epsilon > 0 \ \exists k > 0 \ \forall n \, [(n > k) \to |r_n - L| < \epsilon]$.
    Express $\lim_{n \to \infty} r_n \neq L$ in symbolic form.


2.5 Quantifiers, Definitions, and the Proofs of Theorems
                                  In this section we shall combine some of the ideas we have already studied in the prior two
                                  sections. Although Section 2.3 introduced rules and methods for establishing the validity
                                  of an argument, unfortunately the arguments presented there seemed to have little to do
                                  with anything mathematical. [The rare exceptions are in Example 2.23 and the erroneous
argument in part (b) of the material preceding Example 2.26.] Most of the arguments dealt
                             with certain individuals and predicaments they were either in or about to face.
                                But now that we have learned some of the properties of quantifiers and quantified state-
                             ments, we are better equipped to handle arguments that will help us to prove mathematical
                             theorems. Before dealing with theorems, however, we shall consider how mathematical
                             definitions are traditionally presented in scientific writing.
                                 Following Example 2.3 in Section 2.1, the discussion concerned how an implication
                             might be used in place of a biconditional in everyday conversation. But in scientific writing,
                             it was noted, we should avoid any and all situations where an ambiguous interpretation
                             might come about — in particular, an implication should not be used when a biconditional
                             is intended. However, there is one major exception to that rule and it concerns the way that
                             mathematical definitions are traditionally presented in mathematics textbooks and other
                             scientific literature. Example 2.51 demonstrates this exception.

EXAMPLE 2.51
a) Let us start with the universe of all quadrilaterals in the plane and try to identify those
                                   that are called rectangles.
                                       One person might say that

“If a quadrilateral is a rectangle then it has four equal angles.”
                                   Another individual might identify these special quadrilaterals by observing that

“If a quadrilateral has four equal angles, then it is a rectangle.”

(Here both people are making implicitly quantified statements, where the quantifier is
                                   universal.)
                                       Given the open statements

p(x):     x is a rectangle              q(x):     x has four equal angles,
                                   we can express what the first person says as

$\forall x \, [p(x) \to q(x)]$,
                                   while for the second person we would write

$\forall x \, [q(x) \to p(x)]$.
                                   So which of the preceding (quantified) statements identifies or defines a rectangle?
                                   Perhaps we feel that they both do. But how can that be, since one statement is the
                                   converse of the other and, in general, the converse of an implication is not logically
                                   equivalent to the implication?
                                      Here the reader must consider what is intended    — not just what each of the two
                                   people has said, or the symbolic expressions we have written to represent these state-
                                   ments. In this situation each person is using an implication with the meaning of a
                                   biconditional. They are both intending (though not stating)

$\forall x \, [p(x) \leftrightarrow q(x)]$,
                                   that is, each is really telling us that

“A quadrilateral is a rectangle if and only if it has four equal angles.”
                               b) Within the universe of all integers we can distinguish the even integers by means of a
                                   certain property and so we may define them as follows:

For every integer n we call n even if it is divisible by 2.

(By the expression “divisible by 2” we mean “exactly divisible by 2” — that is, there
                    is no remainder upon division of the dividend n by the divisor 2.)
                        If we consider the open statements

p(n):    n is an even integer             q(n):      n is divisible by 2,

then it appears that the preceding definition may be written symbolically as

$\forall n \, [q(n) \to p(n)]$.
                    After all, the given quantified statement (in the preceding definition) is an implication.
                    However, the situation here is similar to that given in part (a). What appears to be
                    stated is not what is intended. The intention is for the reader to interpret the given
                    definition as

$\forall n \, [q(n) \leftrightarrow p(n)]$,
                    that is,

“For every integer n, we call n even if and only if n is divisible by 2.”

(Note that the open statement “n is divisible by 2” can also be expressed by the open
                    statement “n = 2k, for some integer k.” Don’t be misled here by the presence of
                    the quantifier “for some integer k”; for the expression $\exists k \, [n = 2k]$ is still an open
                    statement because n remains a free variable.)
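
The remark about the free variable can be illustrated with a small sketch. The function names below (`divisible_by_2`, `exists_k`) and the test range from −20 to 20 are assumptions made for this illustration only. The bound variable k lives entirely inside `exists_k`, while n remains a parameter, which is one way to picture why $\exists k \, [n = 2k]$ is still an open statement in n.

```python
def divisible_by_2(n: int) -> bool:
    """q(n): n leaves no remainder upon division by 2."""
    return n % 2 == 0


def exists_k(n: int) -> bool:
    """The open statement 'n = 2k for some integer k'.

    k is bound by any(...); n is still free (it is the parameter).
    Searching -|n|..|n| suffices, since n = 2k forces |k| <= |n|.
    """
    bound = abs(n)
    return any(n == 2 * k for k in range(-bound, bound + 1))


# Read as a biconditional, the definition of "even" can be tested with
# either formulation; the two agree on every integer in the sample.
sample = range(-20, 21)
agree = all(divisible_by_2(n) == exists_k(n) for n in sample)
evens = [n for n in sample if exists_k(n)]
print(agree, evens[:5])
```

Checking a finite sample does not prove the equivalence for all integers; it only makes the two formulations concrete.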

So now we see how quantifiers may enter into the way we state mathematical defini-
               tions — and that the traditional way in which such a definition appears is as an implication.
               But beware and remember: It is only in definitions that an implication can be (mis)read and
               correctly interpreted as a biconditional.
                    Note how we defined the limit concept in Example 2.50. There we wrote “if (and only
               if)” since we wanted to let the reader know our intention. Now we are free to replace “if
               (and only if)” by simply “if.”
                  Having settled our discussion on the nature of mathematical definitions, we continue
               now with an investigation of arguments involving quantified statements.

EXAMPLE 2.52
Suppose that we start with the universe that comprises only the 13 integers 2, 4, 6, 8, ...,
               24, 26. Then we can establish the statement:

For all n (meaning n = 2, 4, 6,..., 26),
                                                    we can write n as the sum of at most three perfect squares.

The results in Table 2.24 provide a case-by-case verification showing the given (quanti-
               fied) statement to be true. (We might call this statement a theorem.)

Table 2.24

                                   2 = 1 + 1              10 = 9 + 1                  20 = 16 + 4
                                   4 = 4                  12 = 4 + 4 + 4              22 = 9 + 9 + 4
                                   6 = 4 + 1 + 1          14 = 9 + 4 + 1              24 = 16 + 4 + 4
                                   8 = 4 + 4              16 = 16                     26 = 25 + 1
                                                          18 = 16 + 1 + 1

This exhaustive listing is an example of a proof using the technique we call, rather
                             appropriately, the method of exhaustion. This method is reasonable when we are dealing
                             with a fairly small universe. If we are confronted with a situation in which the universe
                             is larger but within the range of a computer that is available to us, then we might write a
                             program to check all of the individual cases.
                                 (Note that for certain cases in Table 2.24 more than one answer may be possible. For
                             example, we could have written 18 = 9 + 9 and 26 = 16 + 9 + 1. But this is all right. We
                             were told that each positive even integer less than or equal to 26 could be written as the
                             sum of one, two, or three perfect squares. We were not told that each such representation
                             had to be unique, so more than one possibility could occur. What we had to check in each
                             case was that there was at least one possibility.)
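
The case-by-case check in Table 2.24 can also be carried out by a program. The following Python sketch is our own illustration and is not part of the original example; the names square_sums, universe, and table are hypothetical. For each even n from 2 to 26 it lists every way of writing n as a sum of one, two, or three positive perfect squares, and then confirms that each n has at least one such representation.

```python
from itertools import combinations_with_replacement
from math import isqrt


def square_sums(n, max_terms=3):
    """Return every way to write n as a sum of 1..max_terms positive perfect squares."""
    squares = [k * k for k in range(1, isqrt(n) + 1)]
    found = []
    for terms in range(1, max_terms + 1):
        # Combinations with replacement avoid counting reorderings twice.
        for combo in combinations_with_replacement(squares, terms):
            if sum(combo) == n:
                found.append(combo)
    return found


universe = range(2, 27, 2)  # the 13 integers 2, 4, ..., 26
table = {n: square_sums(n) for n in universe}

for n, ways in table.items():
    print(n, "=", " or ".join(" + ".join(map(str, w)) for w in ways))

# The quantified statement is true exactly when no case is left without a representation.
statement_is_true = all(len(table[n]) > 0 for n in universe)
print("Every n in the universe is covered:", statement_is_true)
```

Because the universe is finite, running through all 13 cases in this way is a complete proof by exhaustion; the program also shows the non-unique cases, such as 18 and 26.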

In the previous example we mentioned the word theorem. We also found this term used in
                             Chapter 1, for example, in results like the binomial theorem and the multinomial theorem
                             where we were introduced to certain types of enumeration problems. Without getting overly
                             technical, we shall consider theorems to be statements of mathematical interest, statements
                             that are known to be true. Sometimes the term theorem is used only to describe major
                             results that have many and varied consequences. Certain of these consequences that follow
                             rather immediately from a theorem are termed corollaries (as in the case of Corollary 1.1
                             in Section 1.3). In this text, however, we shall not be so particular in our use of the word
                             theorem.
                                 Example 2.52 is a nice starting point to examine the proof of a quantified statement.
                             Unfortunately, a great number of mathematical statements and theorems deal with
                             universes that do not lend themselves to the method of exhaustion. When we must
                             establish or prove a result for all integers, for example, or for all real numbers, we
                             cannot use a case-by-case method like the one in Example 2.52. So what can we do?
                                 We start by considering the following rule.

The Rule of Universal Specification: If an open statement becomes true for all
                               replacements by the members in a given universe, then that open statement is true for
                               each specific individual member in that universe. (A bit more symbolically: if p(x)
                               is an open statement for a given universe, and if ∀x p(x) is true, then p(a) is true for
                               each a in the universe.)

This rule indicates that the truth of an open statement in one particular instance follows
                             (as a special case) from the more general (for the entire universe) truth of that universally
                             quantified open statement. The following examples will show us how to apply this idea.
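
For readers who have access to a proof assistant, the rule can be written down in one line. The following is a minimal sketch in Lean 4 and is our own illustration; the names U (the universe), h (the hypothesis ∀x p(x)), and a (the specific member) are assumptions made for this sketch.

```lean
-- U is the universe, p an open statement on U, and a a specific member of U.
-- From a proof h of ∀ x, p x we obtain p a by applying h to a.
example {U : Type} (p : U → Prop) (h : ∀ x, p x) (a : U) : p a :=
  h a
```

In this formal setting Universal Specification is simply the application of the universally quantified hypothesis to the chosen element.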

EXAMPLE 2.53
               a) For the universe of all people, consider the open statements
                                         m(x):   x is a mathematics professor         c(x):    x has studied calculus.

Now consider the following argument.

All mathematics professors have studied calculus.
                                                     Leona is a mathematics professor.
                                                     Therefore Leona has studied calculus.

   If we let l represent this particular woman (in our universe) named Leona, then we
   can rewrite this argument in symbolic form as

                                              ∀x [m(x) → c(x)]
                                              m(l)
                                              ────────────────
                                              ∴ c(l)

Here the two statements above the line are the premises of the argument, and the
   statement c(l) below the line is its conclusion. This is comparable to what we saw in
   Section 2.3, except now we have a premise that is a universally quantified statement.
   As was the case in Section 2.3, the premises are all assumed to be true and we must
   try to establish that the conclusion is also true under these circumstances. Now, to
   establish the validity of the given argument, we proceed as follows.
      Steps                               Reasons
      1) ∀x [m(x) → c(x)]                 Premise
      2) m(l)                             Premise
      3) m(l) → c(l)                      Step (1) and the Rule of Universal Specification
      4) ∴ c(l)                           Steps (2) and (3) and the Rule of Detachment
       Note that the statements in steps (2) and (3) are not quantified statements. They are
   the types of statements we studied earlier in the chapter. In particular, we can apply
   the rules of inference we learned in Section 2.3 to these two statements to deduce the
   conclusion in step (4).
       We see here that the Rule of Universal Specification enables us to take a universally
   quantified premise and deduce from it an ordinary statement (that is, one that is not
   quantified). This (ordinary) statement, namely m(l) → c(l), is one specific true
   instance of the universally quantified true premise ∀x [m(x) → c(x)].
b) For an example of a more mathematical nature let us consider the universe of all
   triangles in the plane in conjunction with the open statements

                          p(t):         t has two sides of equal length.
                          q(t):         t is an isosceles triangle.
                          r(t):         t has two angles of equal measure.

Let us also focus our attention on one specific triangle with no two angles of equal
   measure. This triangle will be called triangle XYZ and will be designated by c. Then
   we find that the argument

      In triangle XYZ there is no pair of angles of equal
      measure.                                                                          ¬r(c)
      If a triangle has two sides of equal length, then it is
      isosceles.                                                                        ∀t [p(t) → q(t)]
      If a triangle is isosceles, then it has two angles of equal
      measure.                                                                          ∀t [q(t) → r(t)]
      Therefore triangle XYZ has no two sides of equal length.                          ∴ ¬p(c)

is a valid one, as evidenced by the following.

                                 Steps                              Reasons
                                 1)   ∀t [p(t) → q(t)]              Premise
                                 2)   p(c) → q(c)                   Step (1) and the Rule of Universal Specification
                                 3)   ∀t [q(t) → r(t)]              Premise
                                 4)   q(c) → r(c)                   Step (3) and the Rule of Universal Specification
                                 5)   p(c) → r(c)                   Steps (2) and (4) and the Law of the Syllogism
                                 6)   ¬r(c)                         Premise
                                 7)   ∴ ¬p(c)                       Steps (5) and (6) and Modus Tollens
                                Once again we see how the Rule of Universal Specification helps us. Here it has
                             taken the universally quantified statements at steps (1) and (3) and has provided us
                             with the (ordinary) statements at steps (2) and (4), respectively. Then at this point we
                             were able to apply the rules of inference we learned in Section 2.3 (namely, the Law
                             of the Syllogism and Modus Tollens) to derive the conclusion ¬p(c) in step (7).
                          c) Now for one last argument to drive the point home! Here we’ll consider the universe
                             to be made up of the entire student body at a particular college. One specific student,
                             Mary Gusberti, will be designated by m.
                                 For this universe and the open statements

                                 j(x):   x is a junior        s(x):   x is a senior
                                                       p(x):    x is enrolled in a physical education class

we consider the argument:

                                               No junior or senior is enrolled in a physical education class.
                                               Mary Gusberti is enrolled in a physical education class.
                                               Therefore Mary Gusberti is not a senior.

In symbolic form this argument becomes

                                                                   ∀x [(j(x) ∨ s(x)) → ¬p(x)]
                                                                   p(m)
                                                                   ∴ ¬s(m)

Now the following steps (and reasons) establish the validity of this argument.
                                 Steps                                           Reasons
                                 1)   ∀x [(j(x) ∨ s(x)) → ¬p(x)]                 Premise
                                 2)   p(m)                                       Premise
                                 3)   (j(m) ∨ s(m)) → ¬p(m)                      Step (1) and the Rule of Universal
                                                                                    Specification
                                 4)   p(m) → ¬(j(m) ∨ s(m))                      Step (3), (q → t) ⇔ (¬t → ¬q), and the
                                                                                    Law of Double Negation
                                 5)   p(m) → (¬j(m) ∧ ¬s(m))                     Step (4) and DeMorgan’s Law
                                 6)   ¬j(m) ∧ ¬s(m)                              Steps (2) and (5) and the Rule of
                                                                                    Detachment (or Modus Ponens)
                                 7)   ∴ ¬s(m)                                    Step (6) and the Rule of Conjunctive
                                                                                    Simplification
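
Once the Rule of Universal Specification has been applied, each of the three arguments in this example becomes an argument about ordinary statements, and its validity can be confirmed with a truth table. The following Python sketch is our own illustration; the helper names implies and is_valid are hypothetical. Each Boolean variable stands for one specified statement, such as m(l) or p(c).

```python
from itertools import product


def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b


def is_valid(num_vars, premises, conclusion):
    """True when every row that makes all premises true also makes the conclusion true."""
    for values in product([False, True], repeat=num_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False
    return True


# Part (a): the variables stand for m(l) and c(l).
part_a = is_valid(
    2,
    [lambda m, c: implies(m, c), lambda m, c: m],
    lambda m, c: c,
)

# Part (b): the variables stand for p(c), q(c), and r(c).
part_b = is_valid(
    3,
    [lambda p, q, r: not r,
     lambda p, q, r: implies(p, q),
     lambda p, q, r: implies(q, r)],
    lambda p, q, r: not p,
)

# Part (c): the variables stand for j(m), s(m), and p(m).
part_c = is_valid(
    3,
    [lambda j, s, p: implies(j or s, not p), lambda j, s, p: p],
    lambda j, s, p: not s,
)

print(part_a, part_b, part_c)
```

The truth table checks only the steps that follow specification. It does not replace the rule itself, which is what lets us pass from the quantified premise to the statement about one member of the universe.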

In Example 2.53 we have had our first opportunity to apply the Rule of Universal Speci-
                       fication. Using the rule in conjunction with Modus Ponens (or the Rule of Detachment) and
Modus Tollens, we are able to state the following corresponding analogs, each of which
involves a universally quantified premise. In either case we consider a fixed universe that
includes a specific member c and make use of the open statements p(x), q(x) defined for
this universe.

   (1)           ∀x [p(x) → q(x)]                  (2)      ∀x [p(x) → q(x)]
                 p(c)                                       ¬q(c)
                 ∴ q(c)                                     ∴ ¬p(c)
These two valid arguments are presented here for the same reason we presented them for the
rules of inference — Modus Ponens and Modus Tollens — in Section 2.3 (in the discussion
between Examples 2.25 and 2.26). We want to examine some possible errors that may arise
when the results in (1) and (2) are not used correctly.
    Let us start with the universe of all polygons in the plane. Within this universe we shall
let c denote one specific polygon: the quadrilateral EFGH, where the measure of angle
E is 91°. For the open statements

p(x):          x is a square           q(x):      x has four sides,

the following argument is invalid.
   (1’)                                 All squares have four sides.
                                        Quadrilateral EFGH has four sides.
                                        Therefore quadrilateral EFGH is a square.
In symbolic form this argument translates into
   (1”)                                           ∀x [p(x) → q(x)]
                                                  q(c)
                                                  ∴ p(c)

Unfortunately, although the premises are true, the conclusion is false. (For a square has no
angle of measure 91°.) We admit that there might be some confusion between this argument
and the valid one in (1) above. For when we apply the Rule of Universal Specification to
the quantified premise in (1”), in this instance we arrive at the invalid argument

                                                  p(c) → q(c)
                                                  q(c)
                                                  ∴ p(c)
And here, as in Section 2.3, the error in reasoning lies in our attempt to argue by the converse.
   A second invalid argument, from the misuse of argument (2) above, can also be
given, as shown in the following.
   (2’)                                 All squares have four sides.
                                        Quadrilateral EFGH is not a square.
                                        Therefore quadrilateral EFGH does not have four sides.

Translating (2’) into symbolic form results in

   (2”)                                           ∀x [p(x) → q(x)]
                                                  ¬p(c)
                                                  ∴ ¬q(c)

This time the Rule of Universal Specification leads us to

                                                  p(c) → q(c)
                                                  ¬p(c)
                                                  ∴ ¬q(c)

where the fallacy arises because we are trying to argue by the inverse.
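
One way to see why these two arguments fail is to search the truth table for a row in which the premises are true and the conclusion is false. The following Python sketch is our own illustration; the function name counterexamples is hypothetical. Here p stands for p(c), "c is a square," and q stands for q(c), "c has four sides."

```python
from itertools import product


def implies(p, q):
    """Truth value of the implication p -> q."""
    return (not p) or q


def counterexamples(premises, conclusion):
    """List the (p, q) rows where every premise is true and the conclusion is false."""
    return [
        (p, q)
        for p, q in product([False, True], repeat=2)
        if all(prem(p, q) for prem in premises) and not conclusion(p, q)
    ]


# Arguing by the converse: from p -> q and q, conclude p.
converse_fallacy = counterexamples([implies, lambda p, q: q], lambda p, q: p)

# Arguing by the inverse: from p -> q and not p, conclude not q.
inverse_fallacy = counterexamples([implies, lambda p, q: not p], lambda p, q: not q)

# The valid forms (1) and (2) have no such rows.
modus_ponens = counterexamples([implies, lambda p, q: p], lambda p, q: q)
modus_tollens = counterexamples([implies, lambda p, q: not q], lambda p, q: not p)

print(converse_fallacy, inverse_fallacy, modus_ponens, modus_tollens)
```

Both fallacies fail on the same row, p false and q true. That row describes quadrilateral EFGH exactly: it is not a square, yet it has four sides.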

And now let us look back at the three parts of Example 2.53. Although the arguments
                presented there involved premises that were universally quantified statements, there was
                never any instance where a universally quantified statement appeared in the conclusion. We
                now want to remedy this situation, since many theorems in mathematics have the form of
                a universally quantified statement. To do so we need the following considerations.
                    Start with a given universe and the open statement p(x). To establish the truth of the
                statement ∀x p(x), we must establish the truth of p(c) for each member c in the given
                universe. But if the universe has many members or, for example, contains all the positive
                integers, then this exhaustive, if not exhausting, task of validating the truth of each p(c)
                becomes difficult, if not impossible. To get around this situation we shall prove that p(c)
                is true  — but now we do it for the case where c denotes a specific but arbitrarily chosen
                member from the prescribed universe.
                    Should the preceding open statement p(x) have the form q(x) → r(x), for open
                statements q(x) and r(x), then we shall assume the truth of q(c) as an additional premise and try
                to deduce the truth of r(c) by using definitions, axioms, previously proven theorems, and
                the logical principles we have studied. For when q(c) is false, the implication q(c) → r(c)
                is true, regardless of the truth value of r(c).
                   The reason that the element c must be arbitrary (or generic) is to make sure that what
                we do and prove about c is applicable for all the other elements in the universe. If we are
                dealing with the universe of all integers, for example, we cannot choose c in an arbitrary
                manner by selecting c as 4, or by selecting c as an even integer. In general, we cannot
                make any assumptions about our choice for c unless these assumptions are valid for all the
                other elements of the universe. The word generic is applied to the element c here because it
                indicates that our choice (for c) must share all of the common characteristics of the elements
                for the given universe.
                    The principle we have described in the preceding three paragraphs is named and sum-
                marized as follows.

The Rule of Universal Generalization: If an open statement p(x) is proved to be
                  true when x is replaced by any arbitrarily chosen element c from our universe, then the
                  universally quantified statement ∀x p(x) is true. Furthermore, the rule extends beyond
                  a single variable. So if, for example, we have an open statement q(x, y) that is proved
                  to be true when x and y are replaced by arbitrarily chosen elements from the same
                  universe, or their own respective universes, then the universally quantified statement
                  ∀x ∀y q(x, y) [or, ∀x, y q(x, y)] is true. Similar results hold for the cases of three or
                  more variables.

Before we demonstrate the use of this rule in any examples, we wish to look back at
                part (1) of Example 2.43 in Section 2.4. It turns out that the explanation given there to
                establish that

                                   ∀x [p(x) ∧ (q(x) ∧ r(x))] ⇔ ∀x [(p(x) ∧ q(x)) ∧ r(x)]

anticipated what we have now described in detail as the Rules of Universal Specification
                   and Universal Generalization.
                      Now we’ll turn to an example which is strictly symbolic. This example provides an
                   opportunity to apply the Rule of Universal Generalization.

   EXAMPLE 2.54
                   Let p(x), q(x), and r(x) be open statements that are defined for a given universe. We show
                   that the argument

                                                             ∀x [p(x) → q(x)]
                                                             ∀x [q(x) → r(x)]
                                                             ∴ ∀x [p(x) → r(x)]
                   is valid by considering the following.

                            Steps                            Reasons
                            1) ∀x [p(x) → q(x)]              Premise
                            2) p(c) → q(c)                   Step (1) and the Rule of Universal Specification
                            3) ∀x [q(x) → r(x)]              Premise
                            4) q(c) → r(c)                   Step (3) and the Rule of Universal Specification
                            5) p(c) → r(c)                   Steps (2) and (4) and the Law of the Syllogism
                            6) ∴ ∀x [p(x) → r(x)]            Step (5) and the Rule of Universal Generalization
                      Here the element c introduced in steps (2) and (4) is the same specific but arbitrarily
                   chosen element from the universe. Since this element c has no special or distinguishing
                   properties but does share all of the common features of every other element in this universe,
                   we can use the Rule of Universal Generalization to go from step (5) to step (6).
                      And so at last we have dealt with a valid argument where a universally quantified state-
                   ment appears as the conclusion, as well as among the premises.
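
The six steps of Example 2.54 can be mirrored in a proof assistant. The following is a minimal sketch in Lean 4 and is our own illustration; the names U, h1, h2, and hp are assumptions made for this sketch. The tactic intro c plays the role of choosing a specific but arbitrary element c, which is what licenses the universally quantified conclusion.

```lean
example {U : Type} (p q r : U → Prop)
    (h1 : ∀ x, p x → q x)      -- premise, step (1)
    (h2 : ∀ x, q x → r x) :    -- premise, step (3)
    ∀ x, p x → r x := by
  intro c                      -- c is arbitrary, so the result generalizes (step 6)
  intro hp                     -- assume p c as an additional premise
  -- h1 c and h2 c are the specified statements of steps (2) and (4);
  -- chaining them is the Law of the Syllogism of step (5).
  exact h2 c (h1 c hp)
```

Nothing is assumed about c beyond its membership in U, which matches the requirement that c share only the common features of every element of the universe.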

The question that now may be at the back of the reader’s mind is one of practicality.
                   Namely, when would we ever need to use the argument that we validated in Example 2.54?
                   We may find that we have already used it (perhaps, unknowingly) in earlier algebra and
                   geometry courses, as we demonstrate in the following example.

EXAMPLE 2.55
                     a) For the universe of all real numbers, consider the open statements

                                   p(x):       3x - 7 = 20             q(x):      3x = 27               r(x):    x = 9.

The following solution of an algebraic equation parallels the valid argument from
                        Example 2.54.
                        1) If 3x - 7 = 20, then 3x = 27.                                    ∀x [p(x) → q(x)]
                        2) If 3x = 27, then x = 9.                                          ∀x [q(x) → r(x)]
                        3) Therefore, if 3x - 7 = 20, then x = 9.                           ∴ ∀x [p(x) → r(x)]
                     b) When we dealt with the universe of all quadrilaterals in plane geometry, we may have
                        found ourselves relating something like this:

“Since every square is a rectangle, and every rectangle
                                   is a parallelogram, it follows that every square is a parallelogram.”

In this case we are using the argument in Example 2.54 for the open statements

                     p(x):      x is a square        q(x):       x is a rectangle             r(x):      x is a parallelogram.

Now we continue with one more argument to validate.

     EXAMPLE 2.56
                           The steps and reasons needed to establish the validity of the argument

                                                                  ∀x [p(x) ∨ q(x)]
                                                                  ∀x [(¬p(x) ∧ q(x)) → r(x)]
                                                                  ∴ ∀x [¬r(x) → p(x)]
                           are given as follows. [Here the element c is in the universe assigned for the argument. Also,
                           since the conclusion is a universally quantified implication, we can assume ¬r(c) as an
                           additional premise, as was mentioned earlier when the Rule of Universal Generalization
                           was first introduced.]

                                Steps                                             Reasons
                                 1)     ∀x [p(x) ∨ q(x)]                          Premise
                                 2)     p(c) ∨ q(c)                               Step (1) and the Rule of Universal
                                                                                    Specification
                                 3)     ∀x [(¬p(x) ∧ q(x)) → r(x)]                Premise
                                 4)     [¬p(c) ∧ q(c)] → r(c)                     Step (3) and the Rule of Universal
                                                                                    Specification
                                 5)     ¬r(c) → ¬[¬p(c) ∧ q(c)]                   Step (4) and (s → t) ⇔ (¬t → ¬s)
                                 6)     ¬r(c) → [p(c) ∨ ¬q(c)]                    Step (5), DeMorgan’s Law, and the Law of
                                                                                    Double Negation
                                 7)     ¬r(c)                                     Premise (assumed)
                                 8)     p(c) ∨ ¬q(c)                              Steps (7) and (6) and Modus Ponens
                                 9)     [p(c) ∨ q(c)] ∧ [p(c) ∨ ¬q(c)]            Steps (2) and (8) and the Rule of Conjunction
                                10)     p(c) ∨ [q(c) ∧ ¬q(c)]                     Step (9) and the Distributive Law of ∨ over ∧
                                11)     p(c)                                      Step (10), q(c) ∧ ¬q(c) ⇔ F₀, and
                                                                                    p(c) ∨ F₀ ⇔ p(c)
                                12)     ∴ ∀x [¬r(x) → p(x)]                       Steps (7) and (11) and the Rule of Universal
                                                                                    Generalization
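
As a sanity check on the argument of Example 2.56, we can enumerate every possible interpretation of p, q, and r on a small finite universe and look for one that makes both premises true and the conclusion false. The following Python sketch is our own illustration; the universe size SIZE = 3 and the name argument_holds are assumptions made for this sketch. Such a search covers only universes of that one size, so it supports the step-by-step proof but does not replace it.

```python
from itertools import product


def argument_holds(p, q, r):
    """Check the argument for one interpretation.

    p, q, r are tuples of truth values; p[x] is the truth value of p(x)
    for the element x of the universe {0, 1, ..., len(p) - 1}.
    """
    universe = range(len(p))
    premise1 = all(p[x] or q[x] for x in universe)
    # (not p(x) and q(x)) -> r(x), written as "not (antecedent) or consequent"
    premise2 = all((not ((not p[x]) and q[x])) or r[x] for x in universe)
    # not r(x) -> p(x) is equivalent to r(x) or p(x)
    conclusion = all(r[x] or p[x] for x in universe)
    return (not (premise1 and premise2)) or conclusion


SIZE = 3  # assumed size of the finite universe
truth_tables = list(product([False, True], repeat=SIZE))

checked = 0
failures = []
for p, q, r in product(truth_tables, repeat=3):
    checked += 1
    if not argument_holds(p, q, r):
        failures.append((p, q, r))

print("interpretations checked:", checked)
print("interpretations where the argument fails:", len(failures))
```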

Before going on we want to point out a convention that the reader may not like but
                            will have to get used to. It concerns our coverage of the Rules of Universal Specification
                            and Universal Generalization. In the first case we started with the statement ∀x p(x) and
                            then dealt with p(c) for some specific element c in our universe. For the Rule of Universal
                            Generalization we obtained the truth of ∀x p(x) from that of p(c), where c was arbitrarily
                            selected from the universe. Unfortunately, we'll often find ourselves using the letter x
                            instead of c to denote the element — but as long as we understand what is happening we
                            shall soon find the convention easy enough to work with.

The results of Example 2.54 and especially Example 2.56 lead us to believe that we can
                           use universally quantified statements and the rules of inference      — including the Rules of
                           Universal Specification and Universal Generalization — to formalize and prove a variety of
                           arguments and, hopefully, theorems. When we do so it appears that the validation of some
                           rather short arguments requires quite a number of steps, because we have been very metic-
                           ulous and included all the steps and reasons — we left little, if anything, to the imagination.
                           The reader should rest assured that when we start to prove mathematical theorems, we shall
                           present the proofs in the more conventional paragraph style. We shall no longer mention
each and every application of the laws of logic and the other tautologies or the rules of
                     inference. On occasion we may single out a certain rule of inference, but our attention will
                     be primarily directed to the use of definitions, mathematical axioms and principles (other
                     than those we have found in our study of logic), and other (earlier) theorems we have been
                     able to prove. Why then have we been learning all of this material on validating arguments?
                     Because it will provide us with a framework to fall back on whenever we doubt whether
                     a given attempt at a proof really does the job. If in doubt, we have our study of logic to
                     supply us with a somewhat mechanical but strictly objective means to help us decide.
                         And now we present paragraph-style proofs for some results about the integers. (These
                     results may be considered rather obvious to us—in fact, we may find we have already
                     seen and used some of them. But they provide an excellent setting for writing some simple
                     proofs.) The proofs we shall presently introduce use the following ideas, which we now
                     formally define. [The first idea was mentioned earlier in part (b) of Example 2.51.]

Definition 2.8   Let n be an integer. We call n even if n is divisible by 2 — that is, if there exists an integer
                     r so that n = 2r. If n is not even, then we call n odd and find for this case that there exists
                     an integer s where n = 2s + 1.

THEOREM 2.2          For all integers k and l, if k, l are both odd, then k + l is even.
                     Proof: In this proof we shall number the steps so that we may refer to them for some later
                     remarks. After this we shall no longer number the steps.

                        1) Since k and l are odd, we may write k = 2a + 1 and l = 2b + 1, for some integers
                           a, b. This is due to Definition 2.8.
                        2) Then

                                        k + l = (2a + 1) + (2b + 1) = 2(a + b + 1),
                           by virtue of the Commutative and Associative Laws of Addition and the Distributive
                           Law of Multiplication over Addition, all of which hold for integers.
                        3) Since a, b are integers, a + b + 1 = c is an integer; with k + l = 2c, it follows from
                           Definition 2.8 that k + l is even.

Remarks

                        1) In step (1) of the preceding proof k and l were chosen in an arbitrary manner, so we
                           know by the Rule of Universal Generalization that the result obtained is true for all
                           odd integers.
                        2) Although we may not realize it, we are using the Rule of Universal Specification
                           (twice) in step (1). The first argument implicit in this step reads as follows.
                             i) If n is an odd integer, then n = 2r + 1 for some integer r.
                            ii) The integer k is a specific (but arbitrarily chosen) odd integer.
                           iii) Therefore we may write k = 2a + 1 for some (specific) integer a.
                        3) In step (1) we do not have k = 2a + 1 and l = 2a + 1. Since k, l are arbitrarily
                           chosen, it may be the case that k = l, and when this happens we have 2a + 1 =
                           k = l = 2b + 1, from which it follows that a = b. [Since k may not equal l, it follows
                           that (k - 1)/2 = a may not equal b = (l - 1)/2. Thus we should use the different
                           variables a and b.]
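
The three steps of the proof of Theorem 2.2 can be traced numerically. The following Python sketch is our own illustration; the names odd_witness and even_witness_for_sum and the sample range are assumptions made for this sketch. For each pair of odd integers k, l in the sample it recovers a and b from Definition 2.8, forms c = a + b + 1, and confirms that k + l = 2c. A finite sample is a spot check and not a proof; the proof rests on k and l being arbitrary.

```python
def odd_witness(n):
    """Return the integer s with n == 2*s + 1 (Definition 2.8 for an odd n)."""
    if n % 2 == 0:
        raise ValueError(f"{n} is even, so it has no such witness")
    return (n - 1) // 2


def even_witness_for_sum(k, l):
    """Follow steps (1) to (3) of the proof and return c with k + l == 2*c."""
    a = odd_witness(k)      # step (1): k = 2a + 1
    b = odd_witness(l)      # step (1): l = 2b + 1, with its own variable b
    c = a + b + 1           # step (2): k + l = 2(a + b + 1)
    assert k + l == 2 * c   # step (3): k + l is even by Definition 2.8
    return c


odds = range(-25, 26, 2)    # the odd integers -25, -23, ..., 25
witnesses = {(k, l): even_witness_for_sum(k, l) for k in odds for l in odds}

print(len(witnesses), "pairs checked")
print("k = 3, l = 5 gives c =", witnesses[(3, 5)])
```

Using separate variables a and b matters here just as it does in Remark 3: the two witnesses coincide only when k = l.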
                                Before we proceed with another theorem — written in the more conventional manner —
                            let us examine the following.

      EXAMPLE 2.57
                             Consider the following statement for the universe of integers.

                                                   If n is an integer, then n² = n (or, ∀n [n² = n]).

                             Now for n = 0 it is true that n² = 0² = 0 = n. And if n = 1, it is also true that n² = 1² =
                             1 = n. However, we cannot conclude n² = n for every integer n. The Rule of Universal
                             Generalization does not apply here, for we cannot consider the choice of 0 (or 1) as an
                             arbitrarily chosen integer. If n = 2, we have n² = 4 ≠ 2 = n, and this one counterexample
                             is enough to tell us that the given statement is false. However, either replacement, namely
                             n = 0 or n = 1, is enough to establish the truth of the statement:

                                                   For some integer n, n² = n (or, ∃n [n² = n]).
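
The contrast between the two quantifiers in this example can be seen by searching a finite sample of integers. The following Python sketch is our own illustration; the sample range and the variable names are assumptions made for this sketch. A single witness found in the sample settles the existential statement for all integers, and a single counterexample settles the universal statement, even though the sample is only a small part of the universe.

```python
sample = range(-10, 11)  # assumed finite sample of the integers

# Integers in the sample that satisfy n^2 = n.
witnesses = [n for n in sample if n * n == n]

# Integers in the sample that violate n^2 = n.
counterexamples = [n for n in sample if n * n != n]

# One witness is enough to establish "for some integer n, n^2 = n".
exists_statement_true = len(witnesses) > 0

# One counterexample is enough to refute "for all integers n, n^2 = n".
forall_statement_refuted = len(counterexamples) > 0

print("witnesses:", witnesses)
print("first few counterexamples:", counterexamples[:5])
```

The search could never establish a true universal statement over all the integers, which is why the Rule of Universal Generalization requires an argument about an arbitrary element.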

We close — at last — with three results to demonstrate how we shall write proofs through-
                             out the remainder of the text.

THEOREM 2.3                  For all integers k and l, if k and l are both odd, then their product kl is also odd.
                             Proof: Since k and l are both odd, we may write k = 2a + 1 and l = 2b + 1, for some
                             integers a and b, because of Definition 2.8. Then the product kl = (2a + 1)(2b + 1) =
                             4ab + 2a + 2b + 1 = 2(2ab + a + b) + 1, where 2ab + a + b is an integer. Therefore, by
                             Definition 2.8 once again, it follows that kl is odd.
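
The algebraic identity at the heart of this proof is easy to spot-check. The following Python sketch is our own illustration; the name product_witness and the sample range are assumptions made for this sketch. It confirms, for a sample of integers a and b, that (2a + 1)(2b + 1) equals 2(2ab + a + b) + 1.

```python
def product_witness(a, b):
    """For k = 2a + 1 and l = 2b + 1, return the integer t with k*l == 2*t + 1."""
    return 2 * a * b + a + b


sample = range(-20, 21)  # assumed finite sample for a and b
mismatches = [
    (a, b)
    for a in sample
    for b in sample
    if (2 * a + 1) * (2 * b + 1) != 2 * product_witness(a, b) + 1
]

print("pairs where the identity fails:", mismatches)
```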

The preceding proof is an example of a direct proof. In our next example we shall prove
                             a result in three ways: first by a direct argument (or proof), then by the contrapositive
                             method, and finally by the method of proof by contradiction. [For the (method of) proof
                             by contradiction we put in some extra details, since this is our first opportunity to use this
                             technique.] The reader should not assume, however, that every theorem can be so readily
                             proved in a variety of ways.

THEOREM 2.4                  If m is an even integer, then m + 7 is odd.
                             Proof:

                                 1) Since m is even, we have m = 2a for some integer a. Then m + 7 = 2a + 7 =
                                    2a + 6 + 1 = 2(a + 3) + 1. Since a + 3 is an integer, we know that m + 7 is odd.
                                 2) Suppose that m + 7 is not odd, hence even. Then m + 7 = 2b for some integer b
                                    and m = 2b - 7 = 2b - 8 + 1 = 2(b - 4) + 1, where b - 4 is an integer. Hence
                                    m is odd. [The result follows because the statements ∀m [p(m) → q(m)] and
                                    ∀m [¬q(m) → ¬p(m)] are logically equivalent.]

                                 3) Now assume that m is even and that m + 7 is also even. (This assumption is the
                                    negation of what we want to prove.) Then m + 7 even implies that m + 7 = 2c for
                                    some integer c. And, consequently, m = 2c - 7 = 2c - 8 + 1 = 2(c - 4) + 1 with
                                    c - 4 an integer, so m is odd. Now we have our contradiction. We started with m even
                                    and deduced m odd, an impossible situation, since no integer can be both even and
                                    odd. How did we arrive at this dilemma? Simple: we made a mistake! This mistake
                                    is the false assumption (namely, that m + 7 is even) that we wanted to believe at the
                                    start of the proof. Since the assumption is false, its negation is true, and so we now
                                    have m + 7 odd.
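The three arguments for Theorem 2.4 differ only in what is assumed and what is then checked. The following sketch mirrors each one on a finite sample of integers; the names is_even, is_odd, and sample are our own, and the sample is merely a stand-in for the universe of all integers, so the output illustrates the three strategies rather than proving the theorem.

```python
def is_even(n: int) -> bool:
    return n % 2 == 0


def is_odd(n: int) -> bool:
    return n % 2 != 0


sample = range(-50, 51)  # hypothetical finite stand-in for the integers

# 1) Direct: assume p(m) (m is even) and check q(m) (m + 7 is odd).
direct_ok = all(is_odd(m + 7) for m in sample if is_even(m))

# 2) Contrapositive: assume not q(m) (m + 7 is not odd) and check not p(m) (m is odd).
contrapositive_ok = all(is_odd(m) for m in sample if not is_odd(m + 7))

# 3) Contradiction: search for an m with p(m) and not q(m); the proof shows none exists.
offenders = [m for m in sample if is_even(m) and is_even(m + 7)]

print(direct_ok, contrapositive_ok, offenders)  # True True []
```

The empty list in the third check corresponds to the impossibility reached in part (3): no integer in the sample is even while m + 7 is also even.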

The second and third proofs for Theorem 2.4 appear to be somewhat similar. This is
              because the contradiction we derived in the third proof arises from the hypothesis of the
              theorem and its negation. We shall see as we progress (as early as the next chapter) that a
              contradiction may also be obtained by deriving the negation of a known fact —a fact that
              is not the hypothesis of the theorem we are attempting to prove. For now, however, let us
              think about this similarity a little more. Suppose we start with the open statements p(m)
              and q(m), for a prescribed universe, and consider a theorem of the form ∀m [p(m) →
              q(m)]. If we try to prove this result by the contrapositive method, then we shall actually
              prove the logically equivalent statement ∀m [¬q(m) → ¬p(m)]. To do so we assume the
              truth of ¬q(m) (for any specific but arbitrarily chosen m in the universe) and show that
              this leads to the truth of ¬p(m). On the other hand, if we wish to prove the theorem
              ∀m [p(m) → q(m)] by the method of proof by contradiction, then we assume that the
              statement ∀m [p(m) → q(m)] is false. This amounts to the fact that p(m) → q(m) is false
              for at least one replacement for m from the universe; that is, there is some element m
              in the universe for which p(m) is true and q(m) is false [or ¬q(m) is true]. We then use
              the truth of p(m) and ¬q(m) to derive a contradiction. [In the third proof of Theorem 2.4
              we obtained p(m) ∧ ¬p(m).] These two methods can be compared symbolically in the
              following, where m is specific but arbitrarily chosen for the method of contraposition.
                                          Assumption                       Result Derived
                Contraposition            ¬q(m)                            ¬p(m)
                Contradiction             p(m) and ¬q(m)                   F₀
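Both indirect methods rest on facts about the implication that can be read off a four-row truth table. As a minimal sketch (the function name implies and the variable names are our own), the code below checks that p → q agrees with ¬q → ¬p on every row, which justifies contraposition, and that the negation of p → q agrees with p ∧ ¬q, which is exactly the assumption made at the start of a proof by contradiction.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of the implication p -> q."""
    return (not p) or q


rows = list(product([False, True], repeat=2))  # all four truth assignments

# Contraposition: p -> q and (not q) -> (not p) agree on every row.
contrapositive_equivalent = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# Contradiction: "p -> q is false" means exactly "p is true and q is false".
negation_is_p_and_not_q = all(
    (not implies(p, q)) == (p and not q) for p, q in rows
)

print(contrapositive_equivalent, negation_is_p_and_not_q)  # True True
```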

In general, when we are able to establish a theorem by either a direct proof or an indirect
              proof, the direct approach is less cumbersome than an indirect approach. (This certainly
              appears to be the case for the three proofs presented for Theorem 2.4.) When we do not
              have any prescribed directions given for attempting the proof of a certain theorem, we might
              start with a direct approach. If we succeed, then all is well. If not, then we might consider
              trying to find a counterexample to what we thought was a theorem. Should our search for
              a counterexample fail, then we might consider an indirect approach. We might prove the
              contrapositive of the theorem, or obtain a contradiction, as we did in the third proof of
              Theorem 2.4, by assuming the truth of the hypothesis and the truth of the negation of the
              conclusion (for some element m in the universe) in the given theorem.

We close this section with one more indirect proof by the method of contraposition.

THEOREM 2.5   For all positive real numbers x and y, if the product xy exceeds 25, then x > 5 or y > 5.
              Proof: Consider the negation of the conclusion; that is, suppose that 0 < x ≤ 5 and 0 <
              y ≤ 5. Under these circumstances we find that 0 = 0 · 0 < x · y ≤ 5 · 5 = 25, so the product
xy does not exceed 25. (This indirect method of proof now establishes the given statement,
                                 since we know that an implication is logically equivalent to its contrapositive.)
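A numerical spot check makes the role of the boundary value 5 visible. The sketch below is an illustration only: it assumes a hypothetical grid of rational values 1/4, 1/2, ..., 10 for x and y (exact arithmetic through Fraction avoids rounding questions), whereas the theorem concerns all positive real numbers. It tests the statement of Theorem 2.5 and its contrapositive on the grid, and shows that x = y = 5 gives a product that equals, but does not exceed, 25.

```python
from fractions import Fraction

# Hypothetical finite grid of positive rationals: 1/4, 1/2, ..., 10.
grid = [Fraction(k, 4) for k in range(1, 41)]

# Statement of the theorem: xy > 25 implies x > 5 or y > 5.
original_ok = all(
    x > 5 or y > 5 for x in grid for y in grid if x * y > 25
)

# Contrapositive: x <= 5 and y <= 5 implies xy <= 25.
contrapositive_ok = all(
    x * y <= 25 for x in grid for y in grid if x <= 5 and y <= 5
)

# Boundary case x = y = 5: the product equals 25 and so does not exceed 25.
boundary = Fraction(5) * Fraction(5)

print(original_ok, contrapositive_ok, boundary)  # True True 25
```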

EXERCISES 2.5

 1. In Example 2.52 why did we stop at 26 and not at 28?
 2. In Example 2.52 why didn’t we include the odd integers between 2 and 26?
 3. Use the method of exhaustion to show that every even integer between 30 and 58
    (including 30 and 58) can be written as a sum of at most three perfect squares.
 4. Let n be a positive integer greater than 1. We call n prime if the only positive
    integers that (exactly) divide n are 1 and n itself. For example, the first seven
    primes are 2, 3, 5, 7, 11, 13, and 17. (We shall learn more about primes in Chapter 4.)
    Use the method of exhaustion to show that every integer in the universe
    4, 6, 8, ..., 36, 38 can be written as the sum of two primes.
 5. For each of the following (universes and) pairs of statements, use the Rule of
    Universal Specification, in conjunction with Modus Ponens and Modus Tollens, in
    order to fill in the blank line so that a valid argument results.
      a) [The universe comprises all real numbers.]
         All integers are rational numbers.
         The real number π is not a rational number.
         ∴ ____________________
      b) [The universe comprises the present population of the United States.]
         All librarians know the Library of Congress Classification System.
         ____________________
         ∴ Margaret knows the Library of Congress Classification System.
      c) [The same universe as in part (b).]
         ____________________
         Sondra is an administrative director.
         ∴ Sondra knows how to delegate authority.
      d) [The universe consists of all quadrilaterals in the plane.]
         All rectangles are equiangular.
         ____________________
         ∴ Quadrilateral MNPQ is not a rectangle.
 6. Determine which of the following arguments are valid and which are invalid. Provide
    an explanation for each answer. (Let the universe consist of all people presently
    residing in the United States.)
      a) All mail carriers carry a can of mace.
         Mrs. Bacon is a mail carrier.
         Therefore Mrs. Bacon carries a can of mace.
      b) All law-abiding citizens pay their taxes.
         Mr. Pelosi pays his taxes.
         Therefore Mr. Pelosi is a law-abiding citizen.
      c) All people who are concerned about the environment recycle their plastic
         containers.
         Margarita is not concerned about the environment.
         Therefore Margarita does not recycle her plastic containers.
 7. For a prescribed universe and any open statements p(x), q(x) in the variable x,
    prove that
      a) ∃x [p(x) ∨ q(x)] ⟺ ∃x p(x) ∨ ∃x q(x)
      b) ∀x [p(x) ∧ q(x)] ⟺ ∀x p(x) ∧ ∀x q(x)
 8. a) Let p(x), q(x) be open statements in the variable x, with a given universe.
       Prove that

           ∀x p(x) ∨ ∀x q(x) ⟹ ∀x [p(x) ∨ q(x)].

       [That is, prove that when the statement ∀x p(x) ∨ ∀x q(x) is true, then the
       statement ∀x [p(x) ∨ q(x)] is true.]
    b) Find a counterexample for the converse in part (a). That is, find open statements
       p(x), q(x) and a universe such that ∀x [p(x) ∨ q(x)] is true, while
       ∀x p(x) ∨ ∀x q(x) is false.
 9. Provide the reasons for the steps verifying the following argument. (Here a denotes a
    specific but arbitrarily chosen element from the given universe.)

           ∀x [p(x) → (q(x) ∧ r(x))]
           ∀x [p(x) ∧ s(x)]
           ∴ ∀x [r(x) ∧ s(x)]

       Steps                                   Reasons
        1) ∀x [p(x) → (q(x) ∧ r(x))]
        2) ∀x [p(x) ∧ s(x)]
        3) p(a) → (q(a) ∧ r(a))
        4) p(a) ∧ s(a)
        5) p(a)
        6) q(a) ∧ r(a)
        7) r(a)
        8) s(a)
        9) r(a) ∧ s(a)
       10) ∴ ∀x [r(x) ∧ s(x)]
10. Provide the missing reasons for the steps verifying the following argument:

           ∀x [p(x) ∨ q(x)]
           ∃x ¬p(x)
           ∀x [¬q(x) ∨ r(x)]
           ∀x [s(x) → ¬r(x)]
           ∴ ∃x ¬s(x)

       Steps                        Reasons
        1) ∀x [p(x) ∨ q(x)]         Premise
        2) ∃x ¬p(x)                 Premise
        3) ¬p(a)                    Step (2) and the definition of the truth for ∃x ¬p(x).
                                    [Here a is an element (replacement) from the universe
                                    for which ¬p(x) is true.] The reason for this step is
                                    also referred to as the Rule of Existential Specification.
        4) p(a) ∨ q(a)
        5) q(a)
        6) ∀x [¬q(x) ∨ r(x)]
        7) ¬q(a) ∨ r(a)
        8) q(a) → r(a)
        9) r(a)
       10) ∀x [s(x) → ¬r(x)]
       11) s(a) → ¬r(a)
       12) r(a) → ¬s(a)
       13) ¬s(a)
       14) ∴ ∃x ¬s(x)               Step (13) and the definition of the truth for ∃x ¬s(x).
                                    The reason for this step is also referred to as the
                                    Rule of Existential Generalization.
11. Write the following argument in symbolic form. Then either verify the validity of the
    argument or explain why it is invalid. [Assume here that the universe comprises all
    adults (18 or over) who are presently residing in the city of Las Cruces (in New
    Mexico). Two of these individuals are Roxe and Imogene.]
       All credit union employees must know COBOL. All credit union employees who write
    loan applications must know Excel.† Roxe works for the credit union, but she doesn’t
    know Excel. Imogene knows Excel but doesn’t know COBOL. Therefore Roxe doesn’t write
    loan applications and Imogene doesn’t work for the credit union.
12. Give a direct proof (as in Theorem 2.3) for each of the following.
      a) For all integers k and l, if k, l are both even, then k + l is even.
      b) For all integers k and l, if k, l are both even, then kl is even.
13. For each of the following statements provide an indirect proof [as in part (2) of
    Theorem 2.4] by stating and proving the contrapositive of the given statement.
      a) For all integers k and l, if kl is odd, then k, l are both odd.
      b) For all integers k and l, if k + l is even, then k and l are both even or both odd.
14. Prove that for every integer n, if n is odd, then n² is odd.
15. Provide a proof by contradiction for the following: For every integer n, if n² is
    odd, then n is odd.
16. Prove that for every integer n, n² is even if and only if n is even.
17. Prove the following result in three ways (as in Theorem 2.4): If n is an odd integer,
    then n + 11 is even.
18. Let m, n be two positive integers. Prove that if m, n are perfect squares, then the
    product mn is also a perfect square.
19. Prove or disprove: If m, n are positive integers and m, n are perfect squares, then
    m + n is a perfect square.
20. Prove or disprove: There exist positive integers m, n, where m, n, and m + n are all
    perfect squares.
21. Prove that for all real numbers x and y, if x + y > 100, then x > 50 or y > 50.
22. Prove that for every integer n, 4n + 7 is odd.
23. Let n be an integer. Prove that n is odd if and only if 7n + 8 is odd.
24. Let n be an integer. Prove that n is even if and only if 31n + 12 is even.

†The Excel spreadsheet is a product of Microsoft, Inc.

2.6
          Summary and Historical Review
                                 This second chapter has introduced some of the fundamentals of logic, in particular, some
                                 of the rules of inference and methods of proof necessary for establishing mathematical
                                 theorems.
                                    The first systematic study of logical reasoning is found in the work of the Greek philosopher
                                 Aristotle (384-322 B.C.). In his treatises on logic Aristotle presented a collection of
                                 principles for deductive reasoning. These principles were designed to provide a foundation
                                 for the study of all branches of knowledge. In a modified form, this type of logic was taught
                                 up to and throughout the Middle Ages.

Aristotle (384-322 B.C.)

The German mathematician Gottfried Wilhelm Leibniz (1646-1716) is often considered
                      the first scholar who seriously pursued the development of symbolic logic as a universal
                      scientific language. This he professed in his essay De Arte Combinatoria, published in 1666.
                      His research in the area of symbolic logic, carried out from 1679 to 1690, gave considerable
                      impetus to the creation of this mathematical discipline.
                          Following the work by Leibniz, little change took place until the nineteenth century, when
                      the English mathematician George Boole (1815-1864) created a system of mathematical
                      logic that he introduced in 1847 in the pamphlet The Mathematical Analysis of Logic,
                      Being an Essay Towards a Calculus of Deductive Reasoning. In the same year, Boole’s
                      countryman Augustus DeMorgan (1806-1871) published Formal Logic; or, the Calculus
                      of Inference, Necessary and Probable. In some ways this treatise extended Boole’s work
                      considerably. Then, in 1854, Boole detailed his ideas and further research in the notable
                      work An Investigation of the Laws of Thought, on Which Are Founded the Mathematical
                      Theories of Logic and Probabilities. The American logician Charles Sanders Peirce (1839-
                      1914), who was also an engineer and philosopher, introduced the formal concept of the
                      quantifier into the study of symbolic logic.

George Boole (1815-1864)

   The concepts formulated by Boole were thoroughly examined in the work of another
German scholar, Ernst Schröder (1841-1902). These results are known collectively as
Vorlesungen über die Algebra der Logik; they were published in the period from 1890 to
1895.
   Further developments in the area saw an even more modern approach evolve in the work
of the German logician Gottlob Frege (1848-1925) between 1879 and 1903. This work
significantly influenced the monumental Principia Mathematica (1910-1913) by England’s
Alfred North Whitehead (1861-1947) and Bertrand Russell (1872-1970). Here what was
begun by Boole was finally brought to fruition. Thanks to this remarkable effort and the work
of other twentieth-century mathematicians and logicians, in particular the comprehensive
Grundlagen der Mathematik (1934-1939) of David Hilbert (1862-1943) and Paul Bernays
(1888-1977), the more polished techniques of contemporary mathematical logic are now
available.
    Several sections of this chapter stressed the importance of proof. In mathematics a proof
bestows authority on what might otherwise be dismissed as mere opinion. Proof embodies
the power and majesty of pure reason. But even more than that, it suggests new mathematical
ideas. Our concept of proof goes hand in hand with the notion of a theorem — a mathematical
statement the truth of which has been confirmed by means of a logical argument, namely, a
proof. For those who feel they can ignore the importance of logic and the rules of inference,
we submit the following words of wisdom spoken by Achilles in Lewis Carroll’s What the
Tortoise Said to Achilles: “Then Logic would take you by the throat, and force you to do
it!”
     Comparable coverage of the material presented in this chapter can be found in Chapters
2 and 11 of the text by K. A. Ross and C. R. B. Wright [11]. The first two chapters of the
text by S. S. Epp [3] provide many examples and some computer science applications for
those who wish to see more on logic and proof at a very readable introductory level. The
text by H. Delong [2] provides an historical survey of mathematical logic, together with an
examination of the nature of its results and the philosophical consequences of these results.
This is also the case with the texts by H. Eves and C. V. Newsom [4], R. R. Stoll [13], and
R. L. Wilder [14], wherein the relationships among logic, proof, and set theory (the topic
of our next chapter) are examined in their roles in the foundations of mathematics.
   For more on resolution (introduced in Exercise 13 of Section 2.3) and automated rea-
soning, the reader should examine the texts by J. H. Gallier [6] and M. R. Genesereth and
N. J. Nilsson [7].
   The text by E. Mendelson [9] provides an interesting intermediate introduction for those
readers who wish to pursue additional topics in mathematical logic. A somewhat more
advanced treatment is given in the work of S. C. Kleene [8]. Accounts of other work in
mathematical logic are presented in the compendium edited by J. Barwise [1].
   The objective of the works by D. Fendel and D. Resek [5] and R. P. Morash [10] is to
prepare the student with a calculus background for the more theoretical mathematics found
in abstract algebra and real analysis. Each of these texts provides an excellent introduction
to the basic methods of proof. The unique text by D. Solow [12] is devoted entirely to
introducing the reader who has a background in high school mathematics to the primary
techniques used in writing mathematical proofs.

REFERENCES
 1. Barwise, Jon (editor). Handbook of Mathematical Logic. Amsterdam: North Holland, 1977.
 2. Delong, Howard. A Profile of Mathematical Logic. Reading, Mass.: Addison-Wesley, 1970.
 3. Epp, Susanna S. Discrete Mathematics with Applications, 2nd ed. Boston, Mass.: PWS
    Publishing Co., 1995.
 4. Eves, Howard, and Newsom, Carroll V. An Introduction to the Foundations and Fundamental
    Concepts of Mathematics, rev. ed. New York: Holt, 1965.
 5. Fendel, Daniel, and Resek, Diane. Foundations of Higher Mathematics. Reading, Mass.:
    Addison-Wesley, 1990.
 6. Gallier, Jean H. Logic for Computer Science. New York: Harper & Row, 1986.
 7. Genesereth, Michael R., and Nilsson, Nils J. Logical Foundations of Artificial Intelligence.
    Los Altos, Calif.: Morgan Kaufmann, 1987.
 8. Kleene, Stephen C. Mathematical Logic. New York: Wiley, 1967.
 9. Mendelson, Elliott. Introduction to Mathematical Logic, 3rd ed. Monterey, Calif.: Wadsworth
    and Brooks/Cole, 1987.
10. Morash, Ronald P. Bridge to Abstract Mathematics: Mathematical Proof and Structures. New
    York: Random House/Birkhäuser, 1987.
11. Ross, Kenneth A., and Wright, Charles R. B. Discrete Mathematics, 4th ed. Upper Saddle
    River, N.J.: Prentice-Hall, 1999.
12. Solow, Daniel. How to Read and Do Proofs, 3rd ed. New York: Wiley, 2001.
13. Stoll, Robert R. Set Theory and Logic. San Francisco: Freeman, 1963.
14. Wilder, Raymond L. Introduction to the Foundations of Mathematics, 2nd ed. New York:
    Wiley, 1965.

SUPPLEMENTARY EXERCISES

 1. Construct the truth table for

           p ↔ [(q ∧ r) → ¬(s ∨ r)].

 2. a) Construct the truth table for

           (p → q) ∧ (¬p → r).

    b) Translate the statement in part (a) into words such that the word “not” does not
       appear in the translation.
 3. Let p, q, and r denote primitive statements. Prove or disprove (provide a
    counterexample for) each of the following.
      a) [p ↔ (q ↔ r)] ⟺ [(p ↔ q) ↔ r]
      b) [p → (q → r)] ⟺ [(p → q) → r]
 4. Express the negation of the statement p ↔ q in terms of the connectives ∧ and ∨.
 5. Write the following statement as an implication in two ways, each in the if-then
    form: Either Kaylyn practices her piano lessons or she will not go to the movies.
 6. Let p, q, r denote primitive statements. Write the converse, inverse, and
    contrapositive of
      a) p → (q ∧ r)                b) (p ∨ q) → r
 7. a) For primitive statements p, q, find the dual of the statement
       (¬p ∧ ¬q) ∨ (T₀ ∧ p) ∨ p.
    b) Use the laws of logic to show that your result from part (a) is logically
       equivalent to p ∧ ¬q.
 8. Let p, q, r, and s be primitive statements. Write the dual of each of the following
    compound statements.
      a) (p ∨ ¬q) ∧ (¬r ∨ s)
      b) p → (q ∧ ¬r ∧ s)
      c) [(p ∨ T₀) ∧ (q ∨ F₀)] ∨ [r ∧ s ∧ T₀]
 9. For each of the following, fill in the blank with the word converse, inverse, or
    contrapositive so that the result is a true statement.
      a) The converse of the inverse of p → q is the __________ of p → q.
      b) The converse of the inverse of p → q is the __________ of q → p.
      c) The inverse of the converse of p → q is the __________ of p → q.
      d) The inverse of the converse of p → q is the __________ of q → p.
      e) The inverse of the contrapositive of p → q is the __________ of p → q.
10. Establish the validity of the argument

           [(p → q) ∧ [(q ∧ r) → s] ∧ r] → (p → s).

11. Prove or disprove each of the following, where p, q, and r are any statements.
      a) [(p ⊻ q) ⊻ r] ⟺ [p ⊻ (q ⊻ r)]
      b) [p ⊻ (q → r)] ⟺ [(p ⊻ q) → (p ⊻ r)]
12. Write the following argument in symbolic form. Then either establish the validity of
    the argument or provide a counterexample to show that it is invalid.

       If it is cool this Friday, then Craig will wear his suede jacket if the pockets
       are mended. The forecast for Friday calls for cool weather, but the pockets have
       not been mended. Therefore Craig won't be wearing his suede jacket this Friday.

13. Consider the open statement

           p(x, y): y - x = y + x²

    where the universe for each of the variables x, y comprises all integers. Determine
    the truth value for each of the following statements.
      a) p(0, 0)                       b) p(1, 1)
      c) p(0, 1)                       d) ∀y p(0, y)
      e) ∃y p(1, y)                    f) ∀x ∃y p(x, y)
      g) ∃y ∀x p(x, y)                 h) ∀y ∃x p(x, y)
14. Determine whether each of the following statements is true or false. If false,
    provide a counterexample. The universe comprises all integers.
      a) ∀x ∃y ∃z (x = 7y + 5z)
      b) ∀x ∃y ∃z (x = 4y + 6z)
15. Suppose two opposite corner squares are removed from an 8 × 8 chessboard, as in
    part (a) of Fig. 2.4. Can the remaining 62 squares be covered by 31 dominos
    (rectangles consisting of two adjacent squares, one white and the other blue, as
    shown in the figure)? (When a domino is placed on the chessboard, a square of a given
    color need not be placed on a square of the same color.)

    [Figure 2.4, parts (a) and (b)]

16. In part (b) of Fig. 2.4 we have an 8 × 8 chessboard where two squares (one blue and
    one white) have been removed from each of two opposite corners. Can the remaining 60
    squares be covered by 15 T-shaped figures (of three white squares and one blue one,
    or three blue squares and one white one, as shown in the figure)? [The reader may
    wish to verify that a 4 × 4 chessboard (of all 16 squares) can be covered by four of
    the T-shaped figures. Then it follows that an 8 × 8 chessboard (of all 64 squares)
    can be covered by 16 of the T-shaped figures.]
       Set Theory

Underlying the mathematics we study in algebra, geometry, combinatorics, probability,
                  and almost every other area of contemporary mathematics is the notion of a set.
                  Very often this concept provides an underlying structure for a concise formulation of the
                  mathematical topic being investigated. Consequently, many books on mathematics have
                  an introductory chapter on set theory or mention in an appendix those parts of the theory
                  that are needed in the text. Here it may appear that, in opening the book with a chapter
                  on fundamentals of counting, we have neglected set theory. Actually we have relied on
                  intuition; each time the word collection appeared in Chapter 1, we were dealing with a set.
                  Also, in Sections 2.4 and 2.5, the notion of a set (if not the term itself) was invoked when
                  we dealt with the universe (of discourse) for an open statement.
                      Trying to define a set is rather difficult and often results in the circular use of such
                  synonyms as “class,” “collection,” and “aggregate.” When we first began the study of
                  geometry, we used our intuition to grasp the ideas of point, line, and incidence. Then we
                  started to define new terms and prove theorems, relying on these intuitive notions along
                  with certain axioms and postulates. In our study of set theory, intuition is invoked once
                  again, this time for the comparable ideas of element, set, and membership.
                      We shall find that the ideas we developed in Chapter 2 on logic are closely tied to set
                  theory. Furthermore, many of the proofs we shall study in this chapter draw on the ideas
                  developed in Chapter 2.

3.1  Sets and Subsets
                  We have a “gut feeling” that a set should be a well-defined collection of objects. These
                  objects are called elements and are said to be members of the set.
                      The adjective well-defined implies that for any element we care to consider, we are able
                  to determine whether it is in the set under scrutiny. Consequently, we avoid dealing with
                  sets that depend on opinion, such as the set of outstanding major league pitchers for the
                   1990s.
                      We use capital letters, such as A, B, C, ..., to represent sets and lowercase letters to
                  represent elements. For a set A we write x ∈ A if x is an element of A; y ∉ A indicates that
                  y is not a member of A.

EXAMPLE 3.1
A set can be designated by listing its elements within set braces. For example, if A is the set
consisting of the first five positive integers, then we write A = {1, 2, 3, 4, 5}. Here 2 ∈ A
but 6 ∉ A.

124          Chapter 3 Set Theory

Another standard notation for this set provides us with A = {x | x is an integer and 1 ≤ x ≤ 5}.
Here the vertical line | within the set braces is read “such that.” The symbols {x | . . .}
are read “the set of all x such that . . . .” The properties following | help us determine the
elements of the set that is being described.
    Beware! The notation {x | 1 ≤ x ≤ 5} is not an adequate description of the set A unless
we have agreed in advance that the elements we are considering are integers. When such an
agreement is adopted, we say that we are specifying a universe, or universe of discourse,
which is usually denoted by U. We then select only elements from U to form our sets. In this
particular problem, if U denotes the set of all integers or the set of all positive integers, then
{x | 1 ≤ x ≤ 5} adequately describes A. If U is the set of all real numbers, then {x | 1 ≤ x ≤ 5}
would contain all of the real numbers between 1 and 5 inclusive; if U consists of only even
integers, then the only members of {x | 1 ≤ x ≤ 5} would be 2 and 4.
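The dependence on the universe can be tried out with a short Python sketch. The names `integers`, `evens`, and `between_1_and_5`, and the cutoff of -10 to 10, are assumptions made only so that each universe can be listed; the real numbers cannot be enumerated this way, so that case is left out.

```python
# Finite stand-ins for two universes (an assumption made for illustration).
integers = set(range(-10, 11))                # the integers from -10 to 10
evens = {x for x in integers if x % 2 == 0}   # the even integers among them

def between_1_and_5(universe):
    """Return {x | 1 <= x <= 5}, with x drawn only from the given universe."""
    return {x for x in universe if 1 <= x <= 5}

set_over_integers = between_1_and_5(integers)
set_over_evens = between_1_and_5(evens)

print(sorted(set_over_integers))   # [1, 2, 3, 4, 5]
print(sorted(set_over_evens))      # [2, 4]
```

The same defining property yields different sets because the filter only ever sees elements of the chosen universe.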

EXAMPLE 3.2
For U = {1, 2, 3, ...}, the set of positive integers, we consider the following sets. At the
same time we introduce various notations one may use to describe such sets.

  a) A = {1, 4, 9, ..., 64, 81} = {x² | x ∈ U, x² < 100} = {x² | x ∈ U ∧ x² < 100}.
  b) B = {1, 4, 9, 16} = {y² | y ∈ U, y² < 20} = {y² | y ∈ U, y² < 23}
       = {y² | y ∈ U ∧ y² ≤ 16}.
  c) C = {2, 4, 6, 8, ...} = {2k | k ∈ U}.
                                  Sets A and B are examples of finite sets, whereas C is an infinite set. When dealing with
                              sets like A or C, we can either describe the sets in terms of properties the elements must
                              satisfy or list enough elements to indicate what is, we hope, an obvious pattern. For any
                              finite set A, |A| denotes the number of elements in A and is referred to as the cardinality,
                              or size, of A. In this example we find that |A| = 9 and |B| = 4.
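The set-builder descriptions in this example translate almost literally into Python set comprehensions. This is only a sketch: since U is infinite, the code scans the finite prefix 1, ..., 99 (an assumed bound, named `prefix_of_U` here), which is enough because x² < 100 already forces x ≤ 9.

```python
# U = {1, 2, 3, ...} is infinite; scan an assumed finite prefix of it.
prefix_of_U = range(1, 100)

A = {x * x for x in prefix_of_U if x * x < 100}
B = {y * y for y in prefix_of_U if y * y < 20}
B_alt = {y * y for y in prefix_of_U if y * y <= 16}
C_prefix = {2 * k for k in range(1, 5)}   # only the first four elements of C

print(sorted(A), len(A))   # the elements of A and the cardinality |A|
print(sorted(B), len(B))   # the elements of B and the cardinality |B|
print(B == B_alt)          # both descriptions give the same set
```

Here `len` plays the role of cardinality for a finite set.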
                                  Here the sets B and A are such that every element of B is also an element of A. This
                              important relationship occurs throughout set theory and its applications, and it leads to the
                              following definition.

Definition 3.1          If C, D are sets from a universe U, we say that C is a subset of D and write C ⊆ D, or
                              D ⊇ C, if every element of C is an element of D. If, in addition, D contains an element that
                              is not in C, then C is called a proper subset of D, and this is denoted by C ⊂ D or D ⊃ C.

Note that for all sets C, D from a universe U, if C ⊆ D, then

    ∀x [x ∈ C ⇒ x ∈ D],

and if ∀x [x ∈ C ⇒ x ∈ D], then C ⊆ D.
    Here the universal quantifier ∀x indicates that we should have to consider every element
x in the prescribed universe U. However, for each replacement c (from U) where the
statement c ∈ C is false, we know that the implication c ∈ C → c ∈ D is true, regardless of
the truth value of the statement c ∈ D. Consequently, we actually need to consider only those
replacements c′ (from U) where the statement c′ ∈ C is true. If for each such c′ we find that
the statement c′ ∈ D is also true, then we know that ∀x [x ∈ C ⇒ x ∈ D] or, equivalently,
C ⊆ D.
    Also, we find that for all subsets C, D of U,

    C ⊂ D ⇒ C ⊆ D,

and when C, D are finite,

    C ⊆ D ⇒ |C| ≤ |D|,   and   C ⊂ D ⇒ |C| < |D|.

However, for U = {1, 2, 3, 4, 5}, C = {1, 2}, and D = {1, 2}, we see that C is a subset of
D (that is, C ⊆ D), but it is not a proper subset of D (or, C ⊄ D). So, in general, we do not
find that C ⊆ D ⇒ C ⊂ D.
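For finite sets these relations can be checked directly. In Python the operators `<=` and `<` on the built-in `set` type test subset and proper subset, respectively; the extra set `E` below is an assumption added for contrast and does not appear in the text.

```python
C = {1, 2}
D = {1, 2}
E = {1, 2, 3}               # hypothetical extra set, for contrast

subset_CD = C <= D          # C is a subset of D
proper_CD = C < D           # fails: D has no element outside C
proper_CE = C < E           # holds: 3 is in E but not in C

print(subset_CD, proper_CD, proper_CE)
print(len(C) <= len(D), len(C) < len(E))   # the matching cardinality facts
```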

EXAMPLE 3.3
In an early version of ANSI (American National Standards Institute) FORTRAN, no distinction
was made between uppercase and lowercase letters, and a variable name consisted of a
single letter followed by at most five characters (letters or digits). If U denotes the set of all
such variable names, then by the rules of sum and product,
$|U| = 26 + 26(36) + 26(36)^2 + \cdots + 26(36)^5 = 26 \sum_{i=0}^{5} 36^i = 1{,}617{,}038{,}306$.
Thus, U is large, but still finite. An integer variable in this programming language had to
start with one of the letters I, J, K, L, M, N. So if A denotes the subset of all integer variables
in this early version of ANSI FORTRAN, then
$|A| = 6 + 6(36) + 6(36)^2 + \cdots + 6(36)^5 = 6 \sum_{i=0}^{5} 36^i = 373{,}162{,}686$.
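The two totals can be recomputed with a few lines of Python. The helper name `count_names` and its parameters are assumptions introduced for this sketch; the rule of product gives each term and the rule of sum adds the terms over the possible name lengths.

```python
LETTERS = 26            # choices for an unrestricted first character
ALPHANUMERIC = 36       # 26 letters + 10 digits for each later character
INTEGER_INITIALS = 6    # I, J, K, L, M, N

def count_names(first_choices, max_extra=5):
    """Count names with one leading letter and 0..max_extra further characters."""
    return sum(first_choices * ALPHANUMERIC ** i for i in range(max_extra + 1))

all_names = count_names(LETTERS)
integer_names = count_names(INTEGER_INITIALS)

print(all_names)       # 1617038306
print(integer_names)   # 373162686
```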

The subset concept may now be used to develop the idea of set equality. First we consider
                 the following example.

EXAMPLE 3.4
For the universe U = {1, 2, 3, 4, 5}, consider the set A = {1, 2}. If B = {x | x² ∈ U}, then
the members of B are 1, 2. Here A and B contain the same elements (and no other
element(s)), leading us to feel that the sets A and B are equal.
    However, it is also true here that A ⊆ B and B ⊆ A, and we prefer to formally define
the idea of set equality by using these subset relations.

Definition 3.2   For a given universe U, the sets C and D (taken from U) are said to be equal, and we write
                 C = D, when C ⊆ D and D ⊆ C.

From these ideas on set equality, we find that neither order nor repetition is relevant for a
                 general set. Consequently, we find, for example, that {1, 2, 3} = {3, 1, 2} = {2, 2, 1, 3} =
                 {1, 2, 1, 3, 1}.
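Definition 3.2 can be mirrored in code by testing the two subset relations. The function name `sets_equal` is an assumption made for this sketch; Python's own `==` on sets gives the same answer, and the built-in `set` type likewise ignores order and repetition.

```python
def sets_equal(C, D):
    """Definition 3.2: C = D when C is a subset of D and D is a subset of C."""
    return C <= D and D <= C

# Example 3.4: B = {x | x^2 is in U} for U = {1, 2, 3, 4, 5}.
U = {1, 2, 3, 4, 5}
A = {1, 2}
B = {x for x in U if x * x in U}

same_AB = sets_equal(A, B)
# Order and repetition do not matter when the listed elements form a set.
same_listing = sets_equal({1, 2, 3}, set([2, 2, 1, 3])) and \
               sets_equal({3, 1, 2}, set([1, 2, 1, 3, 1]))

print(same_AB, same_listing)
```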

Now that we have defined the concepts of subset and set equality, we shall use the
                 quantifiers of Section 2.4 to examine the negations of these ideas.
                    For a given universe U, let A, B be sets taken from U. Then we may write

                        A ⊆ B ⇔ ∀x [x ∈ A ⇒ x ∈ B].

From the (quantified) definition of A ⊆ B, we find that

    A ⊈ B (that is, A is not a subset of B)
        ⇔ ¬∀x [x ∈ A ⇒ x ∈ B]
        ⇔ ∃x ¬[x ∈ A ⇒ x ∈ B]
        ⇔ ∃x ¬[¬(x ∈ A) ∨ x ∈ B]
        ⇔ ∃x [x ∈ A ∧ ¬(x ∈ B)]
        ⇔ ∃x [x ∈ A ∧ x ∉ B].

Hence A ⊈ B if there is at least one element x in the universe where x is a member of A
but x is not a member of B.
    In a similar way, because A = B ⇔ A ⊆ B ∧ B ⊆ A, then

    A ≠ B ⇔ ¬(A ⊆ B ∧ B ⊆ A)
          ⇔ ¬(A ⊆ B) ∨ ¬(B ⊆ A) ⇔ A ⊈ B ∨ B ⊈ A.

Therefore two sets A and B are not equal if and only if (1) there exists at least one element
x in U where x ∈ A but x ∉ B or (2) there exists at least one element y in U where y ∈ B
and y ∉ A, or perhaps both (1) and (2) occur.
    We also note that for any sets C, D ⊆ U (that is, C ⊆ U and D ⊆ U),

    C ⊂ D ⇔ C ⊆ D ∧ C ≠ D.
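The existential form of these negations suggests looking for witnesses. The sketch below, with the assumed helper name `not_subset_witnesses` and two sample sets chosen for illustration, collects every x with x ∈ A and x ∉ B; A fails to be a subset of B exactly when that collection is nonempty.

```python
def not_subset_witnesses(A, B):
    """Return all x with x in A and x not in B."""
    return {x for x in A if x not in B}

A = {1, 2, 3}   # sample sets, assumed for illustration
B = {3, 4}

w_AB = not_subset_witnesses(A, B)   # witnesses that A is not a subset of B
w_BA = not_subset_witnesses(B, A)   # witnesses that B is not a subset of A

# A != B exactly when at least one of the two witness sets is nonempty.
unequal = bool(w_AB) or bool(w_BA)
print(sorted(w_AB), sorted(w_BA), unequal)
```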
                                Now that we have introduced the four ideas of set membership, set equality, subset, and
                            proper subset, we shall consider one more example to see what these concepts tell us, as
                            well as what they do not tell us. Following this example, the proof of our first theorem for
                            this chapter will be fairly straightforward — because it readily follows from some of these
                            ideas.

EXAMPLE 3.5
Let U = {1, 2, 3, 4, 5, 6, x, y, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}} (where x, y are the 24th, 25th
lowercase letters of the alphabet and do not represent anything else, such as 3, 5, or {1, 2}).
Then |U| = 11.

  a) If A = {1, 2, 3, 4}, then |A| = 4 and here we have
       i) A ⊆ U;        ii) A ⊂ U;        iii) A ∈ U;
      iv) {A} ⊆ U;       v) {A} ⊂ U; but   vi) {A} ∉ U.
  b) Now let B = {5, 6, x, y, A} = {5, 6, x, y, {1, 2, 3, 4}}. Then |B| = 5, not 8. And now
     we find that
       i) A ∈ B;        ii) {A} ⊆ B; and   iii) {A} ⊂ B.
     But
      iv) {A} ∉ B;
       v) A ⊈ B (that is, A is not a subset of B); and
      vi) A ⊄ B (that is, A is not a proper subset of B).
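The difference between membership and inclusion in this example can be replayed in Python. One modeling assumption is needed: a Python `set` cannot contain ordinary sets, so the sets that occur as elements are written as `frozenset` objects, and the letters x, y are modeled as the strings "x" and "y".

```python
A = frozenset({1, 2, 3, 4})
U = {1, 2, 3, 4, 5, 6, "x", "y",
     frozenset({1, 2}), frozenset({1, 2, 3}), A}
B = {5, 6, "x", "y", A}

size_U, size_B = len(U), len(B)          # 11 and 5

# Part (a): A is both an element of U and a (proper) subset of U.
a_facts = (A <= U, A < U, A in U, {A} <= U, {A} < U)
singleton_in_U = frozenset([A]) in U     # {A} itself is not an element of U

# Part (b): A is an element of B but not a subset of B.
b_facts = (A in B, {A} <= B, {A} < B)
b_failures = (frozenset([A]) in B, A <= B, A < B)

print(size_U, size_B, a_facts, singleton_in_U, b_facts, b_failures)
```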

THEOREM 3.1      Let A, B, C ⊆ U.

  a) If A ⊆ B and B ⊆ C, then A ⊆ C.        b) If A ⊂ B and B ⊆ C, then A ⊂ C.
  c) If A ⊆ B and B ⊂ C, then A ⊂ C.        d) If A ⊂ B and B ⊂ C, then A ⊂ C.

Before we prove this theorem we want to recall a comment we made back in Section 2.5. It
concerns our coverage of the Rules of Universal Specification and Universal Generalization
and appears after Example 2.56. For now it is appropriate in this new area on set theory.
When we want to prove, for example, that x ∈ A ⇒ x ∈ C, we shall start by considering any
fixed but arbitrarily chosen element x in U, but we shall want this element x to be such
that “x ∈ A” is a true statement (not an open statement). Then we must show that this same
fixed but arbitrarily chosen element x is also in C. The proofs we present are consequently
referred to as element arguments. Always remember that in these proofs x represents a fixed
but arbitrarily chosen element of A, and though x is generic (since it is not a specifically
named element in A), it does remain the same throughout each proof.

Proof: We shall prove parts (a) and (b) and leave the remaining parts for the exercises.

  a) To prove that A ⊆ C, we need to verify that for all x ∈ U, if x ∈ A then x ∈ C. We start
     with an element x from A. Since A ⊆ B, x ∈ A implies x ∈ B. Then with B ⊆ C, x ∈ B
     implies x ∈ C. So x ∈ A implies x ∈ C (by the Law of the Syllogism, Rule 2 in Table
     2.19, since x ∈ A, x ∈ B, and x ∈ C are statements), and A ⊆ C.
  b) Since A ⊂ B, if x ∈ A then x ∈ B. With B ⊆ C, it then follows that x ∈ C, so A ⊆ C.
     However, A ⊂ B ⇒ there exists an element b ∈ B such that b ∉ A. Because B ⊆ C,
     b ∈ B ⇒ b ∈ C. Thus A ⊆ C and there exists an element b ∈ C with b ∉ A, so
     A ⊂ C.
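An element argument proves the theorem for every universe; a computer can only spot-check it. As a sanity check, and not as a substitute for the proof, the following sketch tests all four parts over every triple of subsets of the small universe {1, 2, 3}. The helper names `all_subsets` and `holds_for_all` are assumptions made for this illustration.

```python
from itertools import combinations

def all_subsets(universe):
    """List every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def holds_for_all(hypothesis, conclusion, subsets):
    """Check the conclusion for every triple (A, B, C) meeting the hypothesis."""
    return all(conclusion(A, C)
               for A in subsets for B in subsets for C in subsets
               if hypothesis(A, B, C))

subsets = all_subsets({1, 2, 3})
part_a = holds_for_all(lambda A, B, C: A <= B and B <= C, lambda A, C: A <= C, subsets)
part_b = holds_for_all(lambda A, B, C: A < B and B <= C, lambda A, C: A < C, subsets)
part_c = holds_for_all(lambda A, B, C: A <= B and B < C, lambda A, C: A < C, subsets)
part_d = holds_for_all(lambda A, B, C: A < B and B < C, lambda A, C: A < C, subsets)

# Two plain subset relations do not force a proper subset (take A = B = C).
too_strong = holds_for_all(lambda A, B, C: A <= B and B <= C, lambda A, C: A < C, subsets)
print(part_a, part_b, part_c, part_d, too_strong)
```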

Our next example involves several subset relations.

EXAMPLE 3.6
Let U = {1, 2, 3, 4, 5} with A = {1, 2, 3}, B = {3, 4}, and C = {1, 2, 3, 4}. Then the
following subset relations hold:

  a) A ⊆ C                                            b) A ⊂ C
  c) B ⊂ C                                            d) A ⊆ A
  e) B ⊈ A
  f) A ⊄ A (that is, A is not a proper subset of A)

The sets A, B are just two of the subsets of C. We are interested in determining how
                     many subsets C has in total. Before answering, however, we need to introduce the set with
                     no members.

Definition 3.3   The null set, or empty set, is the (unique) set containing no elements. It is denoted by ∅ or { }.

We note that |∅| = 0 but {0} ≠ ∅. Also, ∅ ≠ {∅} because {∅} is a set with one element,
                     namely, the null set.

The empty set satisfies the following property given in Theorem 3.2. To establish this
                     property we use the method of proof by contradiction (or reductio ad absurdum). Following
                     the proof of Theorem 2.4 (in Section 2.5), we said that in establishing a theorem by this
                     method, we assumed the negation of the result and arrived at a contradiction. In our prior
                     work (as found in Example 2.32 and the third proof of Theorem 2.4), we arrived at a
                     contradiction of the form r ∧ ¬r or p(m) ∧ ¬p(m), respectively, where ¬r was a premise
                     in Example 2.32 and p(m) a specific instance of the hypothesis in Theorem 2.4. In proving
                     Theorem 3.2 things are now a little different. This time we shall find ourselves denying (or
                     contradicting) an earlier result we have accepted as true, namely, the definition of the null
                     set.

THEOREM 3.2          For any universe U, let A ⊆ U. Then ∅ ⊆ A, and if A ≠ ∅, then ∅ ⊂ A.
                     Proof: If the first result is not true, then ∅ ⊈ A, so there is an element x from the universe
                     with x ∈ ∅ but x ∉ A. But x ∈ ∅ is impossible. So we reject the assumption ∅ ⊈ A and find
                     that ∅ ⊆ A. In addition, if A ≠ ∅, then there is an element a ∈ A (and a ∉ ∅), so ∅ ⊂ A.
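The remarks about the null set and the statement of Theorem 3.2 can be observed on a few sample sets. In this sketch `set()` stands for ∅ and `{frozenset()}` for {∅}; the list `samples` is an arbitrary assumption, so the check illustrates the theorem rather than proving it.

```python
empty = set()                        # the null set
singleton_of_empty = {frozenset()}   # the set whose only element is the null set

size_of_empty = len(empty)                     # cardinality of the null set
zero_set_differs = {0} != empty                # {0} has one element, namely 0
nested_differs = singleton_of_empty != empty   # one element versus none

samples = [set(), {1}, {1, 2, 3}, {"x", "y"}]
always_subset = all(empty <= A for A in samples)              # null set is a subset of A
proper_when_nonempty = all(empty < A for A in samples if A)   # proper when A is nonempty

print(size_of_empty, zero_set_differs, nested_differs,
      always_subset, proper_when_nonempty)
```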

EXAMPLE 3.7
Returning now to Example 3.6 we determine the number of subsets of the set C = {1, 2,
3, 4}. In constructing a subset of C, we have, for each member x of C, two distinct choices:
Either include it in the subset or exclude it. Consequently, there are 2 × 2 × 2 × 2 choices,
resulting in $2^4 = 16$ subsets of C. These include the empty set ∅ and the set C itself. Should
we need the number of subsets of two elements from C, the result is the number of ways
two objects can be selected from a set of four objects, namely, C(4, 2) or $\binom{4}{2}$. As a result,
the total number of subsets of C, $2^4$, is also the sum
$\binom{4}{0} + \binom{4}{1} + \binom{4}{2} + \binom{4}{3} + \binom{4}{4}$, where the
first summand is for the empty set, the second summand for the four singleton subsets, the
third summand for the six subsets of size 2, and so on. So $2^4 = \sum_{k=0}^{4} \binom{4}{k}$.
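The count by subset size can be confirmed by listing the subsets. The sketch below uses `itertools.combinations` to enumerate the k-element subsets of C and `math.comb` for the binomial coefficients; the variable names are assumptions made for the illustration.

```python
from itertools import combinations
from math import comb

C = [1, 2, 3, 4]

# Number of subsets of each size k = 0, 1, ..., 4, found by enumeration.
counts_by_size = [len(list(combinations(C, k))) for k in range(len(C) + 1)]
total = sum(counts_by_size)

print(counts_by_size)   # [1, 4, 6, 4, 1]
print(total)            # 16
```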

Definition 3.4          If A is a set from universe U, the power set of A, denoted 𝒫(A),* is the collection (or set)
                              of all subsets of A.

EXAMPLE 3.8
For the set C of Example 3.7, 𝒫(C) = {∅, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3},
{2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, C}.

For any finite set A with |A| = n ≥ 0, we find that A has $2^n$ subsets and that $|𝒫(A)| = 2^n$.
For any 0 ≤ k ≤ n, there are $\binom{n}{k}$ subsets of size k. Counting the subsets of A according
to the number, k, of elements in a subset, we have the combinatorial identity

    $\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = \sum_{k=0}^{n} \binom{n}{k} = 2^n$, for n ≥ 0.

This identity was established earlier in Corollary 1.1 (a). The presentation here is another
                              example of a combinatorial proof because the identity is established by counting the same
                              collection of objects (subsets of A) in two different ways.
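The include-or-exclude argument also gives a direct way to build a power set. The function `power_set` below is a minimal sketch of that idea (the name and the use of `frozenset` are assumptions of this illustration): each new element doubles the list by adding a copy of every subset found so far with the element included.

```python
from math import comb

def power_set(elements):
    """Build all subsets: each element is either excluded or included."""
    subsets = [frozenset()]
    for x in elements:
        # Keep every existing subset (x excluded) and add a copy with x included.
        subsets += [s | {x} for s in subsets]
    return subsets

P_C = power_set([1, 2, 3, 4])
print(len(P_C))                                # 16
print(sorted(sorted(s) for s in P_C if len(s) == 2))

# Both sides of the identity count the same collection of subsets.
n = 4
by_size = [sum(1 for s in P_C if len(s) == k) for k in range(n + 1)]
print(by_size == [comb(n, k) for k in range(n + 1)], sum(by_size) == 2 ** n)
```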

A systematic way to represent the subsets of a given nonempty set can be accomplished
                              by using a coding scheme known as a Gray code. This is demonstrated in our next example.

EXAMPLE 3.9
Consider the binary strings (of 0’s and 1’s) in Fig. 3.1. In particular, examine the first column
of the strings in part (b). How did this column come about? First we see 0, then 1, as in
part (a) of the figure. Then we see 1 followed by 0, the reverse order (from bottom to top)
of the two binary strings in part (a). Once we obtain the first column for the binary strings
in part (b), we then list two 0’s followed by two 1’s.
    Continuing with the strings in part (c) of the figure, now we concentrate on the first two
columns. The first four entries (binary strings of length 2) are precisely the four strings
in part (b). The last four entries (again, binary strings of length 2) are likewise the binary
strings in part (b), now in reverse order (from bottom to top). For these eight strings of
length 2, we append 0 to the right of the first four and 1 to the right of the last four.
    For each Gray code in parts (a), (b), (c) of the figure, as we go from one binary string (in
a column) to the next binary string (in that column), there is exactly one bit that changes.
For instance, in part (b), in going from 10 to 11, we find one change (from 0 to 1) in the
second position. Furthermore, for the third and fourth strings in part (c), as we go from
110 to 010, there is exactly one change, from 1 to 0 in the first position. The fourth and
fifth strings have the one change from 0 to 1, this time in the third position. Also notice
how the first and last strings for each code differ in the last position. Part (d) of the figure
demonstrates this for the strings of length 3.
    This technique, for constructing a Gray code for the strings of length 2 from those of
length 1 and the strings of length 3 from those of length 2, is an example of a recursive
construction. (This idea will be examined in more detail in Section 4.2.)

  (a)            (b)               (c)                  (d)     (e)     (f)
  0   ∅          00   ∅            000   ∅              000     000     000
  1   {x}        10   {x}          100   {x}            100     010     001
                 11   {x, y}       110   {x, y}         110     011     101
                 01   {y}          010   {y}            010     001     100
                                   011   {y, z}         011     101     110
                                   111   {x, y, z}      111     111     010
                                   101   {x, z}         101     110     011
                                   001   {z}            001     100     111

Figure 3.1

*In some computer science textbooks the reader may find the notation 2^A used for 𝒫(A).
                       When we examine each Gray code in parts (a), (b), (c) of Fig. 3.1, we see a listing of
                   subsets to the right of each of these codes. For example, in part (b), if we start with the set
                   A = {x, y} and keep the order of the elements fixed,† then we can list the subsets of A in
                   terms of binary strings of length 2. We write 0 for an element when it is not in the subset and
                   1 when it is. Hence the subset {x} is encoded as 10 because the “first” element x (of ordered
                   set A) is in the subset, while the “second” element y (of ordered set A) is not present, as
                   the 0, in 10, indicates. For part (c), the (ordered) set B = {x, y, z} has its eight subsets listed
                   next to the elements of the Gray code. As we go from one subset to the next (in a given
                   column), we see that there is exactly one change in the makeup of the subset. For instance,
                   in going from {x, y} (110) to {y} (010), exactly one element is deleted, as indicated by
                   the change from 1 to 0 in the first positions of 110 and 010. Likewise, as we go from {z}
                   (001) to ∅ (000), exactly one element is deleted; the change from 1 to 0, in the third bits
                   of 001 and 000, indicates this. Examining the change from {y, z} (011) to {x, y, z} (111),
                   we see that one new element is added; here it is x. The change from 0 to 1 as we go from
                   011 to 111 takes this into account.
                       Note that the first four subsets in part (c) are the four subsets in part (b). Further, the last
                   four subsets in part (c) come about from the same four subsets in part (b) —this time in
                   reverse order and with the element z included in each subset.
                       The recursive construction given here shows how we can continue to develop Gray codes
                   for binary strings of longer length. When this coding scheme was introduced— just prior
                   to the start of this example  — we spoke of it as a Gray code, not as the Gray code. Other
                   Gray codes are possible. The code in part (e) of Fig. 3.1 provides a second Gray code for
                   the eight binary strings of length 3. Furthermore, if we no longer require the first and last
                   entries in a code to differ in only one position, then the code in part (f) of Fig. 3.1 would
                   also serve as a Gray code for the eight binary strings of length 3.
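The recursive construction described in this example can be sketched in a few lines of Python. The function names `gray_code` and `subset_of` are assumptions made for the illustration, and the code assumes n ≥ 1: starting from the strings 0, 1, each step appends 0 to the right of the current list and 1 to the right of the same list taken in reverse order.

```python
def gray_code(n):
    """Gray code for the binary strings of length n (n >= 1), built recursively."""
    codes = ["0", "1"]
    for _ in range(n - 1):
        codes = [s + "0" for s in codes] + [s + "1" for s in reversed(codes)]
    return codes

def subset_of(code, ordered_elements):
    """Decode a string: bit i is 1 exactly when the i-th element is in the subset."""
    return {e for bit, e in zip(code, ordered_elements) if bit == "1"}

def differ_in_one_bit(s, t):
    return sum(a != b for a, b in zip(s, t)) == 1

code3 = gray_code(3)
print(code3)
print([sorted(subset_of(c, "xyz")) for c in code3])
print(all(differ_in_one_bit(s, t) for s, t in zip(code3, code3[1:])))
```

For n = 2 and n = 3 this reproduces the lists described for parts (b) and (c) of Fig. 3.1; other Gray codes, such as the one described for part (e), need a different construction.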

†Originally we considered the elements of a set as unordered, so we are making an exception here. In textbooks
                   dealing with data structures, such ordered sets are often referred to as lists and one finds, for instance, the ordered
                   set {x, y, z} denoted by [x, y, z] or (x, y, z).

The ability to count certain, or all, subsets of a given set provides a second approach for
                             the solution of two of our earlier examples.

EXAMPLE 3.10
In Example 1.14, we counted the number of (staircase) paths in the xy-plane from (2, 1) to
(7, 4) where each such path is made up of individual steps going one unit to the right (R)
or one unit upward (U). Figure 3.2 is the same as Fig. 1.1, where two of the possible paths
are indicated.

[Figure 3.2: two staircase paths from (2, 1) to (7, 4) drawn on a grid.
 (a) R, U, R, R, U, R, R, U      (b) U, R, R, R, U, U, R, R]

Figure 3.2

The path in Fig. 3.2(a) has its three upward (U) moves located in positions 2, 5, and 8
                             of the list at the bottom of the figure. Consequently, this path determines the three-element
                             subset {2, 5, 8} of the set {1, 2, 3,..., 8}. In Fig. 3.2(b) the path determines the three-
                             element subset {1, 5, 6}. Conversely, if we start, for example, with the subset {1, 3, 7} of
                             {1, 2, 3, ..., 8}, then the path that determines this subset is given by U, R, U, R, R, R,
                             U,R.
    Consequently, the number of paths sought here equals the number of subsets A of
{1, 2, 3, ..., 8}, where |A| = 3. There are $\binom{8}{3} = \frac{8!}{3!\,5!} = 56$ such paths (and subsets),
as we found in Example 1.14.
    If we had considered the moves R to the right, instead of the upward moves U, we would
have found the answer to be the number of subsets B of {1, 2, 3, ..., 8}, where |B| = 5.
There are $\binom{8}{5} = \frac{8!}{5!\,3!} = 56$ such subsets. (The idea presented here was examined earlier
for the result developed in Table 1.4.)
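The correspondence between paths and three-element subsets can be written out explicitly. In the sketch below a path is a sequence of the letters R and U, and the helper names `path_to_subset` and `subset_to_path` are assumptions made for this illustration.

```python
from itertools import combinations
from math import comb

def path_to_subset(path):
    """Positions (counted from 1) of the upward moves U in a path."""
    return {i for i, step in enumerate(path, start=1) if step == "U"}

def subset_to_path(subset, length=8):
    """The path whose upward moves occupy exactly the given positions."""
    return ["U" if i in subset else "R" for i in range(1, length + 1)]

print(sorted(path_to_subset("RURRURRU")))   # path (a) of Fig. 3.2
print(sorted(path_to_subset("URRRUURR")))   # path (b) of Fig. 3.2
print(subset_to_path({1, 3, 7}))

# One path for each three-element subset of {1, 2, ..., 8}.
paths = [subset_to_path(set(s)) for s in combinations(range(1, 9), 3)]
print(len(paths), comb(8, 3), comb(8, 5))
```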

EXAMPLE 3.11
In part (b) of Example 1.37 of Section 1.4 we learned that there are $2^6$ compositions for the
integer 7; that is, there are $2^6$ ways to write 7 as a sum of one or more positive integers,
where the order of the summands is relevant. The result we obtained there used the binomial
theorem in conjunction with the answers for seven cases that were summarized in Table 1.9.
Now we shall obtain this result in a somewhat different and easier way.
    First consider the following composition of 7:

    1 + 1 + 1 + 1 + 1 + 1 + 1
    (the plus signs are numbered, from left to right, 1st, 2nd, ..., 5th, 6th)

Here we have seven summands, each of which is 1, and six plus signs.

    For the set {1, 2, 3, 4, 5, 6} there are $2^6$ subsets. But what does this have to do with the
compositions of 7?
    Consider a subset of {1, 2, 3, 4, 5, 6}, say {1, 4, 6}. Now form the following composition
of 7:

    (1 + 1) + 1 + (1 + 1) + (1 + 1)
    (plus signs inside parentheses: 1st, 4th, and 6th)

Here the subset {1, 4, 6} indicates that we should place parentheses around the 1’s on either
side of the first, fourth, and sixth plus signs. This results in the composition

    2 + 1 + 2 + 2.

In the same way we find that the subset {1, 2, 5, 6} indicates the use of the first, second,
fifth, and sixth plus signs, giving us

    (1 + 1 + 1) + 1 + (1 + 1 + 1)
    (plus signs inside parentheses: 1st, 2nd, 5th, and 6th)

or the composition 3 + 1 + 3.
    Going in reverse we see that the composition 1 + 1 + 5 comes from

    1 + 1 + (1 + 1 + 1 + 1 + 1)

and is determined by the subset {3, 4, 5, 6} of {1, 2, 3, 4, 5, 6}. In Table 3.1 we have listed
six compositions of 7 along with the corresponding subset of {1, 2, 3, 4, 5, 6} that determines
each of them.

Table 3.1

        Composition of 7                   Determining Subset of {1, 2, 3, 4, 5, 6}

  (i)   1 + 1 + 1 + 1 + 1 + 1 + 1          (i)   ∅
  (ii)  1 + 2 + 1 + 1 + 1 + 1              (ii)  {2}
  (iii) 1 + 1 + 3 + 1 + 1                  (iii) {3, 4}
  (iv)  2 + 3 + 2                          (iv)  {1, 3, 4, 6}
  (v)   4 + 3                              (v)   {1, 2, 3, 5, 6}
  (vi)  7                                  (vi)  {1, 2, 3, 4, 5, 6}

The examples we have obtained here indicate a correspondence between the compositions of 7 and the subsets of {1, 2, 3, 4, 5, 6}. Consequently, once again we find that there are $2^6$ compositions of 7. In fact, for each positive integer m, there are $2^{m-1}$ compositions of m.
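The correspondence can be sketched in a few lines of Python. This is only an illustration: the function names `composition_from_subset` and `all_compositions` are hypothetical, and a subset is read as the set of plus-sign positions that are carried out (placed inside parentheses).

```python
from itertools import combinations

def composition_from_subset(m, plus_signs):
    """Return the composition of m determined by a subset of {1, ..., m-1}.

    Write m as 1 + 1 + ... + 1 with m - 1 plus signs. A plus sign whose
    position is in `plus_signs` is carried out (its neighbours are merged);
    every other plus sign separates two summands.
    """
    parts = [1]
    for position in range(1, m):
        if position in plus_signs:
            parts[-1] += 1      # merge the next 1 into the current summand
        else:
            parts.append(1)     # start a new summand
    return tuple(parts)

def all_compositions(m):
    """Collect the compositions of m produced by every subset of {1, ..., m-1}."""
    positions = range(1, m)
    return {
        composition_from_subset(m, set(chosen))
        for size in range(m)
        for chosen in combinations(positions, size)
    }

print(composition_from_subset(7, {1, 4, 6}))     # (2, 1, 2, 2)
print(composition_from_subset(7, {3, 4, 5, 6}))  # (1, 1, 5)
print(len(all_compositions(7)))                  # 64
```

Because distinct subsets give distinct compositions, the set built by `all_compositions(7)` has exactly $2^6 = 64$ members.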

Our next example yields another important combinatorial identity.

EXAMPLE 3.12   For integers n, r with $n \ge r \ge 1$,

\[
\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}.
\]

Although this result can be established algebraically from the definition of $\binom{n}{r}$ as $n!/(r!(n-r)!)$, we use a combinatorial approach. Let $A = \{x, a_1, a_2, \ldots, a_n\}$ and consider all subsets of A that contain r elements. There are $\binom{n+1}{r}$ such subsets. Each of these falls into exactly one of the following two cases: those subsets that contain the element x and those that do not. To obtain a subset C of A, where $x \in C$ and $|C| = r$, place x in C and then select r − 1 of the elements $a_1, a_2, \ldots, a_n$. This can be done in $\binom{n}{r-1}$ ways. For the other case we want a subset B of A with $|B| = r$ and $x \notin B$. So we select r elements from among $a_1, a_2, \ldots, a_n$, which we can do in $\binom{n}{r}$ ways. It then follows by the rule of sum that $\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$.
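The two cases of this argument can be counted directly for small values. The following is a minimal sketch; the helper name `split_by_x` and the choice n = 6, r = 3 are illustrative assumptions, not part of the text.

```python
from itertools import combinations
from math import comb

def split_by_x(n, r):
    """Count r-element subsets of A = {x, a_1, ..., a_n} with and without x."""
    A = ["x"] + [f"a{i}" for i in range(1, n + 1)]
    with_x = without_x = 0
    for subset in combinations(A, r):
        if "x" in subset:
            with_x += 1        # x plus r - 1 of the a_i
        else:
            without_x += 1     # r of the a_i
    return with_x, without_x

n, r = 6, 3
with_x, without_x = split_by_x(n, r)
print(with_x, comb(n, r - 1))               # 15 15
print(without_x, comb(n, r))                # 20 20
print(with_x + without_x, comb(n + 1, r))   # 35 35
```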
   Before we proceed any further let us reconsider the result of Example 3.12, but this time we shall do it in light of what we learned in Example 3.10.
   Once again we let n, r be positive integers where $n \ge r \ge 1$. Then $\binom{n+1}{r}$ counts the number of (staircase) paths in the xy-plane from (0, 0) to (n + 1 − r, r), where, as in Example 3.10, each such path has

   (n + 1) − r   horizontal moves of the form $(x, y) \to (x + 1, y)$, and
             r   vertical moves of the form $(x, y) \to (x, y + 1)$.

The last edge in each of these (staircase) paths terminates at the point (n + 1 − r, r) and starts at either (i) the point (n − r, r) or (ii) the point (n + 1 − r, r − 1).
   In case (i) we have the last edge horizontal, namely, $(n - r, r) \to (n + 1 - r, r)$; the number of (staircase) paths from (0, 0) to (n − r, r) is $\binom{(n-r)+r}{r} = \binom{n}{r}$. For case (ii) the last edge is vertical, namely, $(n + 1 - r, r - 1) \to (n + 1 - r, r)$; the number of (staircase) paths from (0, 0) to (n + 1 − r, r − 1) is $\binom{(n+1-r)+(r-1)}{r-1} = \binom{n}{r-1}$. Since these two cases exhaust all possibilities and have nothing in common, it follows that

\[
\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}.
\]
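The last-edge argument translates into a recursion on the endpoint of the path. The sketch below assumes a hypothetical helper `staircase_paths(a, b)` that counts paths from (0, 0) to (a, b); it is one possible way to check the identity numerically, not a method taken from the text.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def staircase_paths(a, b):
    """Number of paths from (0, 0) to (a, b) using steps (x+1, y) and (x, y+1)."""
    if a == 0 or b == 0:
        return 1                      # only one straight path along an axis
    # The last edge is horizontal, from (a-1, b), or vertical, from (a, b-1).
    return staircase_paths(a - 1, b) + staircase_paths(a, b - 1)

n, r = 7, 3
print(staircase_paths(n + 1 - r, r), comb(n + 1, r))   # 56 56
```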

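EXAMPLE 3.13   We now investigate how the identity of Example 3.12 can help us solve Example 1.35, where we sought the number of nonnegative integer solutions of the inequality $x_1 + x_2 + \cdots + x_6 < 10$.
   For each integer k, $0 \le k \le 9$, the number of solutions to $x_1 + x_2 + \cdots + x_6 = k$ is $\binom{6+k-1}{k} = \binom{5+k}{k}$. So the number of nonnegative integer solutions to $x_1 + x_2 + \cdots + x_6 < 10$ is

\begin{align*}
\binom{5}{0} + \binom{6}{1} + \binom{7}{2} + \binom{8}{3} + \cdots + \binom{14}{9}
  &= \left[\binom{6}{0} + \binom{6}{1}\right] + \binom{7}{2} + \binom{8}{3} + \cdots + \binom{14}{9} \\
  &= \left[\binom{7}{1} + \binom{7}{2}\right] + \binom{8}{3} + \cdots + \binom{14}{9} \\
  &= \left[\binom{8}{2} + \binom{8}{3}\right] + \binom{9}{4} + \cdots + \binom{14}{9} \\
  &= \left[\binom{9}{3} + \binom{9}{4}\right] + \cdots + \binom{14}{9} = \cdots \\
  &= \binom{14}{8} + \binom{14}{9} = \binom{15}{9} = 5005.
\end{align*}

(The first step uses $\binom{5}{0} = 1 = \binom{6}{0}$; each later step applies the identity of Example 3.12 to the bracketed pair.)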
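As a numerical cross-check of this sum, the sketch below counts the solutions with a small dynamic program that does not rely on the binomial formula. The function name `solutions_with_sum` is a hypothetical helper introduced for illustration.

```python
from math import comb

def solutions_with_sum(variables, k):
    """Count nonnegative integer solutions of x_1 + ... + x_variables = k."""
    ways = [1] + [0] * k              # with no variables, only the sum 0 is possible
    for _ in range(variables):
        for s in range(1, k + 1):
            ways[s] += ways[s - 1]    # running prefix sum adds one more variable
    return ways[k]

counts = [solutions_with_sum(6, k) for k in range(10)]
print(counts == [comb(5 + k, k) for k in range(10)])  # True
print(sum(counts), comb(15, 9))                       # 5005 5005
```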

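EXAMPLE 3.14   In Fig. 3.3 we find a part of the useful and interesting array of numbers called Pascal's triangle.

\[
\begin{array}{lc}
(n = 0) & \binom{0}{0} \\[4pt]
(n = 1) & \binom{1}{0} \quad \binom{1}{1} \\[4pt]
(n = 2) & \binom{2}{0} \quad \binom{2}{1} \quad \binom{2}{2} \\[4pt]
(n = 3) & \binom{3}{0} \quad \binom{3}{1} \quad \binom{3}{2} \quad \binom{3}{3} \\[4pt]
(n = 4) & \binom{4}{0} \quad \binom{4}{1} \quad \binom{4}{2} \quad \binom{4}{3} \quad \binom{4}{4} \\[4pt]
(n = 5) & \binom{5}{0} \quad \binom{5}{1} \quad \binom{5}{2} \quad \binom{5}{3} \quad \binom{5}{4} \quad \binom{5}{5}
\end{array}
\]

Figure 3.3 [In the original figure, two inverted triangles of three neighboring entries are outlined.]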

Note that in this partial listing the two triangles shown satisfy the condition that the
               binomial coefficient at the bottom of the inverted triangle is the sum of the other two terms
               in the triangle. This result follows from the identity in Example 3.12.
                  When   we replace each of the binomial     coefficients by its numerical        value, the Pascal
               triangle appears as shown in Fig. 3.4.

(n = 0)                   1

(n = 1)                 1   1

(n = 2)               1   2   1

(n = 3)             1   3   3   1

(n = 4)           1   4   6   4   1

(n = 5)         1   5  10  10   5   1

Figure 3.4
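The rows of Fig. 3.4 can be generated from the identity of Example 3.12 alone. The following sketch uses a hypothetical helper `pascal_rows` and assumes at least one row is requested.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle (count >= 1)."""
    rows = [[1]]
    while len(rows) < count:
        prev = rows[-1]
        # Interior entries use C(n+1, r) = C(n, r) + C(n, r-1).
        interior = [prev[r] + prev[r - 1] for r in range(1, len(prev))]
        rows.append([1] + interior + [1])
    return rows

for n, row in enumerate(pascal_rows(6)):
    print(f"(n = {n})", *row)
```

Each new row is computed only from the row above it, which mirrors the inverted-triangle observation made for Fig. 3.3.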

There are certain sets of numbers that appear frequently throughout the text. Consequently, we close this section by assigning them the following designations.

a) $\mathbf{Z}$ = the set of integers = $\{0, 1, -1, 2, -2, 3, -3, \ldots\}$
b) $\mathbf{N}$ = the set of nonnegative integers or natural numbers = $\{0, 1, 2, 3, \ldots\}$
c) $\mathbf{Z}^+$ = the set of positive integers = $\{1, 2, 3, \ldots\} = \{x \in \mathbf{Z} \mid x > 0\}$
d) $\mathbf{Q}$ = the set of rational numbers = $\{a/b \mid a, b \in \mathbf{Z},\ b \ne 0\}$
e) $\mathbf{Q}^+$ = the set of positive rational numbers = $\{r \in \mathbf{Q} \mid r > 0\}$
f) $\mathbf{Q}^*$ = the set of nonzero rational numbers
g) $\mathbf{R}$ = the set of real numbers
h) $\mathbf{R}^+$ = the set of positive real numbers
i) $\mathbf{R}^*$ = the set of nonzero real numbers
j) $\mathbf{C}$ = the set of complex numbers = $\{x + yi \mid x, y \in \mathbf{R},\ i^2 = -1\}$
k) $\mathbf{C}^*$ = the set of nonzero complex numbers
l) For each $n \in \mathbf{Z}^+$, $\mathbf{Z}_n = \{0, 1, 2, \ldots, n - 1\}$
m) For real numbers a, b with a < b,
   $[a, b] = \{x \in \mathbf{R} \mid a \le x \le b\}$,
   $(a, b) = \{x \in \mathbf{R} \mid a < x < b\}$,
   $[a, b) = \{x \in \mathbf{R} \mid a \le x < b\}$,
   $(a, b] = \{x \in \mathbf{R} \mid a < x \le b\}$.
   The first set is called a closed interval, the second set an open interval, and the other two sets half-open intervals.

EXERCISES 3.1

1. Which of the following sets are equal?
   a) {1, 2, 3}            b) {3, 2, 1, 3}
   c) {3, 1, 2, 3}         d) {1, 2, 2, 3}

2. Let A = {1, {1}, {2}}. Which of the following statements are true?
   a) $1 \in A$                       b) $\{1\} \in A$
   c) $\{1\} \subseteq A$             d) $\{\{1\}\} \subseteq A$
   e) $\{2\} \in A$                   f) $\{2\} \subseteq A$
   g) $\{\{2\}\} \subseteq A$         h) $\{\{2\}\} \subset A$

3. For A = {1, 2, {2}}, which of the eight statements in Exercise 2 are true?

4. Which of the following statements are true?
   a) $\emptyset \in \emptyset$       b) $\emptyset \subset \emptyset$       c) $\emptyset \subseteq \emptyset$
   d) $\emptyset \in \{\emptyset\}$   e) $\emptyset \subset \{\emptyset\}$   f) $\emptyset \subseteq \{\emptyset\}$

5. Determine all of the elements in each of the following sets.
   a) $\{1 + (-1)^n \mid n \in \mathbf{N}\}$
   b) $\{n + (1/n) \mid n \in \{1, 2, 3, 5, 7\}\}$
   c) $\{n^3 + n^2 \mid n \in \{0, 1, 2, 3, 4\}\}$

6. Consider the following six subsets of $\mathbf{Z}$.
   $A = \{2m + 1 \mid m \in \mathbf{Z}\}$      $B = \{2n + 3 \mid n \in \mathbf{Z}\}$
   $C = \{2p - 3 \mid p \in \mathbf{Z}\}$      $D = \{3r + 1 \mid r \in \mathbf{Z}\}$
   $E = \{3s + 2 \mid s \in \mathbf{Z}\}$      $F = \{3t - 2 \mid t \in \mathbf{Z}\}$
   Which of the following statements are true and which are false?
   a) A = B      b) A = C      c) B = C
   d) D = E      e) D = F      f) E = F

7. Let A, B be sets from a universe $\mathcal{U}$. (a) Write a quantified statement to express the proper subset relation $A \subset B$. (b) Negate the result in part (a) to determine when $A \not\subset B$.

8. For A = {1, 2, 3, 4, 5, 6, 7}, determine the number of
   a) subsets of A
   b) nonempty subsets of A
   c) proper subsets of A
   d) nonempty proper subsets of A
   e) subsets of A containing three elements
   f) subsets of A containing 1, 2
   g) subsets of A containing five elements, including 1, 2
   h) subsets of A with an even number of elements
   i) subsets of A with an odd number of elements

9. a) If a set A has 63 proper subsets, what is |A|?
   b) If a set B has 64 subsets of odd cardinality, what is |B|?
   c) Generalize the result of part (b).

10. Which of the following sets are nonempty?
   a) $\{x \mid x \in \mathbf{N},\ 2x + 7 = 3\}$
   b) $\{x \in \mathbf{Z} \mid 3x + 5 = 9\}$
   c) $\{x \mid x \in \mathbf{Q},\ x^2 + 4 = 6\}$
   d) $\{x \in \mathbf{R} \mid x^2 + 4 = 6\}$
   e) $\{x \in \mathbf{R} \mid x^2 + 3x + 3 = 0\}$
   f) $\{x \mid x \in \mathbf{C},\ x^2 + 3x + 3 = 0\}$

11. When she is about to leave a restaurant counter, Mrs. Albanese sees that she has one penny, one nickel, one dime, one quarter, and one half-dollar. In how many ways can she leave some (at least one) of her coins for a tip if (a) there are no restrictions? (b) she wants to have some change left? (c) she wants to leave at least 10 cents?

12. Let A = {1, 2, 3, 4, 5, 7, 8, 10, 11, 14, 17, 18}.
   a) How many subsets of A contain six elements?
   b) How many six-element subsets of A contain four even integers and two odd integers?
   c) How many subsets of A contain only odd integers?

13. Let S = {1, 2, 3, ..., 29, 30}. How many subsets A of S satisfy (a) |A| = 5? (b) |A| = 5 and the smallest element in A is 5? (c) |A| = 5 and the smallest element in A is less than 5?

14. a) How many subsets of {1, 2, 3, ..., 11} contain at least one even integer?

   b) How many subsets of {1, 2, 3, ..., 12} contain at least one even integer?
   c) Generalize the results of parts (a) and (b).

15. Give an example of three sets W, X, Y such that $W \in X$ and $X \in Y$ but $W \notin Y$.

16. Write the next three rows for the Pascal triangle shown in Fig. 3.4.

17. Complete the proof of Theorem 3.1.

18. For sets $A, B, C \subseteq \mathcal{U}$, prove or disprove (with a counterexample) the following: If $A \subseteq B$, $B \not\subseteq C$, then $A \not\subseteq C$.

19. In part (i) of Fig. 3.5 we have the first six rows of Pascal’s triangle, where a hexagon centered at 4 appears in the last three rows. If we consider the six numbers (around 4) at the vertices of this hexagon, we find that the two alternating triples, namely 3, 1, 10 and 1, 5, 6, satisfy $3 \cdot 1 \cdot 10 = 30 = 1 \cdot 5 \cdot 6$. Part (ii) of the figure contains rows 4 through 7 of Pascal’s triangle. Here we find a hexagon centered at 10, and the alternating triples at the vertices (in this case, 4, 10, 15 and 6, 20, 5) satisfy $4 \cdot 10 \cdot 15 = 600 = 6 \cdot 20 \cdot 5$.
   a) Conjecture the general result suggested by these two examples.
   b) Verify the conjecture in part (a).

Figure 3.5 [Part (i): the first six rows of Pascal's triangle, with a hexagon centered at 4. Part (ii): rows 4 through 7 of Pascal's triangle, with a hexagon centered at 10.]

20. a) Among the strictly increasing sequences of integers that start with 1 and end with 7 are:
      i) 1, 7      ii) 1, 3, 4, 7      iii) 1, 2, 4, 5, 6, 7
   How many such strictly increasing sequences of integers start with 1 and end with 7?
   b) How many strictly increasing sequences of integers start with 3 and end with 9?
   c) How many strictly increasing sequences of integers start with 1 and end with 37? How many start with 62 and end with 98?
   d) Generalize the results in parts (a) through (c).

21. One quarter of the five-element subsets of {1, 2, 3, ..., n} contain the element 7. Determine n ($\ge 5$).

22. For a given universe $\mathcal{U}$, let $A \subseteq \mathcal{U}$ where A is finite with $|\mathcal{P}(A)| = n$. If $B \subseteq \mathcal{U}$, how many subsets does B have, if (a) $B = A \cup \{x\}$, where $x \in \mathcal{U} - A$? (b) $B = A \cup \{x, y\}$, where $x, y \in \mathcal{U} - A$? (c) $B = A \cup \{x_1, x_2, \ldots, x_k\}$, where $x_1, x_2, \ldots, x_k \in \mathcal{U} - A$?

23. Determine which row of Pascal’s triangle contains three consecutive entries that are in the ratio 1 : 2 : 3.

24. Use the recursive technique of Example 3.9 to develop a Gray code for the 16 binary strings of length 4. Then list each of the 16 subsets of the ordered set {w, x, y, z} next to its corresponding binary string.

25. Suppose that A contains the elements v, w, x, y, z and no others. If a given Gray code for the 32 subsets of A encodes the ordered set {v, w} as 01100 and the ordered set {x, y} as 10001, write A as the corresponding ordered set.

26. For positive integers n, r show that

\begin{align*}
\binom{n+r+1}{r} &= \binom{n+r}{r} + \binom{n+r-1}{r-1} + \cdots + \binom{n+2}{2} + \binom{n+1}{1} + \binom{n}{0} \\
                 &= \binom{n+r}{n} + \binom{n+r-1}{n} + \cdots + \binom{n+2}{n} + \binom{n+1}{n} + \binom{n}{n}.
\end{align*}

27. In the original abstract set theory formulated by Georg Cantor (1845-1918), a set was defined as “any collection into a whole of definite and separate objects of our intuition or our thought.” Unfortunately, in 1901, this definition led Bertrand Russell (1872-1970) to the discovery of a contradiction, a result now known as Russell's paradox, and this struck at the very heart of the theory of sets. (But since then several ways have been found to define the basic ideas of set theory so that this contradiction no longer comes about.)
   Russell’s paradox arises when we concern ourselves with whether a set can be an element of itself. For example, the set

of all positive integers is not a positive integer, or $\mathbf{Z}^+ \notin \mathbf{Z}^+$. But the set of all abstractions is an abstraction.
   Now in order to develop the paradox let S be the set of all sets A that are not members of themselves; that is, $S = \{A \mid A \text{ is a set} \wedge A \notin A\}$.
   a) Show that if $S \in S$, then $S \notin S$.
   b) Show that if $S \notin S$, then $S \in S$.
   The results in parts (a) and (b) show us that we must avoid trying to define sets like S. To do so we must restrict the types of elements that can be members of a set. (More about this is mentioned in the Summary and Historical Review in Section 3.8.)

28. Let A = {1, 2, 3, ..., 39, 40}.
   a) Write a computer program (or develop an algorithm) to generate a random six-element subset of A.
   b) For B = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37}, write a computer program (or develop an algorithm) to generate a random six-element subset of A and then determine whether it is a subset of B.

29. Let A = {1, 2, 3, ..., 7}. Write a computer program (or develop an algorithm) that lists all the subsets B of A, where |B| = 4.

30. Write a computer program (or develop an algorithm) that lists all the subsets of {1, 2, 3, ..., n}, where $1 \le n \le 10$. (The value of n should be supplied during program execution.)

3.2
Set Operations and the Laws of Set Theory
   After learning how to count, a student usually faces methods for combining counting numbers. First this is accomplished through addition. Usually the student’s world of arithmetic revolves about the set $\mathbf{Z}^+$ (or a subset of $\mathbf{Z}^+$ that can be spoken and written about, as well as punched out on a hand-held calculator) wherein the addition of two elements from $\mathbf{Z}^+$ results in a third element of $\mathbf{Z}^+$, called the sum. Hence the student can concentrate on addition without having to enlarge his or her arithmetic world beyond $\mathbf{Z}^+$. This is also true for the operation of multiplication.
   The addition and multiplication of positive integers are said to be closed binary operations on $\mathbf{Z}^+$. For example, when we compute a + b, for $a, b \in \mathbf{Z}^+$, there are two operands, namely, a and b. Hence the operation is called binary. And since $a + b \in \mathbf{Z}^+$ when $a, b \in \mathbf{Z}^+$, we say that the binary operation of addition (on $\mathbf{Z}^+$) is closed. The binary operation of (nonzero) division, however, is not closed for $\mathbf{Z}^+$; we find, for example, that $1/2\ (= 1 \div 2) \notin \mathbf{Z}^+$, even though $1, 2 \in \mathbf{Z}^+$. Yet this operation is closed when we consider the set $\mathbf{Q}^+$ instead of the set $\mathbf{Z}^+$.
   We now introduce the following binary operations for sets.

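Definition 3.5           For $A, B \subseteq \mathcal{U}$ we define the following:

a) $A \cup B$ (the union of A and B) = $\{x \mid x \in A \vee x \in B\}$.
b) $A \cap B$ (the intersection of A and B) = $\{x \mid x \in A \wedge x \in B\}$.
c) $A \triangle B$ (the symmetric difference of A and B) = $\{x \mid (x \in A \vee x \in B) \wedge x \notin A \cap B\} = \{x \mid x \in A \cup B \wedge x \notin A \cap B\}$.

Note that if $A, B \subseteq \mathcal{U}$, then $A \cup B,\ A \cap B,\ A \triangle B \subseteq \mathcal{U}$. Consequently, $\cup$, $\cap$, and $\triangle$ are closed binary operations on $\mathcal{P}(\mathcal{U})$, and we may also say that $\mathcal{P}(\mathcal{U})$ is closed under these (binary) operations.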

EXAMPLE 3.15            With $\mathcal{U}$ = {1, 2, 3, ..., 9, 10}, A = {1, 2, 3, 4, 5}, B = {3, 4, 5, 6, 7}, and C = {7, 8, 9}, we have:

a) $A \cap B$ = {3, 4, 5}                 b) $A \cup B$ = {1, 2, 3, 4, 5, 6, 7}
c) $B \cap C$ = {7}                       d) $A \cap C = \emptyset$
e) $A \triangle B$ = {1, 2, 6, 7}         f) $A \cup C$ = {1, 2, 3, 4, 5, 7, 8, 9}
g) $A \triangle C$ = {1, 2, 3, 4, 5, 7, 8, 9}
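These seven results can be reproduced with Python's built-in `set` type, whose operators `|`, `&`, and `^` compute union, intersection, and symmetric difference. The snippet is an illustrative sketch; `sorted` is used only to make the printed order predictable.

```python
U = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}
C = {7, 8, 9}

print(sorted(A & B))   # [3, 4, 5]
print(sorted(A | B))   # [1, 2, 3, 4, 5, 6, 7]
print(sorted(B & C))   # [7]
print(sorted(A & C))   # []
print(sorted(A ^ B))   # [1, 2, 6, 7]
print(sorted(A | C))   # [1, 2, 3, 4, 5, 7, 8, 9]
print(sorted(A ^ C))   # [1, 2, 3, 4, 5, 7, 8, 9]

# Closure: every result is again a subset of the universe.
print(all(S.issubset(U) for S in (A & B, A | B, A ^ B)))   # True
```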

In Example 3.15 we see that $A \cap B \subseteq A \subseteq A \cup B$. This result is not special for just this example but is true in general. The result follows because

\[ x \in A \cap B \Rightarrow (x \in A \wedge x \in B) \Rightarrow x \in A \]

(by the Rule of Conjunctive Simplification, Rule 7 of Table 2.19), and

\[ x \in A \Rightarrow (x \in A \vee x \in B) \Rightarrow x \in A \cup B \]

(where the first logical implication is a result of the Rule of Disjunctive Amplification, Rule 8 of Table 2.19).

Motivated by parts (d), (f), and (g) of Example 3.15, we introduce the following general ideas.

Definition 3.6   Let $S, T \subseteq \mathcal{U}$. The sets S and T are called disjoint, or mutually disjoint, when $S \cap T = \emptyset$.

THEOREM 3.3          If $S, T \subseteq \mathcal{U}$, then S and T are disjoint if and only if $S \cup T = S \triangle T$.
Proof: We start with S, T disjoint. (To prove that $S \cup T = S \triangle T$ we use Definition 3.2. In particular, we shall provide two element arguments, one for each inclusion.) Consider each x in $\mathcal{U}$. If $x \in S \cup T$, then $x \in S$ or $x \in T$ (or perhaps both). But with S and T disjoint, $x \notin S \cap T$ so $x \in S \triangle T$. Consequently, because $x \in S \cup T$ implies $x \in S \triangle T$, we have $S \cup T \subseteq S \triangle T$. For the opposite inclusion, if $y \in S \triangle T$, then $y \in S$ or $y \in T$. (But $y \notin S \cap T$; we don’t actually use this here.) So $y \in S \cup T$. Therefore $S \triangle T \subseteq S \cup T$. And now that we have $S \cup T \subseteq S \triangle T$ and $S \triangle T \subseteq S \cup T$, it follows from Definition 3.2 that $S \triangle T = S \cup T$.
   We prove the converse by the method of proof by contradiction. To do so we consider any $S, T \subseteq \mathcal{U}$ and keep the hypothesis (that is, that $S \cup T = S \triangle T$) as is, but we assume the negation of the conclusion (that is, we assume that S and T are not disjoint). So if $S \cap T \ne \emptyset$, let $x \in S \cap T$. Then $x \in S$ and $x \in T$, so $x \in S \cup T$ and

\[ x \in S \triangle T\ (= S \cup T). \]

But when $x \in S \cup T$ and $x \in S \cap T$, then

\[ x \notin S \triangle T. \]

From this contradiction, namely $x \in S \triangle T \wedge x \notin S \triangle T$, we realize that our original assumption was incorrect. Consequently, we have S and T disjoint.

In proving the first part of Theorem 3.3 we showed that if S, T are any sets, then $S \triangle T \subseteq S \cup T$. The disjointness of S and T was needed only for the opposite inclusion.
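A brute-force check over a small universe is no substitute for the proof, but it can make the statement concrete. The sketch below assumes the universe {1, 2, 3, 4} and a hypothetical helper `subsets`; it tests both Theorem 3.3 and the remark that follows it.

```python
from itertools import combinations

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(chosen)
            for size in range(len(items) + 1)
            for chosen in combinations(items, size)]

U = {1, 2, 3, 4}
all_subsets = subsets(U)

# Theorem 3.3: S and T are disjoint exactly when S ∪ T = S Δ T.
theorem_holds = all(
    S.isdisjoint(T) == ((S | T) == (S ^ T))
    for S in all_subsets
    for T in all_subsets
)
# Remark after the proof: S Δ T ⊆ S ∪ T for any S, T.
inclusion_holds = all(
    (S ^ T).issubset(S | T) for S in all_subsets for T in all_subsets
)

print(len(all_subsets), theorem_holds, inclusion_holds)   # 16 True True
```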

After mastering the skill of addition, one usually comes next to subtraction. Here the set $\mathbf{N}$ causes some difficulty. For example, $\mathbf{N}$ contains 2 and 5 but 2 − 5 = −3, and $-3 \notin \mathbf{N}$. Therefore the binary operation of subtraction is not closed for $\mathbf{N}$, although it is closed for the superset $\mathbf{Z}$ of $\mathbf{N}$. So for $\mathbf{Z}$ we can introduce the unary, or monary, operation of negation where we take the “minus” or “negative” of a number such as 3, getting −3.
   We now introduce a comparable unary operation for sets.

Definition 3.7          For a set $A \subseteq \mathcal{U}$, the complement of A, denoted $\mathcal{U} - A$, or $\overline{A}$, is given by $\{x \mid x \in \mathcal{U} \wedge x \notin A\}$.

EXAMPLE 3.16            For the sets of Example 3.15, $\overline{A}$ = {6, 7, 8, 9, 10}, $\overline{B}$ = {1, 2, 8, 9, 10}, and $\overline{C}$ = {1, 2, 3, 4, 5, 6, 10}.

For every universe $\mathcal{U}$ and every set $A \subseteq \mathcal{U}$, we find that $\overline{A} \subseteq \mathcal{U}$. Therefore $\mathcal{P}(\mathcal{U})$ is closed under the unary operation defined by the complement.
   The following concept is related to the concept of the complement.

Definition 3.8          For $A, B \subseteq \mathcal{U}$, the (relative) complement of A in B, denoted B − A, is given by $\{x \mid x \in B \wedge x \notin A\}$.

EXAMPLE 3.17            For the sets of Example 3.15 we have:

a) B − A = {6, 7}          b) A − B = {1, 2}                  c) A − C = A
d) C − A = C               e) $A - A = \emptyset$             f) $\mathcal{U} - A = \overline{A}$
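Both kinds of complement map onto Python's set difference operator `-`. In the sketch below the function `complement` is a hypothetical helper that fixes the universe of Example 3.15; Python sets have no built-in notion of a universe.

```python
U = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}
C = {7, 8, 9}

def complement(S, universe=U):
    """Complement of S relative to the universe: {x in universe | x not in S}."""
    return universe - S

print(sorted(complement(A)))                    # [6, 7, 8, 9, 10]
print(sorted(complement(B)))                    # [1, 2, 8, 9, 10]
print(sorted(complement(C)))                    # [1, 2, 3, 4, 5, 6, 10]
print(sorted(B - A), sorted(A - B))             # [6, 7] [1, 2]
print(A - C == A, C - A == C, A - A == set())   # True True True
```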

In order to motivate our next theorem, we first consider the following.

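EXAMPLE 3.18            For $\mathcal{U} = \mathbf{R}$, let A = [1, 2] and B = [1, 3). Then we find that
   a) $A = \{x \mid 1 \le x \le 2\} \subseteq \{x \mid 1 \le x < 3\} = B$
   b) $A \cup B = \{x \mid 1 \le x < 3\} = B$
   c) $A \cap B = \{x \mid 1 \le x \le 2\} = A$
   d) $\overline{B} = (-\infty, 1) \cup [3, +\infty) \subseteq (-\infty, 1) \cup (2, +\infty) = \overline{A}$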

This next theorem now shows us that the four results in Example 3.18 are related in
                              general. In order to prove this theorem we again make use of Definition 3.2, as we discover
                              the interplay between the notions of subset, union, intersection, and complement.

THEOREM 3.4                   For any universe $\mathcal{U}$ and any sets $A, B \subseteq \mathcal{U}$, the following statements are equivalent:
   a) $A \subseteq B$                         b) $A \cup B = B$
   c) $A \cap B = A$                          d) $\overline{B} \subseteq \overline{A}$
Proof: In order to prove the theorem, we prove that (a) ⇒ (b), (b) ⇒ (c), (c) ⇒ (d), and (d) ⇒ (a). (The reason this suffices to prove this theorem is based on the idea presented in Exercise 13 at the end of Section 2.2.)

   i) (a) ⇒ (b)   If A, B are any sets, then $B \subseteq A \cup B$ (as mentioned after Example 3.15). For the opposite inclusion, if $x \in A \cup B$, then $x \in A$ or $x \in B$, but since $A \subseteq B$, in either case we have $x \in B$. So $A \cup B \subseteq B$ and, since we now have both inclusions, it follows (once again from Definition 3.2) that $A \cup B = B$.
   ii) (b) ⇒ (c)   Given sets A, B, we always have $A \supseteq A \cap B$ (as mentioned after Example 3.15). For the opposite inclusion, let $y \in A$. With $A \cup B = B$, $y \in A \Rightarrow y \in A \cup B \Rightarrow y \in B$ (since $A \cup B = B$) $\Rightarrow y \in A \cap B$, so $A \subseteq A \cap B$ and we conclude that $A = A \cap B$.
   iii) (c) ⇒ (d)   We know that $z \in \overline{B} \Rightarrow z \notin B$. Now if $z \in A \cap B$, then $z \in B$, since $A \cap B \subseteq B$. The contradiction, namely $z \notin B \wedge z \in B$, tells us that $z \notin A \cap B$. Therefore, $z \notin A$ because $A \cap B = A$. But $z \notin A \Rightarrow z \in \overline{A}$, so $\overline{B} \subseteq \overline{A}$.
   iv) (d) ⇒ (a)   Last, $w \in A \Rightarrow w \notin \overline{A}$. If $w \notin B$, then $w \in \overline{B}$. With $\overline{B} \subseteq \overline{A}$ it then follows that $w \in \overline{A}$. This time we get the contradiction $w \notin \overline{A} \wedge w \in \overline{A}$, and this tells us that $w \in B$. Hence $A \subseteq B$.
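To see the equivalence at work, one can evaluate the four statements for every pair of subsets of a small universe and confirm that they are always all true or all false together. This is a sketch under the assumption that the universe is {1, 2, 3, 4}; `subsets` and `statements` are hypothetical helper names.

```python
from itertools import combinations

U = frozenset(range(1, 5))

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]

def statements(A, B):
    """Truth values of statements (a)-(d) of Theorem 3.4 for subsets A, B of U."""
    return (
        A.issubset(B),               # (a) A ⊆ B
        (A | B) == B,                # (b) A ∪ B = B
        (A & B) == A,                # (c) A ∩ B = A
        (U - B).issubset(U - A),     # (d) complement of B ⊆ complement of A
    )

# The four truth values agree for every pair, so the set of values has size 1.
equivalent = all(
    len(set(statements(A, B))) == 1
    for A in subsets(U)
    for B in subsets(U)
)
print(equivalent)   # True
```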

With a bit of theorem proving under our belts, we now introduce some of the major laws that govern set theory. These bear a marked resemblance to the laws of logic given in Section 2.2. In many instances these set theoretic laws are similar to the arithmetic properties of the real numbers, where “$\cup$” plays the role of “+” and “$\cap$” the role of “×.” However, there are several differences.

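The Laws of Set Theory
  For any sets A, B, and C taken from a universe $\mathcal{U}$

   1) $\overline{\overline{A}} = A$                                        Law of Double Complement
   2) $\overline{A \cup B} = \overline{A} \cap \overline{B}$               DeMorgan’s Laws
      $\overline{A \cap B} = \overline{A} \cup \overline{B}$
   3) $A \cup B = B \cup A$                                                Commutative Laws
      $A \cap B = B \cap A$
   4) $A \cup (B \cup C) = (A \cup B) \cup C$                              Associative Laws
      $A \cap (B \cap C) = (A \cap B) \cap C$
   5) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$                     Distributive Laws
      $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
   6) $A \cup A = A$                                                       Idempotent Laws
      $A \cap A = A$
   7) $A \cup \emptyset = A$                                               Identity Laws
      $A \cap \mathcal{U} = A$
   8) $A \cup \overline{A} = \mathcal{U}$                                  Inverse Laws
      $A \cap \overline{A} = \emptyset$
   9) $A \cup \mathcal{U} = \mathcal{U}$                                   Domination Laws
      $A \cap \emptyset = \emptyset$
  10) $A \cup (A \cap B) = A$                                              Absorption Laws
      $A \cap (A \cup B) = A$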
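The ten pairs of laws can be spot-checked exhaustively over a small universe. The sketch below assumes the universe {1, 2, 3}, which gives 8 subsets and 512 triples (A, B, C); `subsets`, `comp`, and `laws_hold` are hypothetical helper names. Passing this check illustrates the laws but does not prove them for arbitrary universes.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})      # small universe chosen for illustration
EMPTY = frozenset()

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]

def comp(S):
    """Complement of S in U."""
    return U - S

def laws_hold(A, B, C):
    """Check laws 1-10 for one choice of A, B, C."""
    checks = [
        comp(comp(A)) == A,                                  # 1 double complement
        comp(A | B) == (comp(A) & comp(B)),                  # 2 DeMorgan
        comp(A & B) == (comp(A) | comp(B)),
        (A | B) == (B | A), (A & B) == (B & A),              # 3 commutative
        (A | (B | C)) == ((A | B) | C),                      # 4 associative
        (A & (B & C)) == ((A & B) & C),
        (A | (B & C)) == ((A | B) & (A | C)),                # 5 distributive
        (A & (B | C)) == ((A & B) | (A & C)),
        (A | A) == A, (A & A) == A,                          # 6 idempotent
        (A | EMPTY) == A, (A & U) == A,                      # 7 identity
        (A | comp(A)) == U, (A & comp(A)) == EMPTY,          # 8 inverse
        (A | U) == U, (A & EMPTY) == EMPTY,                  # 9 domination
        (A | (A & B)) == A, (A & (A | B)) == A,              # 10 absorption
    ]
    return all(checks)

triples = list(product(subsets(U), repeat=3))
print(len(triples), all(laws_hold(A, B, C) for A, B, C in triples))  # 512 True
```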

All these laws can be established by element arguments, as in the first part of the proof
                              of Theorem 3.3. We demonstrate this by establishing the first of DeMorgan’s Laws and the
                              second Distributive Law, that of intersection over union.
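Proof: Let $x \in \mathcal{U}$. Then

$x \in \overline{A \cup B} \Rightarrow x \notin A \cup B$
$\Rightarrow x \notin A$ and $x \notin B$
$\Rightarrow x \in \overline{A}$ and $x \in \overline{B}$
$\Rightarrow x \in \overline{A} \cap \overline{B}$,

so $\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$. To establish the opposite inclusion, we check to see that the converse of
each logical implication is also a logical implication (that is, that each logical implication
is, in fact, a logical equivalence). As a result we find that

$x \in \overline{A} \cap \overline{B} \Rightarrow x \in \overline{A}$ and $x \in \overline{B}$
$\Rightarrow x \notin A$ and $x \notin B$
$\Rightarrow x \notin A \cup B$
$\Rightarrow x \in \overline{A \cup B}$.

Therefore $\overline{A} \cap \overline{B} \subseteq \overline{A \cup B}$. Consequently, with $\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$ and $\overline{A} \cap \overline{B} \subseteq \overline{A \cup B}$, it follows
from Definition 3.2 that $\overline{A \cup B} = \overline{A} \cap \overline{B}$.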
In our second proof, we shall establish both subset relations simultaneously by using the
logical equivalence ($\Leftrightarrow$) as opposed to the logical implications ($\Rightarrow$ and $\Leftarrow$).
Proof: For each $x \in \mathcal{U}$,

$x \in A \cap (B \cup C)$
$\Leftrightarrow (x \in A)$ and $(x \in B \cup C)$
$\Leftrightarrow (x \in A)$ and $(x \in B$ or $x \in C)$
$\Leftrightarrow (x \in A$ and $x \in B)$ or $(x \in A$ and $x \in C)$
$\Leftrightarrow (x \in A \cap B)$ or $(x \in A \cap C)$
$\Leftrightarrow x \in (A \cap B) \cup (A \cap C)$.

As we have equivalent statements throughout, we have established both subset relations
simultaneously, so $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$. (The equivalence of the third and
fourth statements follows from the comparable principle in the laws of logic, namely, the
Distributive Law of conjunction over disjunction.)

The reader undoubtedly expects the pairing of the laws in items 2 through 10 to have
some importance. As with the laws of logic, these pairs of statements are called duals. One
statement can be obtained from the other by replacing all occurrences of $\cup$ by $\cap$ and vice
versa, and all occurrences of $\mathcal{U}$ by $\emptyset$ and vice versa.
This leads us to the following formal idea.

Definition 3.9          Let $s$ be a (general) statement dealing with the equality of two set expressions. Each such
expression may involve one or more occurrences of sets (such as $A$, $\overline{A}$, $B$, $\overline{B}$, etc.), one or
more occurrences of $\emptyset$ and $\mathcal{U}$, and only the set operation symbols $\cap$ and $\cup$. The dual of $s$,
denoted $s^d$, is obtained from $s$ by replacing (1) each occurrence of $\emptyset$ and $\mathcal{U}$ (in $s$) by $\mathcal{U}$ and
$\emptyset$, respectively; and (2) each occurrence of $\cap$ and $\cup$ (in $s$) by $\cup$ and $\cap$, respectively.
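
The symbol swap in Definition 3.9 is purely mechanical, so it can be sketched as a character substitution. The helper below is only an illustration: the function name `dual` and the convention of writing the universe as the plain letter `U` (so that no set in the statement may itself be named `U`) are assumptions made for this sketch.

```python
# Hypothetical helper illustrating Definition 3.9.
# Assumed plain-text conventions: "∪" union, "∩" intersection,
# "∅" the empty set, and the letter "U" for the universe.
_SWAP = str.maketrans({"∪": "∩", "∩": "∪", "∅": "U", "U": "∅"})


def dual(statement: str) -> str:
    """Return the dual of a set-equality statement by swapping symbols."""
    # translate() replaces every character in one pass, so a swapped
    # symbol is never swapped back again.
    return statement.translate(_SWAP)


print(dual("A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)"))  # A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
print(dual("A ∪ ∅ = A"))                        # A ∩ U = A
```

Applying `dual` twice returns the original statement, which matches the way the laws in items 2 through 10 come in pairs.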

As in Section 2.2, we shall state and use the following theorem. We shall prove a more
                  general result in Chapter 15.

THEOREM 3.5       The Principle of Duality. Let $s$ denote a theorem dealing with the equality of two set
expressions (involving only the set operations $\cap$ and $\cup$ as described in Definition 3.9). Then
$s^d$, the dual of $s$, is also a theorem.

Using this principle cuts our work down considerably. For each pair of laws in items
                  2 through 10, one need prove only one of the statements and then invoke this principle to
                  obtain the other statement in the pair.

We must be careful about applying Theorem 3.5. This result cannot be applied to particular
situations but only to results (theorems) about sets in general. For example, let
us consider the particular situation where $\mathcal{U} = \{1, 2, 3, 4, 5\}$ and $A = \{1, 2, 3, 4\}$,
$B = \{1, 2, 3, 5\}$, $C = \{1, 2\}$, and $D = \{1, 3\}$. Under these circumstances

$A \cap B = \{1, 2, 3\} = C \cup D$.

However, we cannot infer that $s$: $A \cap B = C \cup D \Rightarrow s^d$: $A \cup B = C \cap D$. For here
$A \cup B = \{1, 2, 3, 4, 5\}$, whereas $C \cap D = \{1\}$. The reason why Theorem 3.5 is not applicable
here is that although $A \cap B = C \cup D$ in this particular example, it is not true in
general (that is, for any sets $A$, $B$, $C$, and $D$ taken from a universe $\mathcal{U}$).
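
This particular situation is small enough to check directly. The following is a minimal sketch using Python's built-in `set` type, where `&` is intersection and `|` is union; the variable names `s_holds` and `dual_holds` are introduced here only for illustration.

```python
# The particular sets from the discussion of Theorem 3.5.
U = {1, 2, 3, 4, 5}
A = {1, 2, 3, 4}
B = {1, 2, 3, 5}
C = {1, 2}
D = {1, 3}

s_holds = (A & B) == (C | D)      # the particular statement s
dual_holds = (A | B) == (C & D)   # its dual s^d

print(A & B, C | D, s_holds)      # both sides equal {1, 2, 3}
print(A | B, C & D, dual_holds)   # the two sides differ
```

The first comparison succeeds and the second fails, which is consistent with the warning that duality applies to theorems about sets in general, not to coincidences among particular sets.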

EXAMPLE 3.19      Inasmuch as Definition 3.9 and Theorem 3.5 do not mention anything about subsets, can
we find a dual for the statement $A \subseteq B$ (where $A, B \subseteq \mathcal{U}$)?
Here we get an opportunity to use some of the results in Theorem 3.4. We can deal with
the statement $A \subseteq B$ by using the equivalent statement $A \cup B = B$.
The dual of $A \cup B = B$ gives us $A \cap B = B$. But $A \cap B = B \Leftrightarrow B \subseteq A$. Consequently,
the dual of the statement $A \subseteq B$ is the statement $B \subseteq A$. (We could also have obtained this
result by using $A \subseteq B \Leftrightarrow A \cap B = A$.)

When we consider the relations that may exist among the sets that are involved in a
set-equality or subset statement, we can investigate the situation graphically.
Named in honor of the English logician John Venn (1834-1923), a Venn diagram is
constructed as follows: $\mathcal{U}$ is depicted as the interior of a rectangle, while subsets of $\mathcal{U}$
are represented by the interiors of circles and other closed curves. Figure 3.6 shows four
Venn diagrams. The (blue) shaded region in Fig. 3.6(a) represents the set $A$, whereas $\overline{A}$ is
represented by the unshaded area. The shaded region in Fig. 3.6(b) comprises $A \cup B$; the
set $A \cap B$ is represented by the shaded region in Fig. 3.6(c). The Venn diagram for $A - B$
is given in part (d) of this figure.
In Fig. 3.7 Venn diagrams are used to establish the second of DeMorgan's Laws. Figure
3.7(a) has everything except $A \cap B$ shaded, so the shaded portion represents $\overline{A \cap B}$. We
now develop a Venn diagram to depict $\overline{A} \cup \overline{B}$. In Fig. 3.7(b), $\overline{A}$ is the shaded region (outside
the circle representing set $A$). Likewise, $\overline{B}$ is the shaded region shown in Fig. 3.7(c). When
the results from Fig. 3.7(b) and Fig. 3.7(c) are put together, we get the Venn diagram for
their union in Fig. 3.7(d). Since the shaded region in part (d) is the same as that in part (a),
it follows that $\overline{A \cap B} = \overline{A} \cup \overline{B}$.

[Figure 3.6: Venn diagrams, panels (a), (b), (c), (d)]

[Figure 3.7: Venn diagrams, panels (a), (b), (c), (d)]

We further illustrate the use of these diagrams by showing that for any sets $A, B, C \subseteq \mathcal{U}$,

$\overline{(A \cup B) \cap C} = (\overline{A} \cap \overline{B}) \cup \overline{C}$.

Instead of shading regions, another approach that also uses Venn diagrams numbers the
regions as shown in Fig. 3.8 where, for example, region 3 is $\overline{A} \cap B \cap \overline{C}$ and region 7 is
$A \cap \overline{B} \cap C$. Each region is a set of the form $S_1 \cap S_2 \cap S_3$, where $S_1$ is replaced by $A$ or
$\overline{A}$, $S_2$ by $B$ or $\overline{B}$, and $S_3$ by $C$ or $\overline{C}$. Consequently, by the rule of product, there are eight
possible regions.

[Figure 3.8: Venn diagram for $A$, $B$, $C$ with regions numbered 1 through 8]

Consulting Fig. 3.8, we see that $A \cup B$ comprises regions 2, 3, 5, 6, 7, 8 and that regions
4, 6, 7, 8 make up set $C$. Therefore $(A \cup B) \cap C$ comprises the regions common to $A \cup B$
and $C$: namely, regions 6, 7, 8. Consequently, $\overline{(A \cup B) \cap C}$ is made up of regions 1, 2, 3, 4,
5. The set $\overline{A}$ consists of regions 1, 3, 4, 6, while regions 1, 2, 4, 7 make up $\overline{B}$. Consequently,
$\overline{A} \cap \overline{B}$ comprises regions 1 and 4. Since regions 4, 6, 7, 8 comprise $C$, the set $\overline{C}$ is made
up of regions 1, 2, 3, 5. Taking the union of $\overline{A} \cap \overline{B}$ with $\overline{C}$, we then finish with regions 1,
2, 3, 4, 5, as we did for $\overline{(A \cup B) \cap C}$.
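
The region bookkeeping above can be replayed by treating the region numbers themselves as the elements of a small universe. This is only a sketch: the memberships of `A`, `B`, and `C` below are read off from the region lists quoted in the text (for instance, $\overline{A}$ = regions 1, 3, 4, 6 gives `A = {2, 5, 7, 8}`), and the names `lhs` and `rhs` are introduced for illustration.

```python
# Regions 1..8 of Fig. 3.8 serve as the elements of the universe.
U = set(range(1, 9))
A = {2, 5, 7, 8}   # complement of regions 1, 3, 4, 6
B = {3, 5, 6, 8}   # complement of regions 1, 2, 4, 7
C = {4, 6, 7, 8}

lhs = U - ((A | B) & C)              # complement of (A ∪ B) ∩ C
rhs = ((U - A) & (U - B)) | (U - C)  # (Ā ∩ B̄) ∪ C̄

print(sorted(lhs))  # [1, 2, 3, 4, 5]
print(sorted(rhs))  # [1, 2, 3, 4, 5]
```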

One more technique for establishing set equalities is the membership table. (This method
is akin to using the truth table introduced in Section 2.1.)

We observe that for sets $A, B \subseteq \mathcal{U}$, an element $x \in \mathcal{U}$ satisfies exactly one of the following
four situations:
  a) $x \notin A$, $x \notin B$                                     b) $x \notin A$, $x \in B$
  c) $x \in A$, $x \notin B$                                        d) $x \in A$, $x \in B$.

When $x$ is an element of a given set, we write a 1 in the column representing that set in
the membership table; when $x$ is not in the set, we enter a 0. Table 3.2 gives the membership
tables for $A \cap B$, $A \cup B$, $\overline{A}$ in this notation. Here, for example, the third row in part (a) of
the table tells us that when an element $x \in \mathcal{U}$ is in set $A$ but not in $B$, then it is not in $A \cap B$
but it is in $A \cup B$.

Table 3.2

(a)
  A | B | $A \cap B$ | $A \cup B$
  0 | 0 |     0      |     0
  0 | 1 |     0      |     1
  1 | 0 |     0      |     1
  1 | 1 |     1      |     1

(b)
  A | $\overline{A}$
  0 |      1
  1 |      0

These binary operations on 0 and 1 are the same as in ordinary arithmetic (relative to $\cdot$
and $+$) except that $1 \cup 1 = 1$.
Using membership tables, we can establish the equality of two sets by comparing their
respective columns in the table. Table 3.3 demonstrates this for the Distributive Law of
union over intersection. We see here how each of the eight rows corresponds with exactly
one of the eight regions in the Venn diagram of Fig. 3.8. For example, row 1 corresponds
with region 1: $\overline{A} \cap \overline{B} \cap \overline{C}$; and row 6 corresponds with region 7: $A \cap \overline{B} \cap C$.

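Table 3.3

  A | B | C | $B \cap C$ | $A \cup (B \cap C)$ | $A \cup B$ | $A \cup C$ | $(A \cup B) \cap (A \cup C)$
  0 | 0 | 0 |     0      |          0          |     0      |     0      |              0
  0 | 0 | 1 |     0      |          0          |     0      |     1      |              0
  0 | 1 | 0 |     0      |          0          |     1      |     0      |              0
  0 | 1 | 1 |     1      |          1          |     1      |     1      |              1
  1 | 0 | 0 |     0      |          1          |     1      |     1      |              1
  1 | 0 | 1 |     0      |          1          |     1      |     1      |              1
  1 | 1 | 0 |     0      |          1          |     1      |     1      |              1
  1 | 1 | 1 |     1      |          1          |     1      |     1      |              1

Since the columns for $A \cup (B \cap C)$ and $(A \cup B) \cap (A \cup C)$ are identical, we conclude
that $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.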
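
A membership table of this kind can also be generated mechanically. The sketch below assumes the encoding used in the text (1 for "is a member", 0 for "is not") and uses Python's bitwise `|` and `&` on 0 and 1 to play the roles of $\cup$ and $\cap$; the function name `membership_columns` is hypothetical.

```python
from itertools import product


def membership_columns():
    """Build the two compared columns of the table, row by row."""
    left, right = [], []
    # product((0, 1), repeat=3) lists (A, B, C) as 000, 001, ..., 111,
    # the same row order as in the table.
    for a, b, c in product((0, 1), repeat=3):
        left.append(a | (b & c))          # A ∪ (B ∩ C)
        right.append((a | b) & (a | c))   # (A ∪ B) ∩ (A ∪ C)
    return left, right


left, right = membership_columns()
print(left)           # [0, 0, 0, 1, 1, 1, 1, 1]
print(right)          # [0, 0, 0, 1, 1, 1, 1, 1]
print(left == right)  # True
```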

Before we continue let us make two points. (1) A Venn diagram is simply a graphical
                             representation of a membership table. (2) The use of Venn diagrams and/or membership
                             tables may be appealing, especially to the reader who presently does not appreciate writing
                             proofs. However, neither one of these techniques specifies the logic and reasoning displayed
in the element arguments we presented, for instance, to prove that for any $A, B, C \subseteq \mathcal{U}$,

$\overline{A \cup B} = \overline{A} \cap \overline{B}$,    and    $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.

We feel that Venn diagrams may help us to understand certain mathematical situations
                             — but when the number of sets involved exceeds three, the diagram could be difficult to
                             draw.
                                 In summary, let us agree that the element argument (especially with its detailed explana-
                             tions) is more rigorous than these two techniques and is the preferred method for proving
                             results in set theory.

Now that we have the laws of set theory, what can we do with them? The following
                             examples will demonstrate how the laws are used to simplify a complicated set expression
                             or to derive new set equalities. (When more than one law is used in a given step, we list the
                             principal law as the reason.)

EXAMPLE 3.20           Simplify the expression $\overline{\overline{(A \cup B) \cap C} \cup \overline{B}}$.

$\overline{\overline{(A \cup B) \cap C} \cup \overline{B}}$                                   Reasons
$= \overline{\overline{((A \cup B) \cap C)}} \cap \overline{\overline{B}}$                    DeMorgan's Law
$= ((A \cup B) \cap C) \cap B$                                                                Law of Double Complement
$= (A \cup B) \cap (C \cap B)$                                                                Associative Law of Intersection
$= (A \cup B) \cap (B \cap C)$                                                                Commutative Law of Intersection
$= [(A \cup B) \cap B] \cap C$                                                                Associative Law of Intersection
$= B \cap C$                                                                                  Absorption Law

The reader should note the similarity between the steps and reasons in this example and
those for simplifying the statement

$\neg[\neg[(p \lor q) \land r] \lor \neg q]$

to the statement

$q \land r$

in Example 2.17.
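
A simplification such as the one in Example 3.20 can be sanity-checked by brute force over every choice of $A$, $B$, $C$ from a small universe. The following sketch assumes a hypothetical three-element universe; the helper names `subsets`, `comp`, and `simplification_holds` are introduced here for illustration. Such a check supports the result but does not replace the derivation from the laws.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})  # hypothetical small universe


def subsets(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def comp(x):
    """Complement relative to the universe U."""
    return U - x


def simplification_holds():
    """Compare the original expression with B ∩ C for all A, B, C ⊆ U."""
    return all(
        comp(comp((a | b) & c) | comp(b)) == (b & c)
        for a, b, c in product(subsets(U), repeat=3)
    )


print(simplification_holds())  # True
```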

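EXAMPLE 3.21      Express $\overline{A - B}$ in terms of $\cup$ and complement.
From the definition of relative complement, $A - B = \{x \mid x \in A \land x \notin B\} = A \cap \overline{B}$.
Therefore,

$\overline{A - B} = \overline{A \cap \overline{B}}$              Reasons
$= \overline{A} \cup \overline{\overline{B}}$                    DeMorgan's Law
$= \overline{A} \cup B$                                          Law of Double Complement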

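EXAMPLE 3.22        From the observation made in Example 3.21, we have $A \triangle B =
\{x \mid x \in A \cup B \land x \notin A \cap B\} = (A \cup B) - (A \cap B) = (A \cup B) \cap \overline{(A \cap B)}$, so

$\overline{A \triangle B} = \overline{(A \cup B) \cap \overline{(A \cap B)}}$                                           Reasons
$= \overline{(A \cup B)} \cup \overline{\overline{(A \cap B)}}$                                                         DeMorgan's Law
$= \overline{(A \cup B)} \cup (A \cap B)$                                                                               Law of Double Complement
$= (A \cap B) \cup \overline{(A \cup B)}$                                                                               Commutative Law of $\cup$
$= (A \cap B) \cup (\overline{A} \cap \overline{B})$                                                                    DeMorgan's Law
$= [(A \cap B) \cup \overline{A}] \cap [(A \cap B) \cup \overline{B}]$                                                  Distributive Law of $\cup$ over $\cap$
$= [(A \cup \overline{A}) \cap (B \cup \overline{A})] \cap [(A \cup \overline{B}) \cap (B \cup \overline{B})]$          Distributive Law of $\cup$ over $\cap$
$= [\mathcal{U} \cap (B \cup \overline{A})] \cap [(A \cup \overline{B}) \cap \mathcal{U}]$                              Inverse Law
$= (B \cup \overline{A}) \cap (A \cup \overline{B})$                                                                    Identity Law
$= (\overline{A} \cup B) \cap (A \cup \overline{B})$                                                                    Commutative Law of $\cup$
$= (\overline{A} \cup B) \cap \overline{(\overline{A} \cap B)}$                                                         DeMorgan's Law
$= \overline{A} \triangle B$

Alternatively, continuing from $(\overline{A} \cup B) \cap (A \cup \overline{B})$,

$= (A \cup \overline{B}) \cap (\overline{A} \cup B)$                                                                    Commutative Law of $\cap$
$= (A \cup \overline{B}) \cap \overline{(A \cap \overline{B})}$                                                         DeMorgan's Law
$= A \triangle \overline{B}$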
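
The chain of equalities in Example 3.22 can be spot-checked with bitmasks. In this sketch each subset of a hypothetical four-element universe is encoded as an integer from 0 to 15, so that `|`, `&`, and `^` act as union, intersection, and symmetric difference; the names `FULL`, `comp`, and `example_3_22_holds` are assumptions made for illustration.

```python
FULL = 0b1111  # bitmask standing for a hypothetical 4-element universe


def comp(x: int) -> int:
    """Complement of the subset x relative to the universe FULL."""
    return ~x & FULL


def example_3_22_holds() -> bool:
    """Check the equalities of Example 3.22 for every pair of subsets."""
    for a in range(FULL + 1):
        for b in range(FULL + 1):
            target = comp(a ^ b)                    # complement of A Δ B
            middle = (comp(a) | b) & (a | comp(b))  # (Ā ∪ B) ∩ (A ∪ B̄)
            if not (target == middle == (comp(a) ^ b) == (a ^ comp(b))):
                return False
    return True


print(example_3_22_holds())  # True
```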

In closing this section we extend the set operations of $\cup$ and $\cap$ beyond three sets.

Definition 3.10    Let $I$ be a nonempty set and $\mathcal{U}$ a universe. For each $i \in I$ let $A_i \subseteq \mathcal{U}$. Then $I$ is called an
index set (or set of indices), and each $i \in I$ is called an index.
Under these conditions,

$\bigcup_{i \in I} A_i = \{x \mid x \in A_i$ for at least one $i \in I\}$,    and

$\bigcap_{i \in I} A_i = \{x \mid x \in A_i$ for every $i \in I\}$.

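We can rephrase Definition 3.10 by using quantifiers:

$x \in \bigcup_{i \in I} A_i \Leftrightarrow \exists i \in I \, (x \in A_i)$          $x \in \bigcap_{i \in I} A_i \Leftrightarrow \forall i \in I \, (x \in A_i)$

Then $x \notin \bigcup_{i \in I} A_i \Leftrightarrow \neg[\exists i \in I \, (x \in A_i)] \Leftrightarrow \forall i \in I \, (x \notin A_i)$; that is, $x \notin \bigcup_{i \in I} A_i$ if and
only if $x \notin A_i$ for every index $i \in I$. Similarly, $x \notin \bigcap_{i \in I} A_i \Leftrightarrow \neg[\forall i \in I \, (x \in A_i)] \Leftrightarrow
\exists i \in I \, (x \notin A_i)$; that is, $x \notin \bigcap_{i \in I} A_i$ if and only if $x \notin A_i$ for at least one index $i \in I$.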

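If the index set $I$ is the set $\mathbf{Z}^+$, we can write

$\bigcup_{i \in \mathbf{Z}^+} A_i = A_1 \cup A_2 \cup \cdots = \bigcup_{i=1}^{\infty} A_i$,          $\bigcap_{i \in \mathbf{Z}^+} A_i = A_1 \cap A_2 \cap \cdots = \bigcap_{i=1}^{\infty} A_i$.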
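EXAMPLE 3.23      Let $I = \{3, 4, 5, 6, 7\}$, and for each $i \in I$ let $A_i = \{1, 2, 3, \ldots, i\} \subseteq \mathcal{U} = \mathbf{Z}^+$. Then
$\bigcup_{i \in I} A_i = \bigcup_{i=3}^{7} A_i = \{1, 2, 3, \ldots, 7\} = A_7$, whereas $\bigcap_{i \in I} A_i = \{1, 2, 3\} = A_3$.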
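
For a finite index set such as the one in Example 3.23, the generalized union and intersection can be computed directly. This is a minimal sketch in which the family is stored as a dictionary keyed by index; the names `family`, `big_union`, and `big_intersection` are introduced here for illustration.

```python
# Index set I and the family A_i = {1, 2, ..., i} from Example 3.23.
I = [3, 4, 5, 6, 7]
family = {i: set(range(1, i + 1)) for i in I}

# x is in the union when it lies in at least one A_i,
# and in the intersection when it lies in every A_i.
big_union = set().union(*family.values())
big_intersection = set.intersection(*family.values())

print(sorted(big_union))         # [1, 2, 3, 4, 5, 6, 7]
print(sorted(big_intersection))  # [1, 2, 3]
```

The results agree with $A_7$ and $A_3$, respectively. An infinite index set, as in Example 3.24, is outside the reach of this kind of direct computation.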

EXAMPLE 3.24      Let $\mathcal{U} = \mathbf{R}$ and $I = \mathbf{R}^+$. If for each $r \in \mathbf{R}^+$, $A_r = [-r, r]$, then $\bigcup_{r \in I} A_r = \mathbf{R}$ and
$\bigcap_{r \in I} A_r = \{0\}$.

When dealing with generalized unions and intersections, membership tables and Venn
                                    diagrams are unfortunately next to useless, but the rigorous element approach, as demon-
                                    strated in the first part of the proof of Theorem 3.3, is still available.

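THEOREM 3.6       Generalized DeMorgan's Laws. Let $I$ be an index set where for each $i \in I$, $A_i \subseteq \mathcal{U}$. Then

a) $\overline{\bigcup_{i \in I} A_i} = \bigcap_{i \in I} \overline{A_i}$                    b) $\overline{\bigcap_{i \in I} A_i} = \bigcup_{i \in I} \overline{A_i}$

Proof: We shall prove Theorem 3.6(a) and leave the proof of part (b) for the reader. For each
$x \in \mathcal{U}$, $x \in \overline{\bigcup_{i \in I} A_i} \Leftrightarrow x \notin \bigcup_{i \in I} A_i \Leftrightarrow x \notin A_i$, for all $i \in I \Leftrightarrow x \in \overline{A_i}$, for all $i \in I \Leftrightarrow
x \in \bigcap_{i \in I} \overline{A_i}$.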
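
Both parts of Theorem 3.6 can be illustrated on a small finite family. The universe and the family below (the multiples of 2, 3, and 4 among the integers 1 through 12) are hypothetical choices made only for this sketch, and a check on one family is an illustration, not a proof.

```python
# Hypothetical finite universe and indexed family of subsets.
U = set(range(1, 13))
family = {i: {x for x in U if x % i == 0} for i in (2, 3, 4)}

# Part (a): complement of the union = intersection of the complements.
lhs_a = U - set().union(*family.values())
rhs_a = set.intersection(*(U - s for s in family.values()))

# Part (b): complement of the intersection = union of the complements.
lhs_b = U - set.intersection(*family.values())
rhs_b = set().union(*(U - s for s in family.values()))

print(sorted(lhs_a), lhs_a == rhs_a)  # [1, 5, 7, 11] True
print(lhs_b == rhs_b)                 # True
```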

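EXERCISES 3.2

1. For $\mathcal{U} = \{1, 2, 3, \ldots, 9, 10\}$ let $A = \{1, 2, 3, 4, 5\}$, $B = \{1, 2, 4, 8\}$, $C = \{1, 2, 3, 5, 7\}$, and $D = \{2, 4, 6, 8\}$.
Determine each of the following:
    a) $(A \cup B) \cap C$                  b) $A \cup (B \cap C)$
    c) $\overline{C} \cup \overline{D}$     d) $\overline{C \cap D}$
    e) $(A \cup B) - C$                     f) $A \cup (B - C)$
    g) $(B - C) - D$                        h) $B - (C - D)$
    i) $(A \cup B) - (C \cap D)$

2. If $A = [0, 3]$, $B = [2, 7)$, with $\mathcal{U} = \mathbf{R}$, determine each of the following:
    a) $A \cap B$                           b) $A \cup B$
    c) $\overline{A}$                       d) $A \triangle B$
    e) $A - B$                              f) $B - A$

3. a) Determine the sets $A$, $B$ where $A - B = \{1, 3, 7, 11\}$, $B - A = \{2, 6, 8\}$, and $A \cap B = \{4, 9\}$.
    b) Determine the sets $C$, $D$ where $C - D = \{1, 2, 4\}$, $D - C = \{7, 8\}$, and $C \cup D = \{1, 2, 4, 5, 7, 8, 9\}$.

4. Let $A, B, C, D, E \subseteq \mathbf{Z}$ be defined as follows:
    $A = \{2n \mid n \in \mathbf{Z}\}$ (that is, $A$ is the set of all (integer) multiples of 2);
    $B = \{3n \mid n \in \mathbf{Z}\}$;    $C = \{4n \mid n \in \mathbf{Z}\}$;
    $D = \{6n \mid n \in \mathbf{Z}\}$;    and    $E = \{8n \mid n \in \mathbf{Z}\}$.
    a) Which of the following statements are true and which are false?
        i) $E \subseteq C \subseteq A$            ii) $A \subseteq C \subseteq E$
        iii) $B \subseteq D$                      iv) $D \subseteq B$
        v) $D \subseteq A$                        vi) $\overline{D} \subseteq \overline{A}$
    b) Determine each of the following sets.
        i) $C \cap E$          ii) $B \cup D$          iii) $A \cap B$
        iv) $B \cap D$         v) $\overline{A}$       vi) $A \cap E$

5. Determine which of the following statements are true and which are false.
    a) $\mathbf{Z}^+ \subseteq \mathbf{Q}^+$                        b) $\mathbf{Z}^+ \subseteq \mathbf{Q}$
    c) $\mathbf{Q}^+ \subseteq \mathbf{R}$                          d) $\mathbf{R}^+ \subseteq \mathbf{Q}$
    e) $\mathbf{Q}^+ \cap \mathbf{R}^+ = \mathbf{Q}^+$              f) $\mathbf{Z}^+ \cup \mathbf{R}^+ = \mathbf{R}^+$
    g) $\mathbf{R}^+ \cap \mathbf{C} = \mathbf{R}^+$                h) $\mathbf{C} \cup \mathbf{R} = \mathbf{R}$
    i) $\mathbf{Q} \cap \mathbf{Z} = \mathbf{Z}$

6. Prove each of the following results without using Venn diagrams or membership tables. (Assume a universe $\mathcal{U}$.)
    a) If $A \subseteq B$ and $C \subseteq D$, then $A \cap C \subseteq B \cap D$ and $A \cup C \subseteq B \cup D$.
    b) $A \subseteq B$ if and only if $A \cap \overline{B} = \emptyset$.
    c) $A \subseteq B$ if and only if $\overline{A} \cup B = \mathcal{U}$.

7. Prove or disprove each of the following:
    a) For sets $A, B, C \subseteq \mathcal{U}$, $A \cap C = B \cap C \Rightarrow A = B$.
    b) For sets $A, B, C \subseteq \mathcal{U}$, $A \cup C = B \cup C \Rightarrow A = B$.
    c) For sets $A, B, C \subseteq \mathcal{U}$, $[(A \cap C = B \cap C) \land (A \cup C = B \cup C)] \Rightarrow A = B$.
    d) For sets $A, B, C \subseteq \mathcal{U}$, $A \triangle C = B \triangle C \Rightarrow A = B$.

8. Using Venn diagrams, investigate the truth or falsity of each of the following, for sets $A, B, C \subseteq \mathcal{U}$.
    a) $A \triangle (B \cap C) = (A \triangle B) \cap (A \triangle C)$
    b) $A - (B \cup C) = (A - B) \cap (A - C)$
    c) $A \triangle (B \triangle C) = (A \triangle B) \triangle C$

9. If $A = \{a, b, d\}$, $B = \{d, x, y\}$, and $C = \{x, z\}$, how many proper subsets are there for the set $(A \cap B) \cup C$? How many
for the set $A \cap (B \cup C)$?

10. For a given universal set $\mathcal{U}$, each subset $A$ of $\mathcal{U}$ satisfies the idempotent laws of union and intersection. (a) Are there any
real numbers that satisfy an idempotent property for addition? (That is, can we find any real number(s) $x$ such that $x + x = x$?)
(b) Answer part (a) upon replacing addition by multiplication.

11. Write the dual statement for each of the following set-theoretic results.
    a) $\mathcal{U} = (A \cap B) \cup (A \cap \overline{B}) \cup (\overline{A} \cap B) \cup (\overline{A} \cap \overline{B})$
    b) $A = A \cap (A \cup B)$
    c) $A \cup B = (A \cap B) \cup (A \cap \overline{B}) \cup (\overline{A} \cap B)$
    d) $A = (A \cup B) \cap (A \cup \emptyset)$

12. Let $A, B \subseteq \mathcal{U}$. Use the equivalence $A \subseteq B \Leftrightarrow A \cap B = A$
to show that the dual statement of $A \subseteq B$ is the statement $B \subseteq A$.

13. Prove or disprove each of the following for sets $A, B \subseteq \mathcal{U}$.
    a) $\mathcal{P}(A \cup B) = \mathcal{P}(A) \cup \mathcal{P}(B)$
    b) $\mathcal{P}(A \cap B) = \mathcal{P}(A) \cap \mathcal{P}(B)$

14. Use membership tables to establish each of the following:
    a) $\overline{A \cap B} = \overline{A} \cup \overline{B}$                        b) $A \cup A = A$
    c) $A \cup (A \cap B) = A$
    d) $\overline{(A \cap B) \cup (\overline{A} \cap C)} = (A \cap \overline{B}) \cup (\overline{A} \cap \overline{C})$

15. a) How many rows are needed to construct the membership table for $A \cap (B \cup C) \cap (D \cup E \cup F)$?
    b) How many rows are needed to construct the membership table for a set made up from the sets $A_1, A_2, \ldots, A_n$,
    using $\cap$, $\cup$, and complement?
    c) Given the membership tables for two sets $A$, $B$, how can the relation $A \subseteq B$ be recognized?
    d) Use membership tables to determine whether or not $(A \cap B) \cup (B \cap C) \supseteq A \cup B$.

16. Provide the justifications (selected from the laws of set theory) for the steps that are needed to simplify the set
    $(A \cap B) \cup [B \cap ((C \cap D) \cup (C \cap \overline{D}))]$,
where $A, B, C, D \subseteq \mathcal{U}$.

    Steps                                                                          Reasons
    $(A \cap B) \cup [B \cap ((C \cap D) \cup (C \cap \overline{D}))]$
    $= (A \cap B) \cup [B \cap (C \cap (D \cup \overline{D}))]$
    $= (A \cap B) \cup [B \cap (C \cap \mathcal{U})]$
    $= (A \cap B) \cup (B \cap C)$
    $= (B \cap A) \cup (B \cap C)$
    $= B \cap (A \cup C)$

17. Using the laws of set theory, simplify each of the following:
    a) $A \cap (B - A)$
    b) $(A \cap B) \cup (A \cap B \cap \overline{C} \cap D) \cup (\overline{A} \cap B)$
    c) $(A - B) \cup (A \cap B)$
    d) $\overline{A} \cup \overline{B} \cup (A \cap B \cap \overline{C})$

18. For each $n \in \mathbf{Z}^+$ let $A_n = \{1, 2, 3, \ldots, n - 1, n\}$. (Here $\mathcal{U} = \mathbf{Z}^+$ and the index set $I = \mathbf{Z}^+$.) Determine

    $\bigcup_{n=1}^{7} A_n$,    $\bigcap_{n=1}^{11} A_n$,    $\bigcup_{n=1}^{m} A_n$,    and    $\bigcap_{n=1}^{m} A_n$,

where $m$ is a fixed positive integer.

19. Let $\mathcal{U} = \mathbf{R}$ and let $I = \mathbf{Z}^+$. For each $n \in \mathbf{Z}^+$ let $A_n = [-2n, 3n]$. Determine each of the following:
    a) $A_3$                                      b) $A_4$
    c) $A_3 - A_4$                                d) $A_3 \triangle A_4$
    e) $\bigcup_{n=1}^{7} A_n$                    f) $\bigcap_{n=1}^{7} A_n$
    g) $\bigcup_{n \in \mathbf{Z}^+} A_n$         h) $\bigcap_{n=1}^{\infty} A_n$

20. Provide the details for the proof of Theorem 3.6(b).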

3.3 Counting and Venn Diagrams

With all of the theoretical work and theorem proving we did in the last section, now is a
good time to examine some additional counting problems.
For sets $A$, $B$ from a finite universe $\mathcal{U}$, the following Venn diagrams will help us obtain
counting formulas for $|\overline{A}|$ and $|A \cup B|$ in terms of $|\mathcal{U}|$, $|A|$, $|B|$, and $|A \cap B|$.
As Fig. 3.9 demonstrates, $A \cup \overline{A} = \mathcal{U}$ and $A \cap \overline{A} = \emptyset$, so by the rule of sum, $|A| + |\overline{A}| =
|\mathcal{U}|$ or $|\overline{A}| = |\mathcal{U}| - |A|$. The sets $A$, $B$, in Fig. 3.10, have empty intersection, so here the
rule of sum leads us to $|A \cup B| = |A| + |B|$ and necessitates that $A$, $B$ be finite but does
not require any condition on the cardinality of $\mathcal{U}$.

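Figure 3.9 (Venn diagram of $A$ and its complement in $\mathcal{U}$)          Figure 3.10 (Venn diagram of disjoint sets $A$, $B$ in $\mathcal{U}$)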

Turning to the case where A, B are not disjoint, we motivate the formula for |A U B|
                             with the following example.

EXAMPLE 3.25
In a class of 50 college freshmen, 30 are studying C++, 25 are studying Java, and 10 are studying both languages. How many freshmen are studying either computer language?
We let $\mathcal{U}$ be the class of 50 freshmen, $A$ the subset of those students studying C++, and $B$ the subset of those studying Java. To answer the question, we need $|A \cup B|$. In Fig. 3.11 the numbers in the regions are obtained from the given information: $|A| = 30$, $|B| = 25$, $|A \cap B| = 10$. Consequently, $|A \cup B| = 45 \neq 55 = 30 + 25 = |A| + |B|$, because $|A| + |B|$ counts the students in $A \cap B$ twice. To remedy this overcount, we subtract $|A \cap B|$ from $|A| + |B|$ to obtain the correct formula: $|A \cup B| = |A| + |B| - |A \cap B|$.

Figure 3.11 (Venn diagram for Example 3.25)

If $A$ and $B$ are finite sets, then $|A \cup B| = |A| + |B| - |A \cap B|$. Consequently, finite sets $A$ and $B$ are (mutually) disjoint if and only if $|A \cup B| = |A| + |B|$.
In addition, when $\mathcal{U}$ is finite, from DeMorgan's Law we have $|\overline{A} \cap \overline{B}| = |\overline{A \cup B}| = |\mathcal{U}| - |A \cup B| = |\mathcal{U}| - |A| - |B| + |A \cap B|$.
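The two-set formulas above can be checked by brute force. The following is a minimal Python sketch; the student ID numbers 0 through 49 are an assumption made only for illustration, chosen so that the counts match Example 3.25.

```python
# Hypothetical student IDs 0..49; the particular numbering is illustrative only.
universe = set(range(50))
cpp = set(range(0, 30))      # |A| = 30 students studying C++
java = set(range(20, 45))    # |B| = 25 students studying Java; IDs 20..29 are in both

# |A ∪ B| counted directly and via |A| + |B| - |A ∩ B|
union_size = len(cpp | java)
by_formula = len(cpp) + len(java) - len(cpp & java)

# Students in neither set, counted directly and via |U| - |A| - |B| + |A ∩ B|
neither = len(universe - (cpp | java))
neither_by_formula = len(universe) - len(cpp) - len(java) + len(cpp & java)

print(union_size, by_formula)        # 45 45
print(neither, neither_by_formula)   # 5 5
```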

This situation extends to three sets, as the following example illustrates.

EXAMPLE 3.26
An AND gate in an ASIC (Application Specific Integrated Circuit) has two inputs, $I_1$ and $I_2$, and one output, $O$. (See Fig. 3.12.) Such an AND gate can have any or all of the following defects:

$D_1$: The input $I_1$ is stuck at 0.
$D_2$: The input $I_2$ is stuck at 0.
$D_3$: The output $O$ is stuck at 1.

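Figure 3.12 (AND gate with inputs $I_1$, $I_2$ and output $O$)          Figure 3.13 (Venn diagram for $A$, $B$, $C$, with 43 gates outside all three sets)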

For a sample of 100 such gates we let $A$, $B$, and $C$ be the subsets (of these 100 gates) having defects $D_1$, $D_2$, and $D_3$, respectively. With $|A| = 23$, $|B| = 26$, $|C| = 30$, $|A \cap B| = 7$, $|A \cap C| = 8$, $|B \cap C| = 10$, and $|A \cap B \cap C| = 3$, how many gates in the sample have at least one of the defects $D_1$, $D_2$, $D_3$?
Working backward from $|A \cap B \cap C| = 3$ to $|A| = 23$, we label the regions as shown in Fig. 3.13 and find that
$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C| = 23 + 26 + 30 - 7 - 8 - 10 + 3 = 57.$$
Thus the sample contains 57 AND gates with at least one of the defects and $100 - 57 = 43$ AND gates with no defect.

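If $A$, $B$, $C$ are finite sets, then
$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.$$
From the formula for $|A \cup B \cup C|$ and DeMorgan's Law, we find that if the universe $\mathcal{U}$ is finite, then
$$|\overline{A} \cap \overline{B} \cap \overline{C}| = |\overline{A \cup B \cup C}| = |\mathcal{U}| - |A \cup B \cup C| = |\mathcal{U}| - |A| - |B| - |C| + |A \cap B| + |A \cap C| + |B \cap C| - |A \cap B \cap C|.$$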
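As a quick arithmetic check, the three-set formula can be written as a small function and applied to the counts of Example 3.26. This is only a sketch in Python; the function name `union_size_3` is a hypothetical helper, not notation from the text.

```python
def union_size_3(a, b, c, ab, ac, bc, abc):
    """Return |A ∪ B ∪ C| from the seven counts used in the three-set formula."""
    return a + b + c - ab - ac - bc + abc

# Counts from Example 3.26 (sample of 100 AND gates).
total = 100
with_defect = union_size_3(23, 26, 30, 7, 8, 10, 3)
no_defect = total - with_defect   # |U| - |A ∪ B ∪ C|

print(with_defect, no_defect)  # 57 43
```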

We close this section with a problem that uses this last result.

EXAMPLE 3.27
A student visits an arcade each day after school and plays one game of either Laser Man, Millipede, or Space Conquerors. In how many ways can he play one game each day so that he plays each of the three types at least once during a given school week?
Here there is a slight twist. The set $\mathcal{U}$ consists of all arrangements of size 5 taken from the set of three games, with repetitions allowed. The set $A$ represents the subset of all sequences of five games played during the week without playing Laser Man. The sets $B$ and $C$ are defined similarly, leaving out Millipede and Space Conquerors, respectively. The enumeration techniques of Chapter 1 give $|\mathcal{U}| = 3^5$, $|A| = |B| = |C| = 2^5$, $|A \cap B| = |A \cap C| = |B \cap C| = 1^5 = 1$, and $|A \cap B \cap C| = 0$, so by the preceding formula there are $|\overline{A} \cap \overline{B} \cap \overline{C}| = 3^5 - 3 \cdot 2^5 + 3 \cdot 1^5 - 0 = 150$ ways the student can select his daily games during a school week and play each type of game at least once.
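The count of 150 can be confirmed by listing every possible week. The sketch below, in Python, enumerates all $3^5$ sequences and keeps those in which every game appears; it is a brute-force check rather than the method used in the example.

```python
from itertools import product

games = ("Laser Man", "Millipede", "Space Conquerors")

# All 3**5 ways to choose one game on each of the five school days.
weeks = list(product(games, repeat=5))

# Direct count: weeks in which every game is played at least once.
direct = sum(1 for week in weeks if set(week) == set(games))

# Count from the complement formula: 3^5 - 3*2^5 + 3*1^5 - 0.
by_formula = 3**5 - 3 * 2**5 + 3 * 1**5 - 0

print(len(weeks), direct, by_formula)  # 243 150 150
```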

This example can be expressed in an equivalent distribution form, since we are seeking
                               the number of ways to distribute five distinct objects (Monday, Tuesday, . .., Friday) among
                               three distinct containers (the computer games) with no container left empty. More will be
                               said about this in Chapter 5.

1. During freshman orientation at a small liberal arts college, two showings of the latest James Bond movie were presented. Among the 600 freshmen, 80 attended the first showing and 125 attended the second showing, while 450 didn't make it to either showing. How many of the 600 freshmen attended twice?

2. A manufacturer of 2000 automobile batteries is concerned about defective terminals and defective plates. If 1920 of her batteries have neither defect, 60 have defective plates, and 20 have both defects, how many batteries have defective terminals?

3. A binary string of length 12 is made up of 12 bits (that is, 12 symbols, each of which is a 0 or a 1). How many such strings either start with three 1's or end in four 0's?

4. Determine $|A \cup B \cup C|$ when $|A| = 50$, $|B| = 500$, and $|C| = 5000$, if (a) $A \subseteq B \subseteq C$; (b) $A \cap B = A \cap C = B \cap C = \emptyset$; and (c) $|A \cap B| = |A \cap C| = |B \cap C| = 3$ and $|A \cap B \cap C| = 1$.

5. How many permutations of the digits 0, 1, 2, ..., 9 either start with a 3 or end with a 7?

6. A professor has two dozen introductory textbooks on computer science and is concerned about their coverage of the topics (A) compilers, (B) data structures, and (C) operating systems. The following data are the numbers of books that contain material on these topics:

   $|A| = 8$          $|B| = 13$          $|C| = 13$
   $|A \cap B| = 5$   $|A \cap C| = 3$    $|B \cap C| = 6$
   $|A \cap B \cap C| = 2$

(a) How many of the textbooks include material on exactly one of these topics? (b) How many do not deal with any of the topics? (c) How many have no material on compilers?

7. How many permutations of the 26 different letters of the alphabet contain (a) either the pattern "OUT" or the pattern "DIG"? (b) neither the pattern "MAN" nor the pattern "ANT"?

8. A six-character variable name in a certain version of ANSI FORTRAN starts with a letter of the alphabet. Each of the other five characters can be either a letter or a digit. (Repetitions are allowed.) How many six-character variable names contain the pattern "FUN" or the pattern "TIP"?

9. How many arrangements of the letters in MISCELLANEOUS have no pair of consecutive identical letters?

10. How many arrangements of the letters in CHEMIST have H before E, or E before T, or T before M? (Here "before" means anywhere before, not just immediately before.)

3.4
           A First Word on Probability
When one performs an experiment such as tossing a single fair coin, rolling a single fair die, or selecting two students at random from a class of 20 to work on a project, a set of all possible outcomes for each situation is called a sample space. Consequently, $\{H, T\}$ serves as a sample space for the first experiment mentioned and $\{1, 2, 3, 4, 5, 6\}$ is a sample space for the roll of a single fair die. Moreover, $\{\{a_i, a_j\} \mid 1 \le i \le 20,\ 1 \le j \le 20,\ i \ne j\}$ can be used for the last experiment, with $a_i$ denoting the $i$th student, for each $1 \le i \le 20$.
In dealing with the sample space $\mathcal{S} = \{1, 2, 3, 4, 5, 6\}$ for the roll of a single fair die, we feel that each of the six possible outcomes has the same, or equal, likelihood of occurrence. Using this assumption of equal likelihood, we shall start our study of probability theory with a definition for probability that was first given by the French mathematician Pierre-Simon de Laplace (1749-1827) in his Analytic Theory of Probability.

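Under the assumption of equal likelihood, let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. Each subset $A$ of $\mathcal{S}$, including the empty subset, is called an event. Each element of $\mathcal{S}$ determines an outcome, so if $|\mathcal{S}| = n$ and $a \in \mathcal{S}$, $A \subseteq \mathcal{S}$, then

$\Pr(\{a\})$ = The probability that $\{a\}$ (or, $a$) occurs $= \dfrac{|\{a\}|}{|\mathcal{S}|} = \dfrac{1}{n}$, and
$\Pr(A)$ = The probability that $A$ occurs $= \dfrac{|A|}{|\mathcal{S}|} = \dfrac{|A|}{n}$.

[Note: We often write $\Pr(a)$ for $\Pr(\{a\})$.]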
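The definition translates directly into a few lines of code. The following Python sketch uses exact fractions; the function name `pr` and the subset check are illustrative choices, and the die sample space is the one mentioned at the start of this section.

```python
from fractions import Fraction

def pr(event, sample_space):
    """Pr(A) = |A| / |S| for equally likely outcomes; A must be a subset of S."""
    event, sample_space = set(event), set(sample_space)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
print(pr({3}, die))      # 1/6  (a single outcome has probability 1/n)
print(pr(die, die))      # 1    (the whole sample space)
print(pr(set(), die))    # 0    (the empty event)
```

Using `Fraction` rather than floating-point division keeps results such as 1/6 exact, which makes them easy to compare with hand computations.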

We demonstrate these ideas in the following four examples.

EXAMPLE 3.28
When Daphne tosses a fair coin, what is the probability she gets a head? Here the sample space $\mathcal{S} = \{H, T\}$ with $A = \{H\}$ and we find that
$$\Pr(A) = \frac{|A|}{|\mathcal{S}|} = \frac{1}{2}.$$

EXAMPLE 3.29
If Dillon rolls a fair die, what is the probability he gets (a) a 5 or a 6, (b) an even number?
For either part the sample space $\mathcal{S} = \{1, 2, 3, 4, 5, 6\}$. In part (a) we have event $A = \{5, 6\}$ and $\Pr(A) = \frac{|A|}{|\mathcal{S}|} = \frac{2}{6} = \frac{1}{3}$. For part (b) we consider event $B = \{2, 4, 6\}$ and find that $\Pr(B) = \frac{|B|}{|\mathcal{S}|} = \frac{3}{6} = \frac{1}{2}$.
Furthermore we also notice here that

i) $\Pr(\mathcal{S}) = \frac{|\mathcal{S}|}{|\mathcal{S}|} = \frac{6}{6} = 1$ (after all, the occurrence of the event $\mathcal{S}$ is a certainty); and

EXAMPLE 3.30
There are 20 students enrolled in Mrs. Arnold's fourth-grade class. Hence, if she wants to select two of her students, at random, to take care of the class rabbit, she may make her selection in $\binom{20}{2} = 190$ ways, so $|\mathcal{S}| = 190$.
Now suppose that Kyle and Kody are two of the 20 students in the class and we let $A$ be the event that Kyle is one of the students selected and $B$ be the event that the selection includes Kody. Consequently, upon choosing the students, at random, the probability that Mrs. Arnold selects

a) both Kyle and Kody is $\Pr(A \cap B) = \binom{2}{2} / \binom{20}{2} = 1/190$;
b) neither Kyle nor Kody is $\Pr(\overline{A} \cap \overline{B}) = \binom{18}{2} / \binom{20}{2} = 153/190$;
c) Kyle but not Kody is $\Pr(A \cap \overline{B}) = \binom{1}{1}\binom{18}{1} / \binom{20}{2} = 18/190 = 9/95$.
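These three probabilities can be checked by listing all 190 pairs. In the Python sketch below, the 18 other student names are placeholders invented for the enumeration, and the helper `pr` is a hypothetical name.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Kyle, Kody, and 18 placeholder names standing in for the rest of the class.
students = ["Kyle", "Kody"] + [f"student{i}" for i in range(3, 21)]
pairs = list(combinations(students, 2))     # the sample space of unordered pairs
assert len(pairs) == comb(20, 2)            # 190

def pr(predicate):
    """Fraction of equally likely pairs that satisfy the predicate."""
    return Fraction(sum(1 for p in pairs if predicate(p)), len(pairs))

both = pr(lambda p: "Kyle" in p and "Kody" in p)
neither = pr(lambda p: "Kyle" not in p and "Kody" not in p)
kyle_only = pr(lambda p: "Kyle" in p and "Kody" not in p)

print(both, neither, kyle_only)  # 1/190 153/190 9/95
```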

EXAMPLE 3.31
Consider drawing five cards from a standard deck of 52 cards. This can be done in $\binom{52}{5} = 2{,}598{,}960$ ways. Now suppose that Tanya draws five cards, at random, from a standard deck. What is the probability she gets (a) three aces and two jacks; (b) three aces and a pair; (c) a full house (that is, three of one kind and a pair)?
In all three cases we have $|\mathcal{S}| = 2{,}598{,}960$.

a) There are $\binom{4}{3} = 4$ ways in which one can select three aces and $\binom{4}{2} = 6$ ways in which two jacks may be selected. Consequently, if $A$ is the event where Tanya draws three aces and two jacks, then $|A| = \binom{4}{3}\binom{4}{2} = 4 \cdot 6 = 24$ and $\Pr(A) = 24/2{,}598{,}960 \approx 0.000009234$.
b) Once again there are $\binom{4}{3} = 4$ ways to select the aces, and there are $\binom{4}{2} = 6$ ways to select a pair of deuces, or a pair of threes, ..., or a pair of tens, or a pair of jacks, ..., or a pair of kings. So the pair can be selected in $\binom{12}{1}\binom{4}{2} = 12 \cdot 6 = 72$ ways. If $B$ is the event where three aces and a pair are drawn, then $\Pr(B) = (4 \cdot 72)/2{,}598{,}960 \approx 0.000110814$.
c) From part (b) we know there are $4 \cdot 72 = 288$ full houses with three aces. Likewise, there are 288 full houses with three deuces, 288 with three threes, ..., and 288 with three kings. So the probability that Tanya draws a full house is $\binom{13}{1}\binom{4}{3}\binom{12}{1}\binom{4}{2} / \binom{52}{5} = 3744/2{,}598{,}960 \approx 0.001440576$.

If these three probabilities appear on the slim side, consider the chances of Tanya drawing a royal flush, that is, the ten, jack, queen, king, and ace of one given suit. For this five-card hand the probability is only $4/\binom{52}{5} = 4/2{,}598{,}960 \approx 0.000001539$.
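The counts in this example are products of binomial coefficients, so they are easy to recompute. A short Python sketch using the standard-library function `math.comb` follows; the variable names are illustrative.

```python
from math import comb

hands = comb(52, 5)   # number of five-card hands

three_aces_two_jacks = comb(4, 3) * comb(4, 2)
three_aces_and_pair = comb(4, 3) * comb(12, 1) * comb(4, 2)
full_house = comb(13, 1) * comb(4, 3) * comb(12, 1) * comb(4, 2)
royal_flush = 4       # one per suit

for name, count in [
    ("three aces and two jacks", three_aces_two_jacks),
    ("three aces and a pair", three_aces_and_pair),
    ("full house", full_house),
    ("royal flush", royal_flush),
]:
    # Print the count and the probability count / |S| to nine decimal places.
    print(f"{name}: {count} hands, probability {count / hands:.9f}")
```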

To study some additional sample spaces we need to introduce the idea of the ordered
                              pair. This arises in the following structure.

Definition 3.11   For sets $A$, $B$, the Cartesian product, or cross product, of $A$ and $B$ is denoted by $A \times B$ and equals $\{(a, b) \mid a \in A,\ b \in B\}$.

We call the elements of $A \times B$ ordered pairs. For $(a, b), (c, d) \in A \times B$, we have $(a, b) = (c, d)$ if and only if $a = c$ and $b = d$.

EXAMPLE 3.32
If $A = \{1, 2, 3\}$ and $B = \{x, y\}$, then $A \times B = \{(1, x), (1, y), (2, x), (2, y), (3, x), (3, y)\}$ while $B \times A = \{(x, 1), (x, 2), (x, 3), (y, 1), (y, 2), (y, 3)\}$. Here $(1, x) \in A \times B$ but $(1, x) \notin B \times A$, although $(x, 1) \in B \times A$. So $A \times B \ne B \times A$, but $|A \times B| = 6 = 2 \cdot 3 = |A||B| = |B||A| = |B \times A|$.
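The same sets can be built with the standard-library function `itertools.product`, which yields ordered pairs as tuples. The Python sketch below repeats the checks of Example 3.32; representing $x$ and $y$ by the strings `"x"` and `"y"` is an assumption made for illustration.

```python
from itertools import product

A = [1, 2, 3]
B = ["x", "y"]

AxB = set(product(A, B))   # ordered pairs (a, b) with a in A, b in B
BxA = set(product(B, A))   # ordered pairs (b, a) with b in B, a in A

print((1, "x") in AxB)     # True
print((1, "x") in BxA)     # False
print(("x", 1) in BxA)     # True
print(AxB == BxA)          # False
print(len(AxB), len(BxA), len(A) * len(B))  # 6 6 6
```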

Note: More about ordered pairs and the Cartesian product is given in Section 5.1.

Now let us see how the Cartesian product can arise in a probability problem.

EXAMPLE 3.33
Suppose Concetta rolls two fair dice. This experiment can be decomposed as follows. Let $\mathcal{E}_1$ be the experiment where the first die is rolled, with sample space $\mathcal{S}_1 = \{1, 2, 3, 4, 5, 6\}$. Likewise we let $\mathcal{E}_2$ account for the second die rolled, also with sample space $\mathcal{S}_2 = \{1, 2, 3, 4, 5, 6\}$. (To keep the two dice distinct we can imagine the first die rolled with the left hand and the second with the right. Or we can have the first die colored red and the second green in order to distinguish them.) Consequently, when Concetta rolls these dice the sample space
$$\begin{aligned}
\mathcal{S} = \mathcal{S}_1 \times \mathcal{S}_2 = \{ &(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), \\
&(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), \\
&(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), \\
&(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), \\
&(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), \\
&(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)\} \\
= \{ &(x, y) \mid x, y = 1, 2, 3, 4, 5, 6\}.
\end{aligned}$$
               Now consider the following events:

$A$: Concetta rolls a 6 (that is, the top faces of the dice sum to 6);
$B$: The sum of the dice is at least 7;
$C$: Concetta rolls an even sum; and
$D$: The sum of the dice is 6 or less.

a) Here
   i) $A = \{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)\}$ with $\Pr(A) = |A|/|\mathcal{S}| = 5/36$;
   ii) $B = \{(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), (3, 6), (4, 5), (5, 4), (6, 3), (4, 6), (5, 5), (6, 4), (5, 6), (6, 5), (6, 6)\} = \{(x, y) \mid x, y = 1, 2, 3, 4, 5, 6;\ x + y \ge 7\}$ with $\Pr(B) = |B|/|\mathcal{S}| = 21/36 = 7/12$;
   iii) $C = \{(1, 1), (1, 3), (2, 2), (3, 1), (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), (4, 6), (5, 5), (6, 4), (6, 6)\}$ with $\Pr(C) = |C|/|\mathcal{S}| = 18/36 = 1/2$; and
   iv) $D = \{(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1), (1, 5), (2, 4), (3, 3), (4, 2), (5, 1)\}$ with $\Pr(D) = |D|/|\mathcal{S}| = 15/36 = 5/12$.
b) We notice the following:
   i) $A \cup B = \{(x, y) \mid x, y = 1, 2, 3, 4, 5, 6;\ x + y \ge 6\}$, so $|A \cup B| = 26$ and $\Pr(A \cup B) = |A \cup B|/|\mathcal{S}| = \frac{26}{36} = \frac{5}{36} + \frac{21}{36} = \Pr(A) + \Pr(B)$;
   ii) $C \cup D = \{(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1), (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), (4, 6), (5, 5), (6, 4), (6, 6)\}$, so $|C \cup D| = 24$ and $\Pr(C \cup D) = |C \cup D|/|\mathcal{S}| = 24/36 = 2/3$.
   Here, however,
   $$\Pr(C \cup D) = 24/36 \ne 33/36 = 18/36 + 15/36 = \Pr(C) + \Pr(D),$$
   although
   $$\Pr(C \cup D) = 24/36 = 18/36 + 15/36 - 9/36 = \Pr(C) + \Pr(D) - \Pr(C \cap D).$$
   The result here and that in part (i) [of (b)] mirror the ideas we saw earlier in the formulas following Example 3.25.
   iii) Finally, $\Pr(\overline{B}) = \Pr(D) = 15/36 = 1 - 21/36 = 1 - \Pr(B)$.
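All of the probabilities in Example 3.33 can be recomputed by generating the 36 ordered pairs and filtering on the sum. The Python sketch below does so with exact fractions; the helper name `pr` is a hypothetical choice.

```python
from fractions import Fraction
from itertools import product

# Sample space S = S1 x S2 for two distinguishable dice.
S = set(product(range(1, 7), repeat=2))

def pr(event):
    """Pr(E) = |E| / |S| under equal likelihood."""
    return Fraction(len(event), len(S))

A = {(x, y) for (x, y) in S if x + y == 6}
B = {(x, y) for (x, y) in S if x + y >= 7}
C = {(x, y) for (x, y) in S if (x + y) % 2 == 0}
D = {(x, y) for (x, y) in S if x + y <= 6}

print(pr(A), pr(B), pr(C), pr(D))             # 5/36 7/12 1/2 5/12
print(pr(A | B) == pr(A) + pr(B))             # True: A and B are disjoint
print(pr(C | D), pr(C) + pr(D) - pr(C & D))   # 2/3 2/3
print(pr(S - B) == pr(D) == 1 - pr(B))        # True: D is the complement of B
```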

Let us consider a second example where the Cartesian product is used. This time we’ll
               also learn about another important structure.

EXAMPLE 3.34
An experiment $\mathcal{E}$ is conducted as follows: A single die is rolled and its outcome noted, and then a coin is flipped and its outcome noted. Determine a sample space $\mathcal{S}$ for $\mathcal{E}$.
Let $\mathcal{E}_1$ denote the first part of experiment $\mathcal{E}$, and let $\mathcal{S}_1 = \{1, 2, 3, 4, 5, 6\}$ be a sample space for $\mathcal{E}_1$. Likewise let $\mathcal{S}_2 = \{H, T\}$ be a sample space for $\mathcal{E}_2$, the second part of the experiment. Then $\mathcal{S} = \mathcal{S}_1 \times \mathcal{S}_2$ is a sample space for $\mathcal{E}$.
This sample space can be represented pictorially with a tree diagram that exhibits all the possible outcomes of experiment $\mathcal{E}$. In Fig. 3.14 we have such a tree diagram, which proceeds from left to right. From the left-most endpoint, six branches originate for the six outcomes of the first stage of the experiment $\mathcal{E}$. From each point, numbered 1, 2, ..., 6, two branches indicate the subsequent outcomes for tossing the coin. The 12 ordered pairs at the right endpoints constitute the sample space $\mathcal{S}$.

Figure 3.14 (tree diagram: six branches for the die, each followed by two branches for the coin, ending in the 12 ordered pairs of $\mathcal{S}$)

Now for this experiment $\mathcal{E}$ consider the events
   $A$: A head appears when the coin is tossed.
   $B$: A 3 appears when the die is rolled.

Then $A = \{(1, H), (2, H), (3, H), (4, H), (5, H), (6, H)\}$ and $B = \{(3, H), (3, T)\}$. So $\Pr(A) = |A|/|\mathcal{S}| = 6/12 = 1/2$, $\Pr(B) = |B|/|\mathcal{S}| = 2/12 = 1/6$, and
$$\Pr(A \cup B) = \frac{7}{12} = \frac{6}{12} + \frac{2}{12} - \frac{1}{12} = \Pr(A) + \Pr(B) - \Pr(A \cap B).$$

Before we continue let us look back at Examples 3.33 and 3.34. We may not realize
                            it, but we have been making a certain assumption. In Example 3.33 we assumed that the
                            outcome for the first die had no influence on the outcome for the second die. Likewise, in
                            Example 3.34 we assumed that the outcome for the die had no bearing on the outcome for
                            the coin. This concept of independence will be examined more closely in Section 3.6.

In our next example we extend the idea of the Cartesian (or, cross) product to more than
                            two sets.

EXAMPLE 3.35
If Charles tosses a fair coin four times, what is the probability that he gets two heads and two tails?
Here the sample space for the first toss is $\mathcal{S}_1 = \{H, T\}$. Likewise, for the second, third, and fourth tosses, we have $\mathcal{S}_2 = \mathcal{S}_3 = \mathcal{S}_4 = \{H, T\}$. So, for this experiment of tossing a fair coin four times, we have the sample space $\mathcal{S} = \mathcal{S}_1 \times \mathcal{S}_2 \times \mathcal{S}_3 \times \mathcal{S}_4$, where a typical element of $\mathcal{S}$ is an ordered quadruple. For example, one such ordered quadruple is $(H, T, T, T)$ (which may also be denoted HTTT). In this problem $|\mathcal{S}| = |\mathcal{S}_1||\mathcal{S}_2||\mathcal{S}_3||\mathcal{S}_4| = 2^4 = 16$. The event $A$ we are concerned about contains all arrangements of H, H, T, T, so $|A| = 4!/(2!\,2!) = 6$. Consequently, $\Pr(A) = |A|/|\mathcal{S}| = 6/16 = 3/8$.
(Comparable to Examples 3.33 and 3.34, here the result of each toss is independent of the outcome of any previous toss.)
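The four-fold Cartesian product is small enough to list in full. A minimal Python sketch that enumerates the 16 ordered quadruples and counts those with exactly two heads:

```python
from fractions import Fraction
from itertools import product
from math import factorial

# S = S1 x S2 x S3 x S4, each factor being {H, T}.
S = list(product("HT", repeat=4))

# Event A: exactly two heads (and hence two tails).
A = [toss for toss in S if toss.count("H") == 2]

# The arrangement count 4!/(2! 2!) from Chapter 1 gives the same size.
by_formula = factorial(4) // (factorial(2) * factorial(2))

print(len(S), len(A), by_formula)   # 16 6 6
print(Fraction(len(A), len(S)))     # 3/8
```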

The next example also requires some of the formulas developed in Chapter 1 for arrangements.

EXAMPLE 3.36
The acronym WYSIWYG (for, What you see is what you get!) is used to describe a user-interface. This user-interface presents material on a VDT (Video-display terminal) in precisely the same format in which the material appears on hard copy.
There are $7!/(2!\,2!) = 1260$ ways in which the letters in the acronym WYSIWYG can be arranged. Of these, $120\ (= 5!)$ arrangements have both consecutive W's and consecutive Y's. Consequently, if the letters for this acronym are arranged in a random manner, then the probability that the arrangement has both consecutive W's and consecutive Y's is $120/1260 \approx 0.0952$.
The probability that a random arrangement of these seven letters starts and ends with the letter W is $(5!/2!)/(7!/(2!\,2!)) = 60/1260 \approx 0.0476$.
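The three counts 1260, 120, and 60 can be verified by generating every distinct arrangement. The Python sketch below collapses the $7! = 5040$ permutations of the letters into a set of strings, so repeated letters are not double counted; this is a brute-force check, not the counting argument of the example.

```python
from fractions import Fraction
from itertools import permutations

# Distinct arrangements of the letters; the set removes duplicates caused by
# the repeated W's and Y's.
arrangements = {"".join(p) for p in permutations("WYSIWYG")}

both_adjacent = [a for a in arrangements if "WW" in a and "YY" in a]
w_at_both_ends = [a for a in arrangements if a[0] == "W" and a[-1] == "W"]

print(len(arrangements), len(both_adjacent), len(w_at_both_ends))  # 1260 120 60
print(Fraction(len(both_adjacent), len(arrangements)))    # 2/21
print(Fraction(len(w_at_both_ends), len(arrangements)))   # 1/21
```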

In our final example we shall use the concept of a Venn diagram.

EXAMPLE 3.37
In a survey of 120 passengers, an airline found that 48 enjoyed wine with their meals, 78 enjoyed mixed drinks, and 66 enjoyed iced tea. In addition, 36 enjoyed any given pair of these beverages and 24 passengers enjoyed them all. If two passengers are selected at random from the survey sample of 120, what is the probability that
  a) (Event $A$) they both want only iced tea with their meals?
  b) (Event $B$) they both enjoy exactly two of the three beverage offerings?
From the information provided, we construct the Venn diagram shown in Fig. 3.15. The sample space $\mathcal{S}$ consists of the pairs of passengers we can select from the sample of 120, so $|\mathcal{S}| = \binom{120}{2} = 7140$. The Venn diagram indicates that there are 18 passengers who drink only iced tea, so $|A| = \binom{18}{2}$ and $\Pr(A) = 51/2380$. The reader should verify that $\Pr(B) = 3/34$.
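The verification left to the reader can be sketched as follows. The region counts are derived from the survey data by working outward from the triple intersection; the variable names are illustrative, and the step computing `exactly_two` assumes, as the example states, that every pair of beverages is enjoyed by 36 passengers.

```python
from fractions import Fraction
from math import comb

total = 120
tea = 66
each_pair = 36      # passengers enjoying any given pair of beverages
all_three = 24

# Only iced tea: remove the two pairwise overlaps with tea, then add back
# the triple overlap, which was removed twice.
only_tea = tea - each_pair - each_pair + all_three

# Exactly two beverages: each of the three pairs, excluding those who enjoy all three.
exactly_two = 3 * (each_pair - all_three)

pairs_of_passengers = comb(total, 2)
pr_A = Fraction(comb(only_tea, 2), pairs_of_passengers)
pr_B = Fraction(comb(exactly_two, 2), pairs_of_passengers)

print(only_tea, exactly_two, pairs_of_passengers)  # 18 36 7140
print(pr_A, pr_B)                                  # 51/2380 3/34
```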

Figure 3.15 (Venn diagram for Example 3.37)

EXERCISES 3.4

1. The sample space for an experiment is $\mathcal{S} = \{a, b, c, d, e, f, g, h\}$, where each outcome is equally likely. If event $A = \{a, b, c\}$ and event $B = \{a, c, e, g\}$, determine (a) $\Pr(A)$; (b) $\Pr(B)$; (c) $\Pr(A \cap B)$; (d) $\Pr(A \cup B)$; (e) $\Pr(\overline{A})$; (f) $\Pr(\overline{A} \cup B)$; and (g) $\Pr(A \cap \overline{B})$.

2. Joshua draws two ping-pong balls from a bowl of twenty ping-pong balls numbered 1 to 20. Provide a sample space for this experiment if
   a) the first ball drawn is replaced before the second ball is drawn.
   b) the first ball drawn is not replaced before the second ball is drawn.

3. A sample space $\mathcal{S}$ (for an experiment $\mathcal{E}$) contains 25 equally likely outcomes. If an event $A$ (for this experiment $\mathcal{E}$) is such that $\Pr(A) = 0.24$, how many outcomes are there in $A$?

4. A sample space $\mathcal{S}$ (for an experiment $\mathcal{E}$) contains $n$ equally likely outcomes. If an event $A$ (for this experiment $\mathcal{E}$) contains 7 of these outcomes and $\Pr(A) = 0.14$, what is $n$?

5. The Tuesday night dance club is made up of six married couples and two of these twelve members must be chosen to find a dance hall for an upcoming fund raiser. (a) If the two members are selected at random, what is the probability they are both women? (b) If Joan and Douglas are one of the couples in the club, what is the probability at least one of them is among the two who are chosen?

6. If two integers are selected, at random and without replacement, from $\{1, 2, 3, \ldots, 99, 100\}$, what is the probability the integers are consecutive?

7. Two integers are selected, at random and without replacement, from $\{1, 2, 3, \ldots, 99, 100\}$. What is the probability their sum is even?

8. If three integers are selected, at random and without replacement, from $\{1, 2, 3, \ldots, 99, 100\}$, what is the probability their sum is even?

9. Jerry tosses a fair coin six times. What is the probability he gets (a) all heads; (b) one head; (c) two heads; (d) an even number of heads; and (e) at least four heads?

10. Twenty-five slips of paper, numbered 1, 2, 3, ..., 25, are placed in a box. If Amy draws six of these slips, without replacement, what is the probability that (a) the second smallest number drawn is 5? (b) the fourth largest number drawn is 15? (c) the second smallest number drawn is 5 and the fourth largest number drawn is 15?

11. Darci rolls a fair die three times. What is the probability that (a) her second and third rolls are both larger than her first roll? (b) the result of her second roll is greater than that of her first roll and the result of her third roll is greater than the second?

12. In selecting a new server for its computing center, a college examines 15 different models, paying attention to the following considerations: (A) cartridge tape drive, (B) DVD Burner, and (C) SCSI RAID Array (a type of failure-tolerant disk-storage device). The numbers of servers with any or all of these features are as follows: $|A| = |B| = |C| = 6$, $|A \cap B| = |B \cap C| = 1$, $|A \cap C| = 2$, $|A \cap B \cap C| = 0$. (a) How many of the models have exactly one of the features being considered? (b) How many have none of the features? (c) If a model is selected at random, what is the probability that it has exactly two of these features?

13. At the Gamma Kappa Phi sorority the 15 sisters who are seniors line up in a random manner for a graduation picture. Two of these sisters are Columba and Piret. What is the probability that this graduation picture will find (a) Piret at the center position in the line? (b) Piret and Columba standing next to each other? (c) exactly five sisters standing between Columba and Piret?

14. The freshman class of a private engineering college has 300 students. It is known that 180 can program in Java, 120 in Visual BASIC, 30 in C++, 12 in Java and C++, 18 in Visual BASIC and C++, 12 in Java and Visual BASIC, and 6 in all three languages.
   a) A student is selected at random. What is the probability that she can program in exactly two languages?
   b) Two students are selected at random. What is the probability that they can (i) both program in Java? (ii) both program only in Java?

15. An integer is selected at random from 3 through 17 inclusive. If $A$ is the event that a number divisible by 3 is chosen and $B$ is the event that the number exceeds 10, determine $\Pr(A)$, $\Pr(B)$, $\Pr(A \cap B)$, and $\Pr(A \cup B)$. How is $\Pr(A \cup B)$ related to $\Pr(A)$, $\Pr(B)$, and $\Pr(A \cap B)$?

16. a) If the letters in the acronym WYSIWYG are arranged in a random manner, what is the probability the arrangement starts and ends with the same letter?
   b) What is the probability that a randomly generated arrangement of the letters in WYSIWYG has no pair of consecutive identical letters?

Visual BASIC is a trademark of the Microsoft Corporation.
                                                                                3.5 The Axioms of Probability (Optional)              157

3.5
The Axioms of Probability (Optional)
In Section 3.4 our typical experiment had a sample space where each outcome had the same
likelihood, or probability, of occurrence. If this does not happen, what do we do? Let us
start by considering the following examples.

EXAMPLE 3.38
Suppose Trudy tosses a single coin but it is not fair; for instance, suppose this coin is loaded
to come up heads twice as often as it comes up tails. Here the sample space $\mathcal{S} = \{H, T\}$,
as in Example 3.28, but unlike that example where Pr(H)* = Pr(T), in this situation
we have $\Pr(H) \ne \Pr(T)$. With H, T as the only outcomes, we have
$1 = \Pr(\mathcal{S}) = \Pr(\{H\} \cup \{T\}) = \Pr(H) + \Pr(T)$. Since Pr(H) = 2Pr(T), it follows that
1 = Pr(H) + Pr(T) = 2Pr(T) + Pr(T), so Pr(T) = 1/3 and Pr(H) = 2/3.

EXAMPLE 3.39
A warehouse contains 10 motors, three of which are defective (D). The other seven are
in good (G) working condition. A first inspector enters the warehouse and selects (and
inspects) one of the motors. For this experiment $\mathcal{E}_1$, we have the sample space
$\mathcal{S} = \{D, G\}$ where Pr(D) = 3/10 and Pr(G) = 7/10. The next day a second inspector
enters this same warehouse and selects (and inspects) a motor. For this second experiment, call
it $\mathcal{E}_2$, we likewise have $\mathcal{S} = \{D, G\}$. But how do we define Pr(D), Pr(G) in
this case? The answer depends on whether the first motor selected remained in the warehouse, or
was removed.

[Figure 3.16: tree diagrams for the selection of two motors; branch products shown below.]

First  Second   (a) Without Replacement    (b) With Replacement
  D      D      (3/10)(2/9) = 6/90         (3/10)(3/10) = 9/100
  D      G      (3/10)(7/9) = 21/90        (3/10)(7/10) = 21/100
  G      D      (7/10)(3/9) = 21/90        (7/10)(3/10) = 21/100
  G      G      (7/10)(6/9) = 42/90        (7/10)(7/10) = 49/100

Figure 3.16

The tree diagrams in Fig. 3.16 deal with the two possibilities. For part (a) of the figure
                  consider, for example, the case where the first motor selected is defective (D), with prob-
                  ability 3/10, and then the second motor selected is also defective (D). Since motors are
                  not replaced here, when selecting the second motor the inspector is dealing with nine mo-
                  tors — two defective (D) and seven in good (G) working condition. Hence the probability
                  of selecting a defective motor here is 2/9, not 3/10. So this situation, as shown by the top
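                  branching, has probability $\frac{3}{10} \cdot \frac{2}{9} = \binom{3}{2} / \binom{10}{2} = \frac{6}{90} = \frac{1}{15}$. The comparable case in part (b) of
                  the figure has probability $\frac{3}{10} \cdot \frac{3}{10} = \frac{9}{100}$.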

* Recall that when an event consists of a single outcome — say a, we may abbreviate Pr({a}) as Pr(a).

When selecting two motors, either with or without replacement, the sample space is
$\mathcal{S} = \{DD, DG, GD, GG\}$ where, for instance, DG is used to abbreviate (D, G). Yet in neither
situation do the outcomes have the same likelihood of occurrence. If the selections are done
without replacement [as in Fig. 3.16(a)], then Pr(DD) = 6/90, Pr(DG) = 21/90, Pr(GD) = 21/90,
Pr(GG) = 42/90, with $\frac{6}{90} + \frac{21}{90} + \frac{21}{90} + \frac{42}{90} = 1 = \Pr(\mathcal{S})$. When the first motor is replaced
[as in Fig. 3.16(b)], we have Pr(DD) = 9/100, Pr(DG) = 21/100, Pr(GD) = 21/100, Pr(GG) = 49/100,
with $\frac{9}{100} + \frac{21}{100} + \frac{21}{100} + \frac{49}{100} = 1 = \Pr(\mathcal{S})$.
                         From this point on we’ll deal exclusively with the case where the two selections are
                      made without replacement. Consider the following events:

A:     One (that is exactly one) motor is defective: {DG, GD};
                             B:     At least one motor is defective: {DG, GD, DD};
                             C:     Both motors are defective: {DD};
                             E:     Both motors are in good working condition: {GG}.
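                             Here

      Pr(A) = 21/90 + 21/90 = 7/15          Pr(B) = 21/90 + 21/90 + 6/90 = 8/15
      Pr(C) = 6/90 = 1/15                   Pr(E) = 42/90 = 7/15

Further, (i) $\overline{B} = E$ and $\Pr(\overline{B}) = \Pr(E) = \frac{7}{15} = 1 - \frac{8}{15} = 1 - \Pr(B)$; and (ii) $A \cup C = B$
with $A \cap C = \emptyset$, so $\Pr(A \cup C) = \Pr(B) = \frac{8}{15} = \frac{7}{15} + \frac{1}{15} = \Pr(A) + \Pr(C)$.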
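
The values in Example 3.39 can be checked by brute force. The following Python sketch enumerates all 90 ordered selections of two distinct motors (the without-replacement case) and counts the selections that fall in each event. The helper name `pr` and the indexing of the motors are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import permutations

# Ten motors: three defective (D) and seven good (G), labeled by index 0..9.
motors = ["D"] * 3 + ["G"] * 7

# Ordered selections of two distinct motors (no replacement): 10 * 9 = 90
# equally likely pairs of indices.
pairs = list(permutations(range(len(motors)), 2))

def pr(event):
    """Probability of an event given as a set of outcomes such as {'DG', 'GD'}."""
    hits = sum(1 for i, j in pairs if motors[i] + motors[j] in event)
    return Fraction(hits, len(pairs))

pr_A = pr({"DG", "GD"})        # exactly one motor is defective
pr_B = pr({"DG", "GD", "DD"})  # at least one motor is defective
pr_C = pr({"DD"})              # both motors are defective
pr_E = pr({"GG"})              # both motors are good

print(pr_A, pr_B, pr_C, pr_E)  # 7/15 8/15 1/15 7/15
```

The equally likely outcomes here are the 90 index pairs, not the four strings DD, DG, GD, GG, which is why the four events receive unequal probabilities.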

What we did in the latter part of Example 3.39 now motivates our next observation. This
                      observation extends our earlier results in Section 3.4 where each outcome of the sample
                      space had the same likelihood, or probability, of occurring.
                         Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. Each element $a \in \mathcal{S}$ is called an outcome,
                      or elementary event, and we let Pr({a}) = Pr(a) denote the probability that this outcome
                      occurs. Each nonempty subset A of $\mathcal{S}$ is still called an event. If event $A = \{a_1, a_2, \ldots, a_n\}$,
                      where $a_i$ is an outcome, for all $1 \le i \le n$, then $\Pr(A) = \sum_{i=1}^{n} \Pr(a_i)$. (Note: When $A = \emptyset$
                      we assign Pr(A) = 0, a result we shall actually establish later in this section.)
                             However, before we get to our axioms of probability, there is a point that needs to be
                      clarified. We know that when a fair die is rolled, the sample space $\mathcal{S} = \{1, 2, 3, 4, 5, 6\}$,
                      where each outcome has the same likelihood, or probability, of occurrence — namely, 1/6.
                      However, if this die is rolled six times we should not expect to see one occurrence of each
                      of the possible outcomes 1, 2, ... , 6. Should this die be rolled 60 times we want each roll
                      (after the first) to be unaffected by any previous roll — that is, each roll (after the first) is
                      to be independent of any previous roll. Further, we cannot expect each of the six possible
                      outcomes to occur ten times. In fact, if the 1 comes up 20 times and this die is then rolled
                      60 more times we cannot expect to see 1 come up 20 times again. So what can we expect?
                      If, in rolling this fair die n times, the outcome of 1 occurs m times, then as n grows larger
                      we expect the relative frequency m/n to approach 1/6.
                          So far this discussion has dealt with a sample space where each outcome has the same
                      likelihood, or probability, of occurrence. However, the idea is still appropriate if we consider
                      any sample space — for example, the sample space of Example 3.38. Equally important is
                      how one can use the idea of relative frequency in modeling an experiment. For suppose we
                      have a coin that we believe to be biased— perhaps because it is heavier than other similar

coins that we have weighed. In tossing this coin the sample space is S = {H, T}, but how
                  can we determine Pr(H), Pr(T)? We might toss the coin n times, assuming the outcome
                  of each toss (after the first) is not affected by any previous outcome. If H comes up m
                  times, then we can assign Pr(H) = m/n and Pr(T) = (n − m)/n = 1 − (m/n), where the
                  accuracy of these assigned probabilities improves as n grows larger.
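
The relative-frequency idea can be tried out in a small simulation. The Python sketch below is only illustrative: it assumes a coin whose true probability of heads is 2/3 (the loaded coin of Example 3.38), simulates n independent tosses with a seeded pseudorandom generator, and reports m/n. The function name `estimate_pr_heads` and its parameters are hypothetical.

```python
import random

def estimate_pr_heads(n, p_heads=2 / 3, seed=0):
    """Toss a simulated coin n times and return the relative frequency m/n of heads.

    p_heads is the (assumed) true probability of heads; in a real experiment
    it would be unknown and m/n would be our assigned value for Pr(H).
    """
    rng = random.Random(seed)  # seeded so that a run can be repeated
    m = sum(1 for _ in range(n) if rng.random() < p_heads)
    return m / n

for n in (10, 1_000, 100_000):
    pr_h = estimate_pr_heads(n)
    print(n, pr_h, 1 - pr_h)   # assigned Pr(H) and Pr(T) = 1 - (m/n)
```

For small n the estimate can be far from 2/3; for large n it should typically settle near that value, which is the behavior the paragraph above describes.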

Having addressed the issue of probabilities as relative frequencies, now it is time to
                  focus on the topic of this section — namely, the axioms of probability. One should find these
                  axioms rather intuitive, especially when we look back at some of the results in Example
                  3.29 and part (b) of Example 3.33. The axioms were first introduced in 1933 by Andrei
                  Kolmogorov and they apply to the case when the sample space $\mathcal{S}$ is finite.

The Axioms of Probability
                    Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If A, B are any events, that is,
                    $\emptyset \subseteq A, B \subseteq \mathcal{S}$ (so we now allow the empty set to be an event), then
                          1) $\Pr(A) \ge 0$
                          2) $\Pr(\mathcal{S}) = 1$
                          3) if A, B are disjoint (or, mutually disjoint) then $\Pr(A \cup B) = \Pr(A) + \Pr(B)$.†

Using these axioms we shall now establish a number of applicable results.

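THEOREM 3.7       The Rule of Complement. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If A is an event
                  (that is, $A \subseteq \mathcal{S}$), then

                        $\Pr(\overline{A}) = 1 - \Pr(A)$.

                  Proof: We know that $\mathcal{S} = A \cup \overline{A}$ with $A \cap \overline{A} = \emptyset$. So from axioms (2) and (3) it follows that
                  $1 = \Pr(\mathcal{S}) = \Pr(A \cup \overline{A}) = \Pr(A) + \Pr(\overline{A})$, and $\Pr(\overline{A}) = 1 - \Pr(A)$.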

Note that when $A = \emptyset$ in Theorem 3.7 we have $1 = \Pr(\mathcal{S}) = \Pr(\overline{A}) = 1 - \Pr(A)$, so
                  $\Pr(\emptyset) = \Pr(A) = 0$, in agreement with our earlier assignment.
                     The result of Theorem 3.7 can help cut down on our calculations in solving certain
                  probability problems. This is demonstrated in the next two examples.

EXAMPLE 3.40
Suppose the letters in the word PROBABILITY are arranged in a random manner. Determine
Pr(A) for the event

      A:     The arrangement begins with one letter and ends in a different letter.

†Although our major concern in this chapter (if not the entire text) deals with $\mathcal{S}$ finite, when $\mathcal{S}$ is infinite
                  Kolmogorov provided the fourth axiom:
                     4) if $A_1, A_2, A_3, \ldots$ are events (taken from $\mathcal{S}$) and $A_i \cap A_j = \emptyset$ for all $1 \le i < j$, then

                        $\Pr\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \Pr(A_n)$.

We consider four cases:

   1) Start with the situation where neither B nor I appears at the start or finish. There
      are seven remaining (distinct) letters. Any one of them can be used at the start of
      the arrangement and there are six choices then for the last letter. For the nine letters
      in between there are $\frac{9!}{2!\,2!}$ arrangements. So for this case there are
      $(7)\left(\frac{9!}{2!\,2!}\right)(6) = 3{,}810{,}240$ possibilities.
   2) Now suppose that B is used as the first or last letter (but not in both positions) and I
      only appears among the nine letters in the center. With one B so placed there are seven
      other (distinct) letters that can be used at the opposite end of the arrangement. The
      nine remaining letters in between can be arranged in $\frac{9!}{2!}$ ways, so this case accounts
      for $(2)(7)\frac{9!}{2!} = 2{,}540{,}160$ arrangements.
   3) If we use one of the I's and none of the B's to start or end an arrangement, then there
      are again 2,540,160 arrangements, as we had in case (2).
   4) Finally, if one of B, I is used at the start and the other letter at the end, we can arrange
      the remaining nine letters in between in 9! ways. So here we have the final 2(9!) =
      725,760 arrangements.

Here $|\mathcal{S}| = \frac{11!}{2!\,2!} = 9{,}979{,}200$, so $\Pr(A) = \frac{9{,}616{,}320}{9{,}979{,}200} = \frac{53}{55}$.
   This result took quite a lot of calculations. So instead of the event A let us consider the
event $\overline{A}$, that is, the event where the arrangement begins and ends with the same letter.
How many such arrangements are there? Say we use the letter B at the start and finish of the
arrangement. Then the other nine letters in between can be arranged in $\frac{9!}{2!}$ ways. If I is used
in place of B another $\frac{9!}{2!}$ arrangements result. So $|\overline{A}| = 9!$ and $\Pr(\overline{A}) = \frac{9!}{11!/(2!\,2!)} = \frac{2}{55}$.
   With much less effort Theorem 3.7 shows us that $\Pr(A) = 1 - \Pr(\overline{A}) = \frac{53}{55}$.
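
The two routes to Pr(A) in Example 3.40 can be compared numerically. The Python sketch below is a hedged check, not part of the text: it first repeats the counting argument with factorials, then confirms the complement by a different route. That second route rests on an assumption stated here: if the 11 letter tokens are treated as distinguishable, the first and last positions of a random arrangement hold a uniformly random ordered pair of distinct tokens, so only 11 · 10 = 110 pairs need to be examined.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

word = "PROBABILITY"  # 11 letters, with B and I each appearing twice

# Counting argument from the example.
total = factorial(11) // (factorial(2) * factorial(2))          # 9,979,200
case_1 = 7 * 6 * (factorial(9) // (factorial(2) * factorial(2)))
case_2 = 2 * 7 * (factorial(9) // factorial(2))
case_3 = case_2
case_4 = 2 * factorial(9)
pr_A_direct = Fraction(case_1 + case_2 + case_3 + case_4, total)

# Complement: the arrangement starts and ends with B, or starts and ends with I.
same_ends = 2 * (factorial(9) // factorial(2))
pr_A_by_complement = 1 - Fraction(same_ends, total)

# Independent check on Pr(A-bar): ordered pairs of distinct positions of the word
# whose letters agree (assumption explained in the lead-in).
end_pairs = list(permutations(range(len(word)), 2))
same = sum(1 for i, j in end_pairs if word[i] == word[j])
pr_not_A_check = Fraction(same, len(end_pairs))

print(pr_A_direct, pr_A_by_complement, pr_not_A_check)  # 53/55 53/55 2/55
```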

EXAMPLE 3.41
Due to an intense preseason workout schedule, Coach Davis has honed her volleyball
team into a major contender. Consequently, the probability her team will win any given
tournament is 0.7, regardless of any previous win or loss. Suppose the team is slated to play
eight tournaments.
   a) The probability the women will win all eight tournaments is $(0.7)^8 \approx 0.057648$. Could
      they possibly lose all eight tournaments? Yes, with probability $(0.3)^8 \approx 0.000066$.
   b) What is the probability the team wins exactly five of the eight tournaments? One way
      this can happen is if the team wins the first and second tournaments, loses the next three,
      and then wins the last three. We represent this by WWLLLWWW. The probability for
      this outcome is $(0.7)^2(0.3)^3(0.7)^3 = (0.7)^5(0.3)^3$. Another possibility that results in
      five tournament wins can be represented by WWLLWWLW. The probability here is
      $(0.7)^2(0.3)^2(0.7)^2(0.3)(0.7) = (0.7)^5(0.3)^3$. At this point we see that the probability
      Coach Davis's team wins five of the eight tournaments is

         (The number of arrangements of five W's and three L's) × $(0.7)^5(0.3)^3$.

      From the material in Sections 1.2 and 1.3, especially Example 1.22, we know that there
      are $\frac{8!}{5!\,3!} = \binom{8}{5}$ ways to arrange five W's and three L's. Consequently, the probability
      the team wins five tournaments is

         $\binom{8}{5}(0.7)^5(0.3)^3 \approx 0.254122$.

   c) Finally, what is the probability the team wins at least one tournament? Let us not do
      here what we did in Example 3.40. If we let A be the given event, then $\Pr(A) =
      \sum_{i=1}^{8} \binom{8}{i}(0.7)^i(0.3)^{8-i}$. But Pr(A) is more readily determined as $1 - \Pr(\overline{A})$, where
      $\Pr(\overline{A})$ = the probability the team loses all eight tournaments = $(0.3)^8 \approx 0.000066$
      [as in part (a)]. Consequently, $\Pr(A) = 1 - (0.3)^8 \approx 0.999934$.

Before we go on we want to examine the structure of the answer at the end of part (b)
of Example 3.41. Each tournament in the example results in either a win (success) or
loss (failure). Further, after the first tournament, the outcome of each later tournament is
independent of the outcome of any previous tournament. Such a two-outcome occurrence
is called a Bernoulli trial. If there are n such trials and each trial has probability p of
success and probability q (= 1 − p) of failure, then the probability that there are (exactly)
k successes among these n trials is

      $\binom{n}{k} p^k q^{n-k}, \qquad 0 \le k \le n$.

(We shall come upon this idea again in Section 16.5 when we study the application of
Abelian groups in coding theory.)
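
The formula for k successes in n Bernoulli trials translates directly into code. The Python sketch below is a minimal illustration under the stated independence assumption; the function name `binomial_pr` is hypothetical. It is applied to the numbers of Example 3.41 (n = 8, p = 0.7).

```python
from math import comb

def binomial_pr(n, k, p):
    """Probability of exactly k successes in n independent Bernoulli trials,
    each with success probability p and failure probability q = 1 - p."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

n, p = 8, 0.7
exactly_five = binomial_pr(n, 5, p)       # part (b) of Example 3.41
at_least_one = 1 - binomial_pr(n, 0, p)   # part (c), via the Rule of Complement
total = sum(binomial_pr(n, k, p) for k in range(n + 1))

print(round(exactly_five, 6))   # 0.254122
print(round(at_least_one, 6))   # 0.999934
print(abs(total - 1) < 1e-12)   # True: the n + 1 cases exhaust the sample space
```

Because the events "exactly k wins", for k = 0, 1, ..., 8, are pairwise disjoint and cover every outcome, their probabilities sum to 1, which the last line checks up to floating-point rounding.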

Returning now to the axioms of probability, we know from axiom (3) that, for $A, B \subseteq \mathcal{S}$,
if $A \cap B = \emptyset$ then $\Pr(A \cup B) = \Pr(A) + \Pr(B)$. But what can we say if $A \cap B \ne \emptyset$?

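[Figure 3.17: Venn diagram of the events A and B inside the rectangle for $\mathcal{S}$, with the region $A \cap \overline{B}$ shaded.]

Figure 3.17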

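For the Venn diagram in Fig. 3.17 the interior of the rectangle represents the universe,
here the sample space $\mathcal{S}$. The shaded region in the diagram denotes the event
$A - B = A \cap \overline{B}$. Further,

   i)  the events $A \cap \overline{B}$ and B are disjoint, since
       $(A \cap \overline{B}) \cap B = A \cap (\overline{B} \cap B) = A \cap \emptyset = \emptyset$; and
   ii) $(A \cap \overline{B}) \cup B = (A \cup B) \cap (\overline{B} \cup B) = (A \cup B) \cap \mathcal{S} = A \cup B$.

From these two observations and axiom (3) it follows that

(*)      $\Pr(A \cup B) = \Pr((A \cap \overline{B}) \cup B) = \Pr(A \cap \overline{B}) + \Pr(B)$.

Next note that $A = A \cap \mathcal{S} = A \cap (B \cup \overline{B}) = (A \cap B) \cup (A \cap \overline{B})$ where
$(A \cap B) \cap (A \cap \overline{B}) = (A \cap A) \cap (B \cap \overline{B}) = A \cap \emptyset = \emptyset$. So once again axiom (3) gives us

         $\Pr(A) = \Pr(A \cap B) + \Pr(A \cap \overline{B})$,     or
(**)     $\Pr(A \cap \overline{B}) = \Pr(A) - \Pr(A \cap B)$.

The results in Eqs. (*) and (**) now establish the following.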

THEOREM 3.8                  The Additive Rule. If $\mathcal{S}$ is the sample space for an experiment $\mathcal{E}$, and $A, B \subseteq \mathcal{S}$, then

      $\Pr(A \cup B) = \Pr(A \cap \overline{B}) + \Pr(B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$.

At this point we use the result in Theorem 3.8 in the following two examples.

EXAMPLE 3.42
Yosi selects a card from a well-shuffled standard deck. What is the probability his card is a
club or a card whose face value is between 3 and 7 inclusive?
   Start by defining the events A, B as follows:
      A:    The card drawn is a club.
      B:    The face value of the card drawn is between 3 and 7 inclusive.

The answer to the problem is $\Pr(A \cup B)$.
   Here Pr(A) = 13/52 and Pr(B) = 20/52. Also $\Pr(A \cap B) = 5/52$, for the 3 of
clubs, 4 of clubs, ..., and 7 of clubs. Consequently, by Theorem 3.8, we have

      $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B) = \frac{13}{52} + \frac{20}{52} - \frac{5}{52} = \frac{28}{52} = \frac{7}{13}$.
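
The Additive Rule in Example 3.42 can be confirmed by listing the deck. In the Python sketch below the encoding of a card as a (rank, suit) pair and the helper `pr` are illustrative assumptions; every card is taken to be equally likely.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))          # 52 equally likely outcomes

def pr(event):
    """Probability of an event (a subset of the deck)."""
    return Fraction(len(event), len(deck))

A = {card for card in deck if card[1] == "clubs"}                    # a club
B = {card for card in deck if card[0] in {"3", "4", "5", "6", "7"}}  # value 3..7

direct = pr(A | B)                          # count the union directly
additive = pr(A) + pr(B) - pr(A & B)        # Theorem 3.8

print(direct, additive)  # 7/13 7/13
```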

EXAMPLE 3.43
Diane inspects 120 cast aluminum rods and classifies the diameter and surface finish of
each rod as adequate or superior. Her findings are summarized in Table 3.4.

Table 3.4
                                        Diameter
                                  adequate    superior
      Surface Finish   adequate      10          18
                       superior      12          80

Define the events A, B as follows:

A:    The diameter of the rod is classified as superior.
                                   B:    The surface finish of the rod is classified as superior.
                             Then

Pr(A) = (18 + 80)/120 = 98/120 = 49/60 = 0.816667
                                   Pr(B) = (12 + 80)/120 = 92/120 = 23/30 = 0.766667
                                   $\Pr(A \cap B)$ = 80/120 = 2/3 = 0.666667.
                             By Theorem 3.8

      $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B) = \frac{98}{120} + \frac{92}{120} - \frac{80}{120} = \frac{110}{120} = \frac{11}{12} \approx 0.916667$.
                                   So 110 [= (11/12)(120)] of these 120 rods have either a superior diameter or a
                             superior surface finish, or perhaps both.

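In addition,

   $\Pr(\overline{A})$ = the probability the diameter of the rod is classified as adequate
      $= \frac{10 + 12}{120} = \frac{22}{120} = 1 - \frac{98}{120} = 1 - \Pr(A)$, and
   $\Pr(\overline{B})$ = the probability the surface finish of the rod is classified as adequate
      $= \frac{10 + 18}{120} = \frac{28}{120} = 1 - \frac{92}{120} = 1 - \Pr(B)$.

Using DeMorgan's Laws we also find that $\Pr(\overline{A} \cup \overline{B}) = \Pr(\overline{A \cap B}) = 1 - \Pr(A \cap B) = 1 - \frac{2}{3} = \frac{1}{3}$,
and $\Pr(\overline{A} \cap \overline{B}) = \Pr(\overline{A \cup B}) = 1 - \Pr(A \cup B) = 1 - \frac{11}{12} = \frac{1}{12}$.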
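
All of the probabilities in Example 3.43 come from the four counts in Table 3.4, so they are easy to recompute. The Python sketch below stores the table as a dictionary keyed by (surface finish, diameter); that layout and the helper `pr` are assumptions made for illustration.

```python
from fractions import Fraction

# Table 3.4: counts keyed by (surface finish, diameter).
counts = {
    ("adequate", "adequate"): 10,
    ("adequate", "superior"): 18,
    ("superior", "adequate"): 12,
    ("superior", "superior"): 80,
}
total = sum(counts.values())  # 120 rods

def pr(predicate):
    """Probability that a randomly chosen rod satisfies predicate(surface, diameter)."""
    hits = sum(n for (surface, diameter), n in counts.items()
               if predicate(surface, diameter))
    return Fraction(hits, total)

pr_A = pr(lambda s, d: d == "superior")                       # superior diameter
pr_B = pr(lambda s, d: s == "superior")                       # superior finish
pr_A_and_B = pr(lambda s, d: s == "superior" and d == "superior")
pr_A_or_B = pr(lambda s, d: s == "superior" or d == "superior")
pr_neither = pr(lambda s, d: s == "adequate" and d == "adequate")

print(pr_A, pr_B, pr_A_and_B, pr_A_or_B, pr_neither)  # 49/60 23/30 2/3 11/12 1/12
```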

Now we want to extend the result of Theorem 3.8 to more than two events. The following
                 theorem deals with three events and suggests the pattern for four or more.

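THEOREM 3.9      Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. For events $A, B, C \subseteq \mathcal{S}$,

   $\Pr(A \cup B \cup C) = \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) - \Pr(A \cap C) - \Pr(B \cap C) + \Pr(A \cap B \cap C)$.

                 Proof: The Laws of Set Theory from Section 3.2 validate what follows:

   $\Pr(A \cup B \cup C) = \Pr((A \cup B) \cup C) = \Pr(A \cup B) + \Pr(C) - \Pr((A \cup B) \cap C)$
      $= \Pr(A) + \Pr(B) - \Pr(A \cap B) + \Pr(C) - \Pr((A \cap C) \cup (B \cap C))$
      $= \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) - [\Pr(A \cap C) + \Pr(B \cap C) - \Pr((A \cap C) \cap (B \cap C))]$
      $= \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) - \Pr(A \cap C) - \Pr(B \cap C) + \Pr(A \cap B \cap C)$.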

Note that the last equality follows because $(A \cap C) \cap (B \cap C) = A \cap B \cap C$ by the
Associative, Commutative, and Idempotent Laws of Intersection. Also note the similarity
between the formula for $\Pr(A \cup B \cup C)$ and that for $|A \cup B \cup C|$ (given prior to
Example 3.27).
   Further, we see that the formula for $\Pr(A \cup B \cup C)$ involves 7 ($= 2^3 - 1$) summands.
For four events we would have 15 ($= 2^4 - 1$) summands: (i) $4 = \binom{4}{1}$ summands, one
for each single event; (ii) $6 = \binom{4}{2}$ summands, one for each pair of events; (iii) $4 = \binom{4}{3}$
summands, one for each triple of events; and (iv) $1 = \binom{4}{4}$ summand for all four of
the events. When dealing with n events, $A_1, A_2, \ldots, A_n$, where $n \ge 2$, the formula for
$\Pr(A_1 \cup A_2 \cup \cdots \cup A_n)$ has a total of $\sum_{r=1}^{n} \binom{n}{r} = \sum_{r=0}^{n} \binom{n}{r} - \binom{n}{0} = 2^n - 1$ summands,
by Corollary 1.1. For $1 \le r \le n$, there are $\binom{n}{r}$ summands, one for each way we can select
r of the n events. Each of these summands is preceded by a plus sign, for r odd, or a minus
sign, for r even.
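
The sign pattern and the count of $2^n - 1$ summands can be seen in a short program. The Python sketch below is an illustration only: it assumes a finite sample space with equally likely outcomes (the theorem itself does not need that assumption), and the function name `pr_union` and the sample events are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def pr_union(events, sample_space):
    """Pr(A_1 U ... U A_n) by the alternating-sum formula.

    Assumes equally likely outcomes. Returns the probability together with
    the number of summands used, which should be 2**n - 1.
    """
    n = len(events)
    total = Fraction(0)
    summands = 0
    for r in range(1, n + 1):
        sign = 1 if r % 2 == 1 else -1          # plus for r odd, minus for r even
        for chosen in combinations(events, r):   # one summand per choice of r events
            common = set.intersection(*chosen)
            total += sign * Fraction(len(common), len(sample_space))
            summands += 1
    return total, summands

space = set(range(1, 31))
multiples = [{x for x in space if x % d == 0} for d in (2, 3, 5, 7)]

three, count3 = pr_union(multiples[:3], space)
four, count4 = pr_union(multiples, space)

print(three, count3)   # 11/15 7
print(count4)          # 15
print(four == Fraction(len(set.union(*multiples)), len(space)))  # True
```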

We'll see more formulas like the one in Theorem 3.9 in Section 8.1. For now let us apply
                 the result of this theorem in the following example.

EXAMPLE 3.44
The game of Roulette is played by initially spinning a small white ball on a circular wheel that
is divided into 38 sections of equal area. These sections are labeled 00, 0, 1, 2, 3, ..., 36.
As the wheel slows down, the number of the section where the ball comes to rest is the
outcome for that one play of the game.
   The numbers on the wheel are colored as follows.

      Green:   00    0
      Red:      1    3    5    7    9   12   14   16   18
               19   21   23   25   27   30   32   34   36
      Black:    2    4    6    8   10   11   13   15   17
               20   22   24   26   28   29   31   33   35

A player may place bets in various ways, such as (i) odd, even (here 00 and 0 are considered
neither even nor odd); (ii) low (1-18), high (19-36); or (iii) red, black.
   Gary enjoys Roulette and decides to place bets according to the events

      A:   The outcome is low.        B:   The outcome is red.        C:   The outcome is odd.

What is the probability Gary wins at least one of his bets, that is, what is $\Pr(A \cup B \cup C)$?
   Here Pr(A) = Pr(B) = Pr(C) = 18/38, $\Pr(A \cap B) = \Pr(A \cap C) = 9/38$,
$\Pr(B \cap C) = 10/38$, $\Pr(A \cap B \cap C) = 5/38$, and by Theorem 3.9

      $\Pr(A \cup B \cup C) = \frac{18}{38} + \frac{18}{38} + \frac{18}{38} - \frac{9}{38} - \frac{9}{38} - \frac{10}{38} + \frac{5}{38} = \frac{31}{38} \approx 0.815789$.
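
The counts used in Example 3.44 can be verified by building the wheel explicitly. In the Python sketch below the sections are stored as strings so that 00 and 0 remain distinct outcomes; this encoding and the helper `pr` are illustrative assumptions.

```python
from fractions import Fraction

# The 38 sections, as strings so that "00" and "0" are different outcomes.
wheel = {"00", "0"} | {str(k) for k in range(1, 37)}
red_numbers = [1, 3, 5, 7, 9, 12, 14, 16, 18,
               19, 21, 23, 25, 27, 30, 32, 34, 36]

A = {str(k) for k in range(1, 19)}       # low: 1-18
B = {str(k) for k in red_numbers}        # red
C = {str(k) for k in range(1, 37, 2)}    # odd (00 and 0 are neither odd nor even)

def pr(event):
    """Probability of an event when all 38 sections are equally likely."""
    return Fraction(len(event), len(wheel))

by_theorem = (pr(A) + pr(B) + pr(C)
              - pr(A & B) - pr(A & C) - pr(B & C)
              + pr(A & B & C))
direct = pr(A | B | C)

print(by_theorem, direct)  # 31/38 31/38
```

Counting the union directly and applying Theorem 3.9 agree, so Gary loses all three bets only on the 7 sections that are high, black, and even, or green.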

In closing this section we need to make one more point. The examples we've seen here
                              and in the previous section have all dealt with finite sample spaces. Yet it is possible to have
                              situations where a sample space is infinite. For instance, suppose a man takes a driver’s test
                              until he passes it. If he passes the test on his first try, we write P for this outcome. Should
                              he need three attempts to pass the test, then we write FFP to denote the first and second
                              failures followed by his passing of the test. Hence the sample space may be given here as
                              $\mathcal{S}$ = {P, FP, FFP, FFFP, ...}, an example of a countably infinite† set.
                                  When dealing with sample spaces that are finite or countably infinite, we call the sample
                              space discrete. The coverage here in Chapter 3 deals strictly with discrete sample spaces
                              that are finite. However, in Section 9.2, we’ll consider an example where the sample space
                              is countably infinite.
                                  Finally, suppose an experiment calls for a technician to record the temperature, in degrees
                              Fahrenheit, of a heated iron rod. Theoretically, the sample space here could comprise an
                              open interval of real numbers, for instance, $\mathcal{S}$ = {t | 180°F < t < 190°F}. Here the sample
                              space is again infinite, but this time it is uncountably* infinite. In this case the sample space
                              is called continuous and now one needs calculus to solve the related probability problems.
                              We will not pursue this here but will direct the interested reader to the chapter references —
                              especially, the text by J. J. Kinney [7].

                       EXERCISES 3.5
1. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let A, B be events from $\mathcal{S}$,
where Pr(A) = 0.4, Pr(B) = 0.3, and $\Pr(A \cap B) = 0.2$. Determine $\Pr(\overline{A})$, $\Pr(\overline{B})$,
$\Pr(A \cup B)$, $\Pr(\overline{A \cup B})$, $\Pr(A \cap \overline{B})$, $\Pr(\overline{A} \cap B)$, $\Pr(A \cup \overline{B})$, and $\Pr(\overline{A} \cup B)$.
2. Ashley tosses a fair coin eight times. What is the probability she gets (a) six heads;
(b) at least six heads; (c) two heads; and (d) at most two heads?

†The interested reader can find more on countable sets in Appendix 3.
*More on uncountable sets can be found in Appendix 3.

3. Ten ping-pong balls labeled 1 to 10 are placed in a box. Two of these balls are then drawn,
in succession and without replacement, from the box.
    a) Find the sample space for this experiment.
    b) Find the probability that the label on the second ball drawn is smaller than the label
    on the first.
    c) Find the probability that the label on one ball is even while the label on the other is odd.
4. Russell draws one card from a standard deck. If A, B, C denote the events
   A:    The card is a spade.
   B:    The card is red.
   C:    The card is a picture card (that is, a jack, queen, or king).
Find $\Pr(A \cup B \cup C)$.
5. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If A, B are disjoint events from $\mathcal{S}$
with Pr(A) = 0.3 and $\Pr(A \cup B) = 0.7$, what is Pr(B)?

Let A, B denote the events
   A:    The sample has foam type 1.
   B:    The sample meets specifications.
Determine Pr(A), Pr(B), $\Pr(A \cap B)$, $\Pr(A \cup B)$, $\Pr(\overline{A})$, $\Pr(\overline{B})$,
$\Pr(\overline{A} \cup \overline{B})$, $\Pr(\overline{A} \cap \overline{B})$, $\Pr(A \,\triangle\, B)$.
11. Consider the game of Roulette as described in Example 3.44.
    a) If the game is played once, what is the probability the outcome is (i) high or odd;
    (ii) low or black?
    b) If the game is played twice, what is the probability (i) both outcomes are black;
    (ii) one outcome is red and the other green?
12. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let $A, B \subseteq \mathcal{S}$. If Pr(A) = Pr(B),
$\Pr(A \cap B) = 1/5$, and $\Pr(\overline{A \cup B}) = 1/5$, determine $\Pr(A \cup B)$, Pr(A), Pr(A − B),
$\Pr(A \,\triangle\, B)$.
13. The following data give the age and gender of 14 science professors at a small junior college.
6. If F is the sample space for an experiment and A, B CY,
how is Pr(A A B)relatedto Pr(A), Pr(B), and Pr(AM B)?                   25M      39 F     27 F     53M          36 F     37F   30M
[Note: Pr(A A B) 1s the probability that exactly one of the             29F      32M      31M      38 F         26M      24F   40F
events A, B occurs.]
                                                                        One professor will be chosen at random to represent the fac-
7. Adie is loaded so that the probability a given number turns
                                                                        ulty on the board of trustees. What is the probability that the
up is proportional to that number.      So, for example, the out-
                                                                        professor chosen is a man or over 35?
come 4 is twice as likely as the outcome 2, and the outcome 3
is three times as likely as that of 1. If this die is rolled, what is   14, The nine members of a coed intramural volleyball team are
the probability the outcome is (a) 5 or 6; (b) even; (c) odd?           to be randomly selected from nine college men and ten college
                                                                        women. To be classified as coed the team must include at least
  8. Suppose we have two dice     — each loaded as described in
                                                                        one player of each gender. What is the probability the selected
the previous exercise. If these dice are rolled, what is the prob-
                                                                        team includes more women than men?
ability the outcome is (a) 10; (b) at least 10; (c) a double?
                                                                        15, While traveling through Pennsylvania, Ann decides to buy
  9. Juan tosses a fair coin five times. What is the probability
                                                                        a lottery ticket for which she selects seven integers from 1 to
the number of heads always exceeds the number of tails as each
                                                                        80 inclusive. The state lottery commission then selects 11 of
outcome is observed?                                                    these 80 integers. If Ann’s selection matches seven of these 1 |
10. Three types of foam are tested to see if they meet specifi-         integers she is a winner, What is the probability Ann is a winner?
cations. Table 3.5 summarizes the results for the 125 samples
                                                                        16. Let S be the sample space for an experiment © and let
tested.
                                                                        A, B CF with A C B. Prove that Pr(A) < Pr(B).
            Table 3.5                                                   17. Let F   be the sample     space     for an experiment    6, and
                                                                        let A, BCY. If Pr(A) =0.7 and Pr(B) = 0.5, prove that
                            Specifications Are Met
                                                                        Pr(AN B) > 0.2.
                                No           Yes

Foam | |            5            60
            Type | 2            7            30
                        3        8           15
166         Chapter 3. Set Theory

3.6 Conditional Probability: Independence (Optional)
Throughout Sections 3.4 and 3.5 (especially prior to and at the end of Example 3.35, as well as in and after Example 3.41) we mentioned the idea of the independence of outcomes. There we questioned whether the occurrence of a certain outcome might somehow affect the occurrence of another outcome. In this section we extend this idea from a single outcome to an event and make it more mathematically precise. To do so we proceed with the following.

EXAMPLE 3.45

Vincent rolls a pair of fair dice. The sample space $\mathcal{S}$ for this experiment is shown in Fig. 3.18, along with the events

    A:   The sum (on the faces) is at least 9.
    B:   A double is rolled.
Figure 3.18 [the 36 ordered pairs $(x, y)$, $1 \le x, y \le 6$, of the sample space, with the events A and B marked]

We see that $Pr(A) = \frac{10}{36} = \frac{5}{18}$, $Pr(B) = \frac{6}{36} = \frac{1}{6}$, and $Pr(A \cap B) = Pr(B \cap A) = \frac{2}{36} = \frac{1}{18}$.
    But now, instead of just asking about the probability of the occurrence of event B, we go one step further. Here we want to determine the probability of the occurrence of event B given the condition that event A has occurred. This conditional probability is denoted by $Pr(B \mid A)$ and may be determined as follows.
    The occurrence of event A reduces the sample space from the 36 equally-likely ordered pairs in $\mathcal{S}$ to the 10 equally-likely ordered pairs in A. Among the ordered pairs in A, two are also doubles, namely, (5, 5) and (6, 6). Consequently, the probability of B given A is $Pr(B \mid A) = \frac{2}{10}$, and we notice that

$$\frac{2}{10} = \frac{2/36}{10/36} = \frac{Pr(B \cap A)}{Pr(A)}.$$
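The counting in Example 3.45 can be checked by brute force. The following is a minimal sketch, assuming Python's standard `fractions` and `itertools` modules; the names `space`, `pr`, `direct`, and `ratio` are illustrative choices and do not come from the text.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely ordered pairs for two fair dice.
space = list(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event (a set of outcomes) when outcomes are equally likely."""
    return Fraction(len(event), len(space))

A = {(x, y) for x, y in space if x + y >= 9}  # the sum is at least 9
B = {(x, y) for x, y in space if x == y}      # a double is rolled

# Shrink the sample space to A and count the doubles that remain.
direct = Fraction(len(A & B), len(A))
# The ratio Pr(B ∩ A) / Pr(A) noticed at the end of the example.
ratio = pr(A & B) / pr(A)

print(pr(A), pr(B), pr(A & B), direct, ratio)
```

Both `direct` and `ratio` reduce to the same fraction, 2/10 in lowest terms, which is the observation the example closes with.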

Before we suggest the result at the end of Example 3.45 as a general formula, let us
                              consider a second example — one where the outcomes are not equally likely.

EXAMPLE 3.46

Lindsay has a coin that is biased with $Pr(H) = \frac{2}{3}$ and $Pr(T) = \frac{1}{3}$. She tosses this coin three times, where the result of each toss is independent of any preceding result. The eight possible outcomes in the sample space have the following probabilities:

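$$Pr(HHH) = \left(\tfrac{2}{3}\right)^3 = \tfrac{8}{27}$$
$$Pr(HHT) = Pr(HTH) = Pr(THH) = \left(\tfrac{2}{3}\right)^2\left(\tfrac{1}{3}\right) = \tfrac{4}{27}$$
$$Pr(HTT) = Pr(THT) = Pr(TTH) = \left(\tfrac{2}{3}\right)\left(\tfrac{1}{3}\right)^2 = \tfrac{2}{27}$$
$$Pr(TTT) = \left(\tfrac{1}{3}\right)^3 = \tfrac{1}{27}.$$

[Note that the sum of these probabilities is $\frac{8}{27} + 3\left(\frac{4}{27}\right) + 3\left(\frac{2}{27}\right) + \frac{1}{27} = \frac{27}{27} = 1$.]
    Consider the events
    A:   The first toss results in a head [so A = {HTT, HTH, HHT, HHH} and $Pr(A) = \frac{2}{27} + 2\left(\frac{4}{27}\right) + \frac{8}{27} = \frac{18}{27}$].
    B:   The number of heads is even [so B = {TTT, HHT, HTH, THH} and $Pr(B) = \frac{1}{27} + 3\left(\frac{4}{27}\right) = \frac{13}{27}$].
Furthermore, $A \cap B$ = {HTH, HHT} and $Pr(B \cap A) = Pr(A \cap B) = \frac{4}{27} + \frac{4}{27} = \frac{8}{27}$.
    To determine the conditional probability of B given A, that is, $Pr(B \mid A)$, we'll make A our new sample space and redefine the probability of the four outcomes in A as follows:

$$Pr'(HTT) = \frac{Pr(HTT)}{Pr(A)} = \frac{2/27}{18/27} = \frac{1}{9} \qquad Pr'(HTH) = \frac{Pr(HTH)}{Pr(A)} = \frac{4/27}{18/27} = \frac{2}{9}$$
$$Pr'(HHT) = \frac{Pr(HHT)}{Pr(A)} = \frac{4/27}{18/27} = \frac{2}{9} \qquad Pr'(HHH) = \frac{Pr(HHH)}{Pr(A)} = \frac{8/27}{18/27} = \frac{4}{9}$$

(We see that $Pr'(HTT) + Pr'(HTH) + Pr'(HHT) + Pr'(HHH) = \frac{1}{9} + \frac{2}{9} + \frac{2}{9} + \frac{4}{9} = 1$.)
Among the four outcomes in A, two of them satisfy the condition given in event B, namely, HTH and HHT, the outcomes in $B \cap A$. Consequently,

$$Pr(B \mid A) = Pr'(HTH) + Pr'(HHT) = \frac{2}{9} + \frac{2}{9} = \frac{4}{9} = \frac{8}{18} = \frac{8/27}{18/27} = \frac{Pr(B \cap A)}{Pr(A)}.$$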
Motivated by the final result in each of the last two examples, we now summarize the
underlying general procedure. We want a formula for Pr(B|A), the conditional probability
of the occurrence of event B given the occurrence of event A. Further, this formula should
help us avoid unnecessary calculations such as those in Example 3.46, where we recalculated
the probability of each outcome in A.
    Now once we know that the event A has occurred, the sample space $\mathcal{S}$ shrinks to the outcomes in A. If we divide the probability of each outcome in A by $Pr(A)$, as in Example 3.46, these new probabilities sum to 1, so A can serve as the new sample space. Further, suppose $e_1$, $e_2$ are two outcomes in $\mathcal{S}$ with $Pr(e_1) = k\,Pr(e_2)$, where k is a constant. If $e_1, e_2 \in A$, then within the new sample space A the probability of $e_1$ is still k times that of $e_2$.
    To calculate $Pr(B \mid A)$ we now consider those outcomes in event A that are in event B. This gives us the outcomes in event $B \cap A$ and leads us to the following.

If $\mathcal{S}$ is the sample space for an experiment $\mathcal{E}$ and $A, B \subseteq \mathcal{S}$, then the conditional probability of B given A is

$$Pr(B \mid A) = \frac{Pr(B \cap A)}{Pr(A)},$$

so long as $Pr(A) \neq 0$.

Further,

$$Pr(B \cap A) = Pr(A \cap B) = Pr(A)\,Pr(B \mid A),$$

and upon changing the roles of A and B we have

$$Pr(A \cap B) = Pr(B \cap A) = Pr(B)\,Pr(A \mid B).$$

The result

$$Pr(A)\,Pr(B \mid A) = Pr(A \cap B) = Pr(B)\,Pr(A \mid B)$$

is often called the multiplicative rule.
                                 Without realizing it, we actually used the multiplicative rule in Example 3.39 — in the
                             case where the motors were not replaced after inspection. The first part of our next example
                             now reinforces how we use this rule.

EXAMPLE 3.47

A cooler contains seven cans of cola and three cans of root beer. Without looking at the contents, Gustavo reaches in and withdraws one can for his friend Jody. Then he reaches in again to get a can for himself.
    Let A, B denote the events

    A:   The first selection is a can of cola.
    B:   The second selection is a can of cola.

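a) Using the multiplicative rule, the probability that Gustavo chooses two cans of cola is

$$Pr(A \cap B) = Pr(A)\,Pr(B \mid A) = \left(\tfrac{7}{10}\right)\left(\tfrac{6}{9}\right) = \tfrac{7}{15}.$$

   [Here $Pr(B \mid A) = 6/9$ because after the first can of cola is removed, the cooler then contains six cans of cola and three of root beer.]

b) The multiplicative rule and the additive rule (of Theorem 3.8) tell us that the probability Gustavo selects two cans of cola or two cans of root beer is

$$Pr(A \cap B) + Pr(\overline{A} \cap \overline{B}) = Pr(A)\,Pr(B \mid A) + Pr(\overline{A})\,Pr(\overline{B} \mid \overline{A}) = \left(\tfrac{7}{10}\right)\left(\tfrac{6}{9}\right) + \left(\tfrac{3}{10}\right)\left(\tfrac{2}{9}\right) = \tfrac{8}{15}.$$

c) Finally, let us determine $Pr(B)$. To do so we develop a new formula with the help of the Venn diagram (for a sample space $\mathcal{S}$ and events A, B) in Fig. 3.19. From the figure (and the laws of set theory) we see that $B = B \cap \mathcal{S} = B \cap (A \cup \overline{A}) = (B \cap A) \cup (B \cap \overline{A})$, where $(B \cap A) \cap (B \cap \overline{A}) = B \cap (A \cap \overline{A}) = B \cap \emptyset = \emptyset$. Since these two events are disjoint,

$$Pr(B) = Pr(B \cap A) + Pr(B \cap \overline{A}) = Pr(A)\,Pr(B \mid A) + Pr(\overline{A})\,Pr(B \mid \overline{A}) = \left(\tfrac{7}{10}\right)\left(\tfrac{6}{9}\right) + \left(\tfrac{3}{10}\right)\left(\tfrac{7}{9}\right) = \tfrac{7}{10}.$$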

Figure 3.19
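The three answers in Example 3.47 can also be obtained by listing every ordered pair of distinct cans, each pair being equally likely. This is an illustrative sketch only; the list `cans`, the helper `pr`, and the result names are assumptions made for the example, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations

cans = ["cola"] * 7 + ["root beer"] * 3

# Every ordered draw of two distinct cans (no replacement): 10 * 9 = 90 draws.
draws = list(permutations(range(len(cans)), 2))

def pr(pred):
    """Fraction of equally likely draws (first, second) satisfying pred."""
    hits = sum(1 for i, j in draws if pred(cans[i], cans[j]))
    return Fraction(hits, len(draws))

both_cola = pr(lambda first, second: first == second == "cola")  # part (a)
same_kind = pr(lambda first, second: first == second)            # part (b)
second_cola = pr(lambda first, second: second == "cola")         # part (c)

# Part (c) again, this time via Pr(A)Pr(B|A) + Pr(not A)Pr(B|not A).
by_total_probability = (Fraction(7, 10) * Fraction(6, 9)
                        + Fraction(3, 10) * Fraction(7, 9))

print(both_cola, same_kind, second_cola, by_total_probability)
```

Note that `second_cola` equals 7/10, the same as the probability that the first can is cola, even though the second draw depends on the first.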

The result at the end of Example 3.47, namely, for $A, B \subseteq \mathcal{S}$,

$$Pr(B) = Pr(A)\,Pr(B \mid A) + Pr(\overline{A})\,Pr(B \mid \overline{A}),$$

is referred to as the Law of Total Probability. Our next example shows how this result can be generalized.

EXAMPLE 3.48

Emilio is a system integrator for personal computers. As such he finds himself using keyboards from three companies. Company 1 supplies 60% of the keyboards, company 2 supplies 30% of the keyboards, and the remaining 10% comes from company 3. From past experience Emilio knows that 2% of company 1’s keyboards are defective, while the percentages of defective keyboards for companies 2, 3 are 3% and 5%, respectively. If one of Emilio’s computers is selected, at random, and then tested, what is the probability it has a defective keyboard?
    Let A denote the event

    A:   The keyboard comes from company 1.

Events B, C are defined similarly for companies 2, 3, respectively. Event D, meanwhile, is

    D:   The keyboard is defective.
Here we are interested in $Pr(D)$. Guided by the Venn diagram in Fig. 3.20, we see that $D = D \cap \mathcal{S} = D \cap (A \cup B \cup C) = (D \cap A) \cup (D \cap B) \cup (D \cap C)$. But here $A \cap B = A \cap C = B \cap C = \emptyset$. So now, for example, the Laws of Set Theory show us that $(D \cap A) \cap (D \cap B) = D \cap (A \cap B) = D \cap \emptyset = \emptyset$. Likewise, $(D \cap A) \cap (D \cap C) = (D \cap B) \cap (D \cap C) = \emptyset$, and $(D \cap A) \cap (D \cap B) \cap (D \cap C) = \emptyset$. Consequently, by Theorem 3.9, we have

$$Pr(D) = Pr(D \cap A) + Pr(D \cap B) + Pr(D \cap C) = Pr(A)\,Pr(D \mid A) + Pr(B)\,Pr(D \mid B) + Pr(C)\,Pr(D \mid C).$$

(Here we have the Law of Total Probability for three sets; that is, the sample space $\mathcal{S}$ is the union of three sets, any two of which are disjoint.)

Figure 3.20

From the information given at the start of this example we know that
                  Pr(A) = 0.6                 Pr(B) = 0.3                   Pr(C) = 0.1
                  Pr(D|A) = 0.02              Pr(D|B) = 0.03                Pr(D|C) = 0.05.
               So Pr(D) = (0.6)(0.02) + (0.3)(0.03) + (0.1)(0.05) = 0.026, and this tells us that 2.6%
               of the personal computers integrated by Emilio will have defective keyboards.
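The computation of $Pr(D)$ in Example 3.48 is a weighted sum over the three suppliers. A minimal sketch follows, assuming exact arithmetic with Python's `fractions` module; the function name `total_probability` and the dictionaries `share` and `defect_rate` are hypothetical names introduced here.

```python
from fractions import Fraction

# Pr(A), Pr(B), Pr(C): share of keyboards supplied by companies 1, 2, 3.
share = {1: Fraction(60, 100), 2: Fraction(30, 100), 3: Fraction(10, 100)}
# Pr(D|A), Pr(D|B), Pr(D|C): defect rate for each company.
defect_rate = {1: Fraction(2, 100), 2: Fraction(3, 100), 3: Fraction(5, 100)}

def total_probability(prior, likelihood):
    """Sum of Pr(A_i) * Pr(D | A_i) over a partition A_1, ..., A_n."""
    # The events must partition the sample space, so the priors sum to 1.
    assert sum(prior.values()) == 1
    return sum(prior[k] * likelihood[k] for k in prior)

pr_defective = total_probability(share, defect_rate)
print(pr_defective, float(pr_defective))
```

Because the function loops over whatever keys it is given, the same sketch covers a partition into any number of pairwise disjoint events.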

The next example takes us back to the situation in Example 3.48 and introduces us to Bayes’ Theorem. As with the Law of Total Probability, the situation here likewise generalizes; that is, when appropriate, Bayes’ Theorem may be applied to any sample space $\mathcal{S}$ that is decomposed into two or more events that are disjoint in pairs.

EXAMPLE 3.49

Referring back to the information in the preceding example, now we ask the question “If one of Emilio’s personal computers is found to have a defective keyboard, what is the probability that keyboard came from company 3?”
    Using the notation in Example 3.48 we see that here the given condition is D and that we want to find $Pr(C \mid D)$.
$$Pr(C \mid D) = \frac{Pr(C \cap D)}{Pr(D)} = \frac{Pr(C)\,Pr(D \mid C)}{Pr(A)\,Pr(D \mid A) + Pr(B)\,Pr(D \mid B) + Pr(C)\,Pr(D \mid C)}$$
$$= \frac{(0.1)(0.05)}{(0.6)(0.02) + (0.3)(0.03) + (0.1)(0.05)} = \frac{0.005}{0.026} = \frac{5}{26} \approx 0.192308.$$

[Before leaving this example let us observe a small point. Since we have a choice on how to rewrite the numerator of $\frac{Pr(C \cap D)}{Pr(D)}$, do we know we’ve made the correct choice? Yes! The other choice, namely, $Pr(C \cap D) = Pr(D)\,Pr(C \mid D)$, would tell us that $Pr(C \mid D) = \frac{Pr(C \cap D)}{Pr(D)} = \frac{Pr(D)\,Pr(C \mid D)}{Pr(D)} = Pr(C \mid D)$, a correct but not very useful result.]
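Bayes' Theorem as used in Example 3.49 can be sketched in a few lines: multiply each prior by its likelihood, then divide by the total. The function name `posterior` and the dictionary keys below are illustrative assumptions; the numbers are those given in Example 3.48.

```python
from fractions import Fraction

# Pr(A), Pr(B), Pr(C) and Pr(D|A), Pr(D|B), Pr(D|C) from Example 3.48.
prior = {"A": Fraction(6, 10), "B": Fraction(3, 10), "C": Fraction(1, 10)}
likelihood = {"A": Fraction(2, 100), "B": Fraction(3, 100), "C": Fraction(5, 100)}

def posterior(prior, likelihood):
    """Return Pr(A_k | D) for every event A_k in a partition of the sample space."""
    joint = {k: prior[k] * likelihood[k] for k in prior}  # Pr(A_k ∩ D)
    evidence = sum(joint.values())                        # Pr(D), by total probability
    return {k: joint[k] / evidence for k in joint}

post = posterior(prior, likelihood)
print(post["C"], round(float(post["C"]), 6))
```

The same call also yields $Pr(A \mid D)$ and $Pr(B \mid D)$, and the three conditional probabilities sum to 1 because A, B, C partition the sample space.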

Having dealt with the Law of Total Probability and Bayes’ Theorem, it is now time to settle the issue of independence. In our work on conditional probability we learned earlier that for events A, B, taken from a sample space $\mathcal{S}$, $Pr(A \cap B) = Pr(A)\,Pr(B \mid A)$. Should the occurrence of event A have no effect on that of B, we have $Pr(B \mid A) = Pr(B)$, and so event B is independent of event A. These considerations now guide us to the following.

Definition 3.12   Given a sample space $\mathcal{S}$ with events $A, B \subseteq \mathcal{S}$, we call A, B independent when

$$Pr(A \cap B) = Pr(A)\,Pr(B).$$

For $A, B \subseteq \mathcal{S}$, the general situation has $Pr(B)\,Pr(A \mid B) = Pr(B \cap A) = Pr(A \cap B) = Pr(A)\,Pr(B \mid A)$. Using this and the result in Definition 3.12 we now have three ways to decide when A, B are independent:

    1) $Pr(A \cap B) = Pr(A)\,Pr(B)$;
    2) $Pr(A \mid B) = Pr(A)$; or
    3) $Pr(B \mid A) = Pr(B)$.

We also realize that A is independent of B if and only if B is independent of A.

Our next example uses the preceding discussion to decide whether two events are independent.

EXAMPLE 3.50

Suppose Arantxa tosses a fair coin three times. Here the sample space $\mathcal{S}$ = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}, where each outcome has probability $\frac{1}{8}$.
    Consider the events
    A:   The first toss is H:    A = {HHH, HHT, HTH, HTT} and $Pr(A) = \frac{1}{2}$;
    B:   The second toss is H:   B = {HHH, HHT, THH, THT} and $Pr(B) = \frac{1}{2}$;

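    C:   There are at least two H’s:   C = {HHT, HTH, THH, HHH} and $Pr(C) = \frac{1}{2}$.

a) $A \cap B$ = {HHH, HHT}, so $Pr(A \cap B) = \frac{1}{4} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)\,Pr(B)$. Consequently, the events A, B are independent.
b) $A \cap C$ = {HHH, HHT, HTH}, so $Pr(A \cap C) = \frac{3}{8} \neq \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)\,Pr(C)$. Therefore, the events A, C are not independent.
c) Likewise, $Pr(B \cap C) = \frac{3}{8} \neq \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(B)\,Pr(C)$, so B, C are also not independent.
d) The event $\overline{B}$ = {TTT, TTH, HTT, HTH} and $Pr(\overline{B}) = \frac{1}{2}$. Further, $A \cap \overline{B}$ = {HTH, HTT} with $Pr(A \cap \overline{B}) = \frac{1}{4} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)\,Pr(\overline{B})$. So not only are the events A, B independent but the events A, $\overline{B}$ are also independent.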
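The checks in Example 3.50 all apply the same test, $Pr(E \cap F) = Pr(E)\,Pr(F)$, to different pairs of events. A small sketch under the example's assumptions (a fair coin, eight equally likely outcomes) is given below; the helper names `pr` and `independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# The eight equally likely outcomes of three tosses of a fair coin.
space = {"".join(t) for t in product("HT", repeat=3)}

def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(space))

def independent(E, F):
    """Definition 3.12: E, F are independent when Pr(E ∩ F) = Pr(E)Pr(F)."""
    return pr(E & F) == pr(E) * pr(F)

A = {o for o in space if o[0] == "H"}        # first toss is H
B = {o for o in space if o[1] == "H"}        # second toss is H
C = {o for o in space if o.count("H") >= 2}  # at least two H's
not_A, not_B = space - A, space - B          # complements

print(independent(A, B), independent(A, C), independent(B, C))
print(independent(A, not_B), independent(not_A, B), independent(not_A, not_B))
```

The first line reproduces parts (a) through (c); the second line tests A, B against each other's complements, with part (d) as its first entry.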

The first part of the following theorem shows us that what has happened here in parts
                   (a) and (d) is not an isolated instance.

THEOREM 3.10   Let A, B be events taken from a sample space $\mathcal{S}$. If A, B are independent, then (a) A, $\overline{B}$ are independent; (b) $\overline{A}$, B are independent; and (c) $\overline{A}$, $\overline{B}$ are independent.
Proof: [We shall prove part (a) and leave the proofs of parts (b), (c) for the Section Exercises.]
    Since $A = A \cap \mathcal{S} = A \cap (B \cup \overline{B}) = (A \cap B) \cup (A \cap \overline{B})$ and $(A \cap B) \cap (A \cap \overline{B}) = A \cap (B \cap \overline{B}) = A \cap \emptyset = \emptyset$, we have $Pr(A) = Pr(A \cap B) + Pr(A \cap \overline{B})$. With A, B independent, it follows that $Pr(A \cap B) = Pr(A)\,Pr(B)$. The last two equations imply that $Pr(A \cap \overline{B}) = Pr(A) - Pr(A \cap B) = Pr(A) - Pr(A)\,Pr(B) = Pr(A)[1 - Pr(B)] = Pr(A)\,Pr(\overline{B})$. Consequently, from Definition 3.12 we know that A, $\overline{B}$ are independent.

Our next example will help motivate the idea of independence for three events.

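EXAMPLE 3.51

Tino and Monica each roll a fair die. If we let x denote the result of Tino’s roll and y that of Monica’s, then once again $\mathcal{S} = \{(x, y) \mid 1 \le x, y \le 6\}$. Now consider the events A, B, C:

    A:   Tino rolls a 1, 2, or 6.
    B:   Monica rolls a 3, 4, 5, or 6.
    C:   The sum of Tino’s and Monica’s rolls is 7.

Here $Pr(A) = \frac{18}{36} = \frac{1}{2}$, $Pr(B) = \frac{24}{36} = \frac{2}{3}$, and $Pr(C) = \frac{6}{36} = \frac{1}{6}$. Further,

    $A \cap B = \{(a, b) \mid a \in \{1, 2, 6\},\ b \in \{3, 4, 5, 6\}\}$, so $|A \cap B| = 12$ and $Pr(A \cap B) = \frac{12}{36} = \frac{1}{3} = \left(\frac{1}{2}\right)\left(\frac{2}{3}\right) = Pr(A)\,Pr(B)$, so A, B are independent;
    $A \cap C = \{(1, 6), (2, 5), (6, 1)\}$ and $Pr(A \cap C) = \frac{3}{36} = \frac{1}{12} = \left(\frac{1}{2}\right)\left(\frac{1}{6}\right) = Pr(A)\,Pr(C)$, making A, C independent;
    $B \cap C = \{(4, 3), (3, 4), (2, 5), (1, 6)\}$ and $Pr(B \cap C) = \frac{4}{36} = \frac{1}{9} = \left(\frac{2}{3}\right)\left(\frac{1}{6}\right) = Pr(B)\,Pr(C)$, so B, C are also independent.

Finally,

    $A \cap B \cap C = \{(1, 6), (2, 5)\}$ and $Pr(A \cap B \cap C) = \frac{2}{36} = \frac{1}{18} = \left(\frac{1}{2}\right)\left(\frac{2}{3}\right)\left(\frac{1}{6}\right) = Pr(A)\,Pr(B)\,Pr(C)$.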

What has happened in Example 3.51 leads us to the following.

Definition 3.13   For a sample space $\mathcal{S}$ and events $A, B, C \subseteq \mathcal{S}$, we say that A, B, C are independent if
    1) $Pr(A \cap B) = Pr(A)\,Pr(B)$;
    2) $Pr(A \cap C) = Pr(A)\,Pr(C)$;
    3) $Pr(B \cap C) = Pr(B)\,Pr(C)$; and
    4) $Pr(A \cap B \cap C) = Pr(A)\,Pr(B)\,Pr(C)$.

Looking back now at Example 3.51 we see that there we verified the independence of the
                              events A, B, C. But did we do too much? In particular, do we really need condition (4) in
                              Definition 3.13? Perhaps we may feel that the first three conditions are enough to ensure the
                              fourth condition. But, perhaps, they are not enough. The next example will help us settle
                              this issue.

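EXAMPLE 3.52

Adira tosses a fair coin four times. So in this case the sample space $\mathcal{S} = \{x_1x_2x_3x_4 \mid x_i \in \{H, T\},\ 1 \le i \le 4\}$.
    Let $A, B, C \subseteq \mathcal{S}$ be the events:
    A:   Adira’s first toss is a tail (T);
    B:   Adira’s last toss is a tail (T); and
    C:   The four tosses yield two heads and two tails.

For these events we find that $Pr(A) = \frac{8}{16} = \frac{1}{2}$, $Pr(B) = \frac{8}{16} = \frac{1}{2}$, and $Pr(C) = \frac{1}{16}\binom{4}{2} = \frac{3}{8}$.

In addition,
    $Pr(A \cap B) = \frac{4}{16} = \frac{1}{4} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)\,Pr(B)$,
    $Pr(A \cap C) = \frac{3}{16} = \left(\frac{1}{2}\right)\left(\frac{3}{8}\right) = Pr(A)\,Pr(C)$, and
    $Pr(B \cap C) = \frac{3}{16} = \left(\frac{1}{2}\right)\left(\frac{3}{8}\right) = Pr(B)\,Pr(C)$.
However, $A \cap B \cap C$ = {THHT} and $Pr(A \cap B \cap C) = \frac{1}{16} \neq \frac{3}{32} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right)\left(\frac{3}{8}\right) = Pr(A)\,Pr(B)\,Pr(C)$. So while the three events in Example 3.51 are independent, the three events in this example are (mutually) independent in pairs, but not independent.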
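The contrast between Examples 3.51 and 3.52 can be checked mechanically against the four conditions of Definition 3.13. The sketch below assumes equally likely outcomes in both sample spaces; `check_three` is a hypothetical helper that reports whether conditions (1) to (3) hold and, separately, whether condition (4) holds.

```python
from fractions import Fraction
from itertools import product

def check_three(space, A, B, C):
    """Return (pairwise, triple): conditions (1)-(3) and condition (4) of Definition 3.13."""
    def pr(event):
        return Fraction(len(event), len(space))
    pairwise = all(pr(X & Y) == pr(X) * pr(Y)
                   for X, Y in ((A, B), (A, C), (B, C)))
    triple = pr(A & B & C) == pr(A) * pr(B) * pr(C)
    return pairwise, triple

# Example 3.51: Tino's roll x and Monica's roll y.
dice = set(product(range(1, 7), repeat=2))
A1 = {(x, y) for x, y in dice if x in {1, 2, 6}}
B1 = {(x, y) for x, y in dice if y in {3, 4, 5, 6}}
C1 = {(x, y) for x, y in dice if x + y == 7}

# Example 3.52: four tosses of a fair coin.
coins = {"".join(t) for t in product("HT", repeat=4)}
A2 = {o for o in coins if o[0] == "T"}         # first toss is a tail
B2 = {o for o in coins if o[-1] == "T"}        # last toss is a tail
C2 = {o for o in coins if o.count("H") == 2}   # two heads and two tails

print(check_three(dice, A1, B1, C1))   # Example 3.51
print(check_three(coins, A2, B2, C2))  # Example 3.52
```

For Example 3.52 the pairwise flag is true while the triple flag is false, so condition (4) cannot be dropped from the definition.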

In closing this section we provide a summary of the probability rules and laws we have
                              learned in this and the preceding section.

Summary of Probability Rules and Laws
    1) The Rule of Complement: $Pr(\overline{A}) = 1 - Pr(A)$
    2) The Additive Rule: $Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)$.
       When A, B are disjoint, $Pr(A \cup B) = Pr(A) + Pr(B)$.
    3) Conditional Probability: $Pr(A \mid B) = \dfrac{Pr(A \cap B)}{Pr(B)}$, $Pr(B) \neq 0$
    4) Multiplicative Rule: $Pr(A)\,Pr(B \mid A) = Pr(A \cap B) = Pr(B)\,Pr(A \mid B)$.
       When A, B are independent, $Pr(A \cap B) = Pr(A)\,Pr(B)$.

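    5) The Law of Total Probability: $Pr(B) = Pr(A)\,Pr(B \mid A) + Pr(\overline{A})\,Pr(B \mid \overline{A})$
    6) The Law of Total Probability (Extended Version): If $A_1, A_2, \ldots, A_n \subseteq \mathcal{S}$, where $n \ge 3$, $A_i \cap A_j = \emptyset$ for all $1 \le i < j \le n$, and $\mathcal{S} = \bigcup_{i=1}^{n} A_i$, then for any event B,

$$Pr(B) = Pr(A_1)\,Pr(B \mid A_1) + \cdots + Pr(A_n)\,Pr(B \mid A_n) = \sum_{i=1}^{n} Pr(A_i)\,Pr(B \mid A_i).$$

    7) Bayes’ Theorem:

$$Pr(A \mid B) = \frac{Pr(A \cap B)}{Pr(B)} = \frac{Pr(A)\,Pr(B \mid A)}{Pr(A)\,Pr(B \mid A) + Pr(\overline{A})\,Pr(B \mid \overline{A})}$$

    8) Bayes’ Theorem (Extended Version): If $A_1, A_2, \ldots, A_n \subseteq \mathcal{S}$, where $n \ge 3$, $A_i \cap A_j = \emptyset$ for all $1 \le i < j \le n$, and $\mathcal{S} = \bigcup_{i=1}^{n} A_i$, then for any event B, and each $1 \le k \le n$,

$$Pr(A_k \mid B) = \frac{Pr(A_k \cap B)}{Pr(B)} = \frac{Pr(A_k)\,Pr(B \mid A_k)}{Pr(A_1)\,Pr(B \mid A_1) + \cdots + Pr(A_n)\,Pr(B \mid A_n)} = \frac{Pr(A_k)\,Pr(B \mid A_k)}{\sum_{i=1}^{n} Pr(A_i)\,Pr(B \mid A_i)}.$$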

EXERCISES 3.6

1. Recall that in a standard deck of 52 cards there are 12 picture cards: four each of jacks, queens, and kings. Kevin draws one card from the deck. Find the probability his card is a king if we know that the card drawn is an ace or a picture card.

2. Let A, B be events taken from a sample space $\mathcal{S}$. If $Pr(A) = 0.6$, $Pr(B) = 0.4$, and $Pr(A \cup B) = 0.7$, find $Pr(A \mid B)$ and $Pr(\overline{A} \mid \overline{B})$.

3. If Coach Mollet works his football team throughout August, then the probability the team will be the division champion is 0.75. The probability the coach will work his team throughout August is 0.80. What is the probability Coach Mollet works his team throughout August and the team finishes as the division champion?

4. The 420 freshmen at an engineering college take either calculus or discrete mathematics (but not both). Further, both courses are offered providing either an introduction to a CAS (computer algebra system) or using such a system extensively throughout the course. The results in Table 3.6 summarize how the 420 freshmen are distributed.

    Table 3.6
                              CAS               CAS
                              (Introduction)    (Extensive Coverage)
    Calculus                  170               120
    Discrete Mathematics      80                50

    a) If Sandrine is taking calculus, what is the probability her class is only being introduced to the use of a CAS?
    b) Derek’s class is making extensive use of the CAS. What is the probability Derek is taking discrete mathematics?

5. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let A, B be events from $\mathcal{S}$. If A, B are independent, prove that

$$Pr(A \cup B) = Pr(A) + Pr(\overline{A})\,Pr(B) = Pr(B) + Pr(\overline{B})\,Pr(A).$$

6. Ceilia tosses a fair coin five times. What is the probability she gets three heads, if the first toss results in (a) a head; (b) a tail?

7. One bag contains 15 identical (in shape) coins: nine of silver and six of gold. A second bag contains 16 more of these coins: six silver and 10 gold. Bruno reaches in and selects one coin from the first bag and then places it in the second bag. Then Madeleine selects one coin from this second bag.
    a) What is the probability Madeleine selected a gold coin?
    b) If Madeleine’s coin is gold, what is the probability Bruno had selected a gold coin?

8. A coin is loaded so that $Pr(H) = 2/3$ and $Pr(T) = 1/3$. Todd tosses this coin twice. Let A, B be the events
    A:   The first toss is a tail.
    B:   Both tosses are the same.
Are A, B independent?

9. Suppose that A, B are independent with $Pr(A \cup B) = 0.6$ and $Pr(A) = 0.3$. Find $Pr(B)$.

10. Alice tosses a fair coin seven times. Find the probability she gets four heads given that (a) her first toss is a head; (b) her first and last tosses are heads.

11. Paulo tosses a fair coin five times. If A, B denote the events
      A:    Paulo gets an odd number of tails.
      B:    Paulo’s first toss is a tail.
are A, B independent?

12. The probability that a certain mechanical component fails when first used is 0.05. If the component does not fail immediately, the probability it will function correctly for at least one year is 0.98. What is the probability that a new component functions correctly for at least one year?

13. Paul has two coolers. The first contains eight cans of cola and three cans of lemonade. The second cooler contains five cans of cola and seven cans of lemonade. Paul randomly selects one can from the first cooler and puts it into the second cooler. Five minutes later Betty randomly selects two cans from the second cooler. If both of Betty’s selections are cans of cola, what is the probability Paul initially selected a can of lemonade?

14. Let 𝒮 be the sample space for an experiment ℰ and let A, B, C ⊆ 𝒮. If events A, B are independent, events A, C are disjoint, and events B, C are independent, find Pr(B) if Pr(A) = 0.2, Pr(C) = 0.4, and Pr(A ∪ B ∪ C) = 0.8.

15. An electronic system is made up of two components connected in parallel. Consequently, the system fails only when both of the components fail. The probability the first component fails is 0.05 and, when this happens, the probability the second component fails is 0.02. What is the probability the electronic system fails?

16. Gayla has a bag of 19 marbles of the same size. Nine of these marbles are red, six blue, and four white. She randomly selects three of the marbles, without replacement, from the bag. What is the probability Gayla has withdrawn more red than white marbles?

17. Let A, B, C be independent events taken from a sample space 𝒮. If Pr(A) = 1/8, Pr(B) = 1/4, and Pr(A ∪ B ∪ C) = 1/2, find Pr(C).

18. A company involved in the integration of personal computers gets its graphics cards from three sources. The first source provides 20% of the cards, the second source 35%, and the third source 45%. Past experience has shown that 5% of the cards from the first source are found to be defective, while those from the second and third sources are found to be defective 3% and 2%, respectively, of the time.
       a) What percentage of the company’s graphics cards are defective?
       b) If a graphics card is selected and found to be defective, what is the probability it was provided by the third source?

19. Gustavo tosses a fair coin twice. For this experiment consider the following events:
      A:    The first toss is a head.
      B:    The second toss is a tail.
      C:    The tosses result in one head and one tail.
Are the events A, B, and C independent?

20. Three missiles are fired at an enemy arsenal. The probabilities the individual missiles will hit the arsenal are 0.75, 0.85, and 0.9. Find the probability that at least two of the missiles hit the arsenal.

21. Dustin and Jennifer each toss three fair coins. What is the probability (a) each of them gets the same number of heads? (b) Dustin gets more heads than Jennifer? (c) Jennifer gets more heads than Dustin?

22. Tiffany and four of her cousins play the game of “odd person out” to determine who will rake up the leaves at their grandmother Mary Lou’s home. Each cousin tosses a fair coin. If the outcome for one cousin is different from that of the other four, then this cousin has to rake the leaves. What is the probability that a “lucky” cousin is determined after the coins are flipped only once?

23. Ninety percent of new airport-security personnel have had prior training in weapon detection. During their first month on the job, personnel without prior training fail to detect a weapon 3% of the time, while those with prior training fail only 0.5% of the time. What is the probability a new airport-security employee, who fails to detect a weapon during the first month on the job, has had prior training in weapon detection?

24. The binary string 101101, where the string is unchanged upon reversing order, is called a palindrome (of length 6). Suppose a binary string of length 6 is randomly generated, with 0, 1 equally likely for each of the six positions in the string. What is the probability the string is a palindrome if the first and sixth bits (a) are both 1; (b) are the same?

25. In defining the notion of independence for three events we found (in Definition 3.13) that we had to check four conditions. If there are four events, say E₁, E₂, E₃, E₄, then we have to check 11 conditions: six of the form Pr(Eᵢ ∩ Eⱼ) = Pr(Eᵢ)Pr(Eⱼ), 1 ≤ i < j ≤ 4; four of the form Pr(Eᵢ ∩ Eⱼ ∩ Eₖ) = Pr(Eᵢ)Pr(Eⱼ)Pr(Eₖ), 1 ≤ i < j < k ≤ 4; and Pr(E₁ ∩ E₂ ∩ E₃ ∩ E₄) = Pr(E₁)Pr(E₂)Pr(E₃)Pr(E₄). (a) How many conditions need to be checked for the independence of five events? (b) How many for n events, where n ≥ 2?

26. Let A, B be events taken from a sample space 𝒮. If Pr(A ∩ B) = 0.1 and Pr(Ā ∩ B̄) = 0.3, what is Pr(A △ B | A ∪ B)?

27. Urn 1 contains 14 envelopes (of the same size): six each contain $1 and the other eight each contain $5. Urn 2 contains eight envelopes (of the same size as those in urn 1): three each contain $1 and the other five each contain $5. Three envelopes are randomly selected from urn 1 and transferred to urn 2. If Carmen now draws one envelope from urn 2, what is the probability her selection contains $1?

28. Let A, B be events taken from a sample space 𝒮 (with Pr(A) > 0 and Pr(B) > 0). If Pr(B|A) < Pr(B), prove that Pr(A|B) < Pr(A).

29. Let A, B be events taken from a sample space 𝒮. If Pr(A) = 0.5, Pr(B) = 0.3, and Pr(A|B) + Pr(B|A) = 0.8, what is Pr(A ∩ B)?

30. Let 𝒮 be the sample space for an experiment ℰ, with events A, B ⊆ 𝒮. If Pr(A|B) = Pr(A △ B) = 0.5 and Pr(A ∪ B) = 0.7, determine Pr(A) and Pr(B).

3.7 Discrete Random Variables (Optional)
                          In this section we introduce a fundamental idea in the study of probability and statistics —
                          namely, the random variable. Since we are dealing exclusively with discrete sample spaces,
                          we shall deal only with discrete random variables. Consequently, whenever the term ran-
                          dom variable arises, it is understood that it is a discrete random variable — that is, a random
                          variable defined for a discrete sample space. [Those interested in continuous random vari-
                          ables should consult the chapter references. Chapter 3 of the text by John J. Kinney [7] is
                          an excellent starting point.]
                             We introduce the concept of a random variable in an informal way. The following
                          example will help us do this.

EXAMPLE 3.53              If Keshia tosses a fair coin four times, the sample space for this random experiment may
                          be given as

                              𝒮 = {HHHH,
                                   HHHT, HHTH, HTHH, THHH,
                                   HHTT, HTHT, HTTH, THHT, THTH, TTHH,
                                   HTTT, THTT, TTHT, TTTH,
                                   TTTT}.

                          Now, for each of the 16 strings of H’s and T’s in 𝒮, we define the random variable X as
                          follows:
                             For x₁x₂x₃x₄ ∈ 𝒮, X(x₁x₂x₃x₄) counts the number of H’s that appear among the four
                          components x₁, x₂, x₃, x₄. Consequently,

X (HHHH)       = 4,
                              X (HHHT)       = X (HHTH)           = X (HTHH)    = X (THHH)       = 3,
                              X (HHTT) = X (HTHT)             = X(HTTH)        = X(THHT)        = X(THTH)        = X(TTHH)         = 2,
                              X (HTTT) = X(THTT)              = X(TTHT)        = X(TTTH)      = 1, and
                              X(TTTT)      = 0.

We see that X associates† each of the 16 strings of H’s and T’s in 𝒮 with one of the
                          nonnegative integers in {0, 1, 2, 3, 4} (a subset of R). This allows us to think of an outcome
                          in 𝒮 in terms of a real number. Further, suppose we are interested in the event

                              A:     the four tosses result in two H’s and two T’s.

† This association by X between the strings in 𝒮 and the nonnegative integers 0, 1, 2, 3, 4 is an example of
                          a function, an idea to be covered in detail in Chapter 5. In general, a random variable is a function from the
                          sample space 𝒮 of an experiment ℰ to R, the set of real numbers. The domain of any random variable X is 𝒮 and
                          the codomain is always R. The range in this case is {0, 1, 2, 3, 4}. (The concepts of domain, codomain, and range
                          are formally defined in Section 5.2.)

In our earlier work we might have described this event by writing

A = {HHTT, HTHT, HTTH, THHT, THTH, TTHH}.

Now we can summarize the six outcomes in this event by writing A =
                             {x₁x₂x₃x₄ | X(x₁x₂x₃x₄) = 2}, and this may be abbreviated to A = {x₁x₂x₃x₄ | X = 2}. Also,
                             we express Pr(A), in terms of the random variable X, as Pr(X = 2). So here we have
                             Pr(A) = Pr(X = 2) = 6/16 = 3/8. Similarly, it follows that Pr(X = 4) = 1/16 since
                             there is only one outcome for this case, namely, HHHH.
                                 The following provides what we call the probability distribution for this particular ran-
                             dom variable X.
                                   x          Pr(X =x)
                                   0              1/16
                                    1         4/16  = 1/4
                                   2           6/16 = 3/8
                                   3          4/16  = 1/4
                                   4              1/16

Observe how \(\sum_{x=0}^{4} \Pr(X = x) = 1\), in agreement with axiom (2) of Section 3.5. Also, it
                             is understood that Pr(X = x) = 0 for x ≠ 0, 1, 2, 3, 4.
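
The table above can be reproduced by brute force. The following is a minimal sketch, not part of the text's development: it lists the 16 outcomes, applies the random variable X (the count of H's) to each, and tallies the results. The function name `heads_distribution` is chosen here only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(tosses=4):
    """Return {x: Pr(X = x)}, where X counts the H's in `tosses` fair-coin tosses."""
    # The sample space: every string of H's and T's of the given length.
    outcomes = ["".join(s) for s in product("HT", repeat=tosses)]
    # Apply the random variable X to each outcome and tally the values.
    counts = Counter(outcome.count("H") for outcome in outcomes)
    # All outcomes are equally likely, so Pr(X = x) = (number of outcomes with X = x) / |S|.
    return {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}


dist = heads_distribution()
for x, p in dist.items():
    print(x, p)
print("total:", sum(dist.values()))
```

Exact fractions are used instead of floating-point numbers so that the values can be compared directly with the entries 1/16, 1/4, and 3/8 in the table.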

Let us now reinforce what we have learned by considering a second example.

EXAMPLE 3.54                 Suppose Giorgio rolls a pair of fair dice. This experiment was examined earlier, for
                             instance, in Examples 3.33 and 3.45. The sample space here comprises 36 ordered pairs
                             and may be expressed as 𝒮 = {(x, y) | 1 ≤ x ≤ 6, 1 ≤ y ≤ 6}.
                                 We define the random variable X, for each ordered pair (x, y) in 𝒮, by X((x, y)) =
                             x + y, the sum of the numbers that appear on the (tops of) two fair dice. Then X takes on
                             the following values:
                                   X((1, 1)) = 2
                                   X((1, 2)) = X((2, 1)) = 3
                                   X((1, 3)) = X((2, 2)) = X((3, 1)) = 4
                                   X((1, 4)) = X((2, 3)) = X((3, 2)) = X((4, 1)) = 5
                                   X((1, 5)) = X((2, 4)) = X((3, 3)) = X((4, 2)) = X((5, 1)) = 6
                                   X((1, 6)) = X((2, 5)) = X((3, 4)) = X((4, 3)) = X((5, 2)) = X((6, 1)) = 7
                                   X((2, 6)) = X((3, 5)) = X((4, 4)) = X((5, 3)) = X((6, 2)) = 8
                                   X((3, 6)) = X((4, 5)) = X((5, 4)) = X((6, 3)) = 9
                                   X((4, 6)) = X((5, 5)) = X((6, 4)) = 10
                                   X((5, 6)) = X((6, 5)) = 11
                                   X((6, 6)) = 12
                             The probability distribution for X is as follows:
                                    Pr(X = 2) = 1/36           Pr(X = 6) = 5/36             Pr(X = 10) = 3/36
                                    Pr(X = 3) = 2/36           Pr(X = 7) = 6/36             Pr(X = 11) = 2/36
                                    Pr(X = 4) = 3/36           Pr(X = 8) = 5/36             Pr(X = 12) = 1/36
                                   Pr(X =5) = 4/36             Pr(X = 9) = 4/36

This can be abbreviated somewhat by

\[
\Pr(X = x) =
\begin{cases}
\dfrac{x - 1}{36}, & x = 2, 3, 4, 5, 6, 7 \\[1ex]
\dfrac{12 - (x - 1)}{36}, & x = 8, 9, 10, 11, 12.
\end{cases}
\]

Note that \(\sum_{x=2}^{12} \Pr(X = x) = 1\).
                      Having finished with describing X and its probability distribution, now let us consider
                  the events:
                      B:    Giorgio rolls an 8 — that is, the sum of the two dice is 8.
                      C:    Giorgio rolls at least a 10.
                  The event B = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)} and Pr(B) = Pr(X = 8) = 5/36.
                  Meanwhile C = {(4, 6), (5, 5), (6, 4), (5, 6), (6, 5), (6, 6)} and Pr(C) = 6/36 = 3/36
                  + 2/36 + 1/36 = Pr(X = 10) + Pr(X = 11) + Pr(X = 12) = Pr(10 ≤ X ≤ 12) =
                  \(\sum_{x=10}^{12} \Pr(X = x) = \sum_{x \ge 10} \Pr(X = x)\).
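
As a check on Example 3.54, the sketch below enumerates the 36 ordered pairs, builds the distribution of the sum X, compares it with the two-case formula for Pr(X = x), and then evaluates Pr(B) and Pr(C). The helper name `closed_form` and the variable names are illustrative assumptions, not notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The sample space: all ordered pairs (x, y) with 1 <= x, y <= 6.
pairs = list(product(range(1, 7), repeat=2))

# X((x, y)) = x + y; tally how many pairs give each sum.
counts = Counter(x + y for x, y in pairs)
pr = {s: Fraction(c, len(pairs)) for s, c in counts.items()}


def closed_form(x):
    """The two-case formula for Pr(X = x), valid for x = 2, ..., 12."""
    return Fraction(x - 1, 36) if x <= 7 else Fraction(12 - (x - 1), 36)


# Event B: the sum is 8.  Event C: the sum is at least 10.
pr_B = pr[8]
pr_C = sum(p for s, p in pr.items() if s >= 10)

print("formula agrees:", all(pr[x] == closed_form(x) for x in range(2, 13)))
print("Pr(B) =", pr_B)
print("Pr(C) =", pr_C)
```

Note that `Fraction` reduces automatically, so Pr(C) = 6/36 is reported in lowest terms.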

The preceding two examples have shown us how a random variable may be described by
                  its probability distribution. Now we shall see how a random variable can be characterized
                  by means of two measures: its expected value, a measure of central tendency, and its
                  variance, a measure of dispersion.

When a fair coin is tossed 10 times, our intuition may suggest that we expect to get
                  five heads and five tails. Yet we know that we could actually see 10 heads, although the
                  probability for this outcome is only \(\binom{10}{10}(1/2)^{10} = 1/1024 \approx 0.000977\), while the probability
                  for five heads and five tails is substantially higher, at \(\binom{10}{5}(1/2)^{10} = 63/256 \approx 0.246094\).
                  Similarly, we may want to know how many times we might expect to see a 6 when a fair
                  die is rolled 50 times. To deal with such concerns we introduce the following idea.

Definition 3.14   Let X be a random variable defined for the outcomes in a sample space 𝒮. The mean, or
                  expected value, of X is

\[
E(X) = \sum_{x} x \cdot \Pr(X = x),
\]

where the sum is taken over all the values x determined by the random variable X†.

The following example deals with E(X) in several different situations.

† One finds the terms mean (value) and expectation also used to describe E(X), as well as the alternate notation
                  \(\mu_X\). Further, although our discussion deals solely with finite sample spaces, the above formula is valid for countably
                  infinite sample spaces, so long as the infinite sum converges.

EXAMPLE 3.55
   a) If a fair coin is tossed once and X counts the number of heads that appear, then
      𝒮 = {H, T}, X(H) = 1, X(T) = 0, Pr(X = 0) = Pr(X = 1) = 1/2, and

\[
E(X) = \sum_{x=0}^{1} x \cdot \Pr(X = x) = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2}.
\]

      Note that E(X) is neither 0 nor 1.

   b) If one fair die is rolled, then 𝒮 = {1, 2, 3, 4, 5, 6}. Further, for each 1 ≤ i ≤ 6, we
      have X(i) = i and Pr(X = i) = 1/6. So here

\[
E(X) = \sum_{x=1}^{6} x \cdot \Pr(X = x)
     = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6}
     = \frac{1}{6}(1 + 2 + \cdots + 6) = \frac{21}{6} = \frac{7}{2}.
\]

      Note, once again, that E(X) is not among the values determined by the random variable X.

   c) Suppose now we have a loaded die, where the probability of rolling the number i
      is proportional to i. As in part (b), 𝒮 = {1, 2, 3, 4, 5, 6} and X(i) = i for 1 ≤ i ≤ 6.
      However, here, if p is the probability of rolling 1, then ip is the probability of rolling
      i, for each of the other five outcomes i, where 2 ≤ i ≤ 6. From axiom (2),
      \(1 = \sum_{i=1}^{6} ip = p(1 + 2 + \cdots + 6) = 21p\), so p = 1/21 and Pr(X = i) = i/21,
      1 ≤ i ≤ 6. Consequently,

\[
E(X) = \sum_{x=1}^{6} x \cdot \Pr(X = x)
     = 1 \cdot \frac{1}{21} + 2 \cdot \frac{2}{21} + \cdots + 6 \cdot \frac{6}{21}
     = \frac{1 + 4 + 9 + 16 + 25 + 36}{21} = \frac{91}{21} = \frac{13}{3}.
\]

   d) Consider the random variable X in Example 3.53, where a fair coin was tossed four
      times. Then here

\[
E(X) = \sum_{x=0}^{4} x \cdot \Pr(X = x)
     = 0 \cdot \frac{1}{16} + 1 \cdot \frac{4}{16} + 2 \cdot \frac{6}{16} + 3 \cdot \frac{4}{16} + 4 \cdot \frac{1}{16}
     = \frac{0 + 4 + 12 + 12 + 4}{16} = 2.
\]

      In this case E(X) is found to be among the values determined by the random variable X.

   e) Finally, for Example 3.54, where Giorgio rolled a pair of fair dice, we find that

\[
E(X) = 2 \cdot \frac{1}{36} + 3 \cdot \frac{2}{36} + \cdots + 7 \cdot \frac{6}{36} + 8 \cdot \frac{5}{36} + \cdots + 11 \cdot \frac{2}{36} + 12 \cdot \frac{1}{36}
     = \frac{252}{36} = 7.
\]
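
The five computations in Example 3.55 all apply the same formula from Definition 3.14 to different probability distributions. The sketch below makes that explicit. It assumes each distribution is stored as a dictionary mapping a value x to Pr(X = x); the function name `expected_value` and the dictionary names are illustrative only.

```python
from fractions import Fraction
from math import comb


def expected_value(dist):
    """E(X) = sum of x * Pr(X = x) over the values x of the random variable."""
    return sum(x * p for x, p in dist.items())


# (a) one toss of a fair coin, X = number of heads
fair_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
# (b) one roll of a fair die
fair_die = {i: Fraction(1, 6) for i in range(1, 7)}
# (c) loaded die with Pr(X = i) proportional to i, that is, i/21
loaded_die = {i: Fraction(i, 21) for i in range(1, 7)}
# (d) number of heads in four tosses of a fair coin (Example 3.53)
four_tosses = {x: Fraction(comb(4, x), 16) for x in range(5)}
# (e) sum of two fair dice (Example 3.54): 6 - |s - 7| of the 36 pairs give sum s
two_dice = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

for name, dist in [("(a)", fair_coin), ("(b)", fair_die), ("(c)", loaded_die),
                   ("(d)", four_tosses), ("(e)", two_dice)]:
    print(name, expected_value(dist))
```

The expression `6 - abs(s - 7)` is simply a compact way of writing the counts 1, 2, ..., 6, ..., 2, 1 listed in Example 3.54.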

Before continuing let us recall from Section 3.5 that a Bernoulli trial is an experiment
                              with exactly two outcomes: success, with probability p, and failure, with probability
                              q = 1 − p. When such an experiment is performed n times, and the outcome of any one
                              trial is independent of the outcomes of any previous trials, then the probability that there
                              are (exactly) k successes among the n trials is \(\binom{n}{k} p^k q^{n-k}\), 0 ≤ k ≤ n.
                  Now if we consider the sample space of all \(2^n\) possibilities for the n outcomes of these
               n Bernoulli trials, then we can define the random variable X, where X counts the number
               of successes among the n trials. Under these circumstances X is called a binomial random
               variable and

\[
\Pr(X = x) = \binom{n}{x} p^x q^{n-x}, \qquad x = 0, 1, 2, \ldots, n.
\]

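This probability distribution is called the binomial probability distribution and it is com-
               pletely determined by the values of n and p. Further, it is precisely the type of probability
               distribution that occurs in Example 3.53, where we regard an H as a success and find that

\[
\begin{aligned}
\Pr(X = 0) &= \tfrac{1}{16} = \binom{4}{0}\left(\tfrac{1}{2}\right)^{0}\left(\tfrac{1}{2}\right)^{4}, &
\Pr(X = 3) &= \tfrac{1}{4} = \binom{4}{3}\left(\tfrac{1}{2}\right)^{3}\left(\tfrac{1}{2}\right)^{1}, \\
\Pr(X = 1) &= \tfrac{1}{4} = \binom{4}{1}\left(\tfrac{1}{2}\right)^{1}\left(\tfrac{1}{2}\right)^{3}, &
\Pr(X = 4) &= \tfrac{1}{16} = \binom{4}{4}\left(\tfrac{1}{2}\right)^{4}\left(\tfrac{1}{2}\right)^{0}, \\
\Pr(X = 2) &= \tfrac{3}{8} = \binom{4}{2}\left(\tfrac{1}{2}\right)^{2}\left(\tfrac{1}{2}\right)^{2}.
\end{aligned}
\]

The five previous results can be summarized by

\[
\Pr(X = x) = \binom{4}{x}\left(\tfrac{1}{2}\right)^{x}\left(\tfrac{1}{2}\right)^{4-x}, \qquad x = 0, 1, \ldots, 4.
\]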
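
The binomial probability distribution is short enough to state as a one-line function. The following is a sketch under the assumption that p is supplied as an exact `Fraction`; the name `binomial_pmf` is illustrative and does not come from the text.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p, x):
    """Pr(X = x) = C(n, x) * p^x * q^(n - x), where q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


# n = 4 tosses of a fair coin, with H regarded as a success (Example 3.53).
half = Fraction(1, 2)
table = [binomial_pmf(4, half, x) for x in range(5)]
for x, prob in enumerate(table):
    print(x, prob)
```

Because the five values are the terms in the binomial expansion of (p + q)^4 with p + q = 1, they necessarily sum to 1.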

But why should we be bringing all of this up here in the discussion on the expected value of
               a random variable? At this point notice that in part (d) of Example 3.55 we found E(X) = 2,
               where X is the binomial random variable described above. For this binomial random variable
               X we have n = 4 and p = 1/2. Is it just a coincidence here that E(X) = 2 = (4)(1/2) = np?
                   Suppose we were to roll a fair die 12 times and ask for the number of times we expect to
               see a 5 come up. Here the binomial random variable X would count the number of times a 5
               is rolled among the 12 rolls. Our intuition might suggest the answer is 2 = (12)(1/6) = np.
               But is this once again E(X) for this binomial random variable X? Instead of verifying this
               result directly, by using the formula in Definition 3.14, we shall obtain the result from
               the following theorem.

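THEOREM 3.11   Let X be the binomial random variable that counts the number of successes, each with
               probability p, among n Bernoulli trials. Then E(X) = np.
               Proof: From Definition 3.14 we have

\[
E(X) = \sum_{x=0}^{n} x \cdot \Pr(X = x) = \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x},
\]

where q = 1 − p. Since \(x \binom{n}{x} p^x q^{n-x} = 0\) when x = 0, it follows that

\[
\begin{aligned}
E(X) &= \sum_{x=1}^{n} x \binom{n}{x} p^x q^{n-x}
      = \sum_{x=1}^{n} x \cdot \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x} \\
     &= \sum_{x=1}^{n} \frac{n!}{(x-1)!\,(n-x)!}\, p^x q^{n-x}
      = np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!}\, p^{x-1} q^{n-x} \\
     &= np \sum_{y=0}^{n-1} \frac{(n-1)!}{y!\,(n-(y+1))!}\, p^{y} q^{n-(y+1)},
        \quad \text{upon substituting } y = x - 1 \text{ and realizing that } y \text{ varies from } 0 \text{ to } n-1 \text{ when } x \text{ varies from } 1 \text{ to } n \\
     &= np \sum_{y=0}^{n-1} \binom{n-1}{y} p^{y} q^{(n-1)-y}
      = np\,(p+q)^{n-1}, \quad \text{by the binomial theorem} \\
     &= np, \quad \text{since } p + q = 1.
\end{aligned}
\]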
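
Theorem 3.11 can be spot-checked numerically by evaluating the sum in Definition 3.14 directly and comparing it with np. The sketch below does this for a few choices of n and p; it illustrates the identity and is no substitute for the proof. The function name `binomial_mean` and the third test case (n = 30, p = 2/7) are arbitrary choices made here.

```python
from fractions import Fraction
from math import comb


def binomial_mean(n, p):
    """Evaluate E(X) = sum_{x=0}^{n} x * C(n, x) * p^x * q^(n - x) term by term."""
    q = 1 - p
    return sum(x * comb(n, x) * p**x * q**(n - x) for x in range(n + 1))


# (n, p) pairs: four coin tosses, twelve die rolls, and one arbitrary extra case.
cases = [(4, Fraction(1, 2)), (12, Fraction(1, 6)), (30, Fraction(2, 7))]
for n, p in cases:
    print(n, p, binomial_mean(n, p), n * p)
```

With exact fractions the two printed values agree exactly in each case, as the theorem guarantees.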

As a result of Theorem 3.11 we now know that upon rolling a fair die 12 times the num-
                              ber of 5’s we expect to see is (12)(1/6) = 2, as our intuition suggested earlier. Better still,
                              should we roll this fair die 1200 times and let the random variable Y count the num-
                              ber of 5’s that appear, then Y is a binomial random variable with n = 1200, p = 1/6, and
                              \(\Pr(Y = y) = \binom{1200}{y}\left(\tfrac{1}{6}\right)^{y}\left(\tfrac{5}{6}\right)^{1200-y}\), y = 0, 1, 2, ..., 1200. Further, instead of trying to
                              determine E(Y) by actually calculating \(\sum_{y=0}^{1200} y \binom{1200}{y}\left(\tfrac{1}{6}\right)^{y}\left(\tfrac{5}{6}\right)^{1200-y}\), we obtain E(Y) =
                              np = (1200)(1/6) = 200, quite readily from Theorem 3.11.

Having dealt with the concept of the mean, or expected value, of a random variable X, we turn now to the variance of X, a measure of how widely the values determined by X are dispersed or spread out. If X is the random variable defined on the sample space $\mathcal{S}_X = \{a, b, c\}$, where X(a) = −1, X(b) = 0, X(c) = 1, and Pr(X = x) = 1/3, for x = −1, 0, 1, then E(X) = 0. But then if Y is the random variable defined on the sample space $\mathcal{S}_Y = \{r, s, t, u, v\}$, where Y(r) = −4, Y(s) = −2, Y(t) = 0, Y(u) = 2, Y(v) = 4, and Pr(Y = y) = 1/5, for y = −4, −2, 0, 2, 4, we get the same mean; that is, E(Y) = 0. However, although E(X) = E(Y), we can see that the values determined by Y are more spread out about the mean of 0 than the values determined by X. To measure this notion of dispersion we introduce the following.

Definition 3.15  Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let X be a random variable defined on the outcomes in $\mathcal{S}$. Suppose further that E(X) is the mean, or expected value, of X. Then the variance of X, denoted $\sigma_X^2$, or Var(X), is defined by

\[
\sigma_X^2 = \operatorname{Var}(X) = E(X - E(X))^2 = \sum_{x} (x - E(X))^2 \cdot \Pr(X = x),
\]

where the sum is taken over all the values of x determined by the random variable X. The standard deviation of X, denoted $\sigma_X$, is defined by

\[
\sigma_X = \sqrt{\operatorname{Var}(X)}.
\]
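
As a quick illustration of Definition 3.15, consider again the random variables X and Y introduced just before the definition, each with mean 0 (X takes the values −1, 0, 1 with probability 1/3 each, and Y takes the values −4, −2, 0, 2, 4 with probability 1/5 each). A short worked computation gives:

\[
\begin{aligned}
\operatorname{Var}(X) &= \tfrac{1}{3}\left[(-1)^2 + 0^2 + 1^2\right] = \tfrac{2}{3}, & \sigma_X &= \sqrt{2/3} \approx 0.816497, \\
\operatorname{Var}(Y) &= \tfrac{1}{5}\left[(-4)^2 + (-2)^2 + 0^2 + 2^2 + 4^2\right] = \tfrac{40}{5} = 8, & \sigma_Y &= \sqrt{8} \approx 2.828427.
\end{aligned}
\]

The larger variance of Y agrees with the observation that its values are more spread out about the common mean of 0.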

Now let us apply Definition 3.15 in the following.

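EXAMPLE 3.56  Let X be the random variable defined on the outcomes of the sample space $\mathcal{S} = \{a, b, c, d\}$, with X(a) = 1, X(b) = 3, X(c) = 4, and X(d) = 6. Suppose the probability distribution for X is

      x      Pr(X = x)
      1        1/5
      3        2/5
      4        1/5
      6        1/5

Then

\[
E(X) = 1 \cdot \frac{1}{5} + 3 \cdot \frac{2}{5} + 4 \cdot \frac{1}{5} + 6 \cdot \frac{1}{5} = \frac{17}{5}
\]

and

\[
\begin{aligned}
\operatorname{Var}(X) &= E(X - E(X))^2 \\
&= \left(1 - \tfrac{17}{5}\right)^2\left(\tfrac{1}{5}\right) + \left(3 - \tfrac{17}{5}\right)^2\left(\tfrac{2}{5}\right) + \left(4 - \tfrac{17}{5}\right)^2\left(\tfrac{1}{5}\right) + \left(6 - \tfrac{17}{5}\right)^2\left(\tfrac{1}{5}\right) \\
&= \left(\tfrac{144}{25}\right)\left(\tfrac{1}{5}\right) + \left(\tfrac{4}{25}\right)\left(\tfrac{2}{5}\right) + \left(\tfrac{9}{25}\right)\left(\tfrac{1}{5}\right) + \left(\tfrac{169}{25}\right)\left(\tfrac{1}{5}\right) = \tfrac{330}{125} = \tfrac{66}{25},
\end{aligned}
\]

so

\[
\sigma_X = \sqrt{\operatorname{Var}(X)} = \sqrt{\tfrac{66}{25}} = \tfrac{1}{5}\sqrt{66} \approx 1.624808.
\]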
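
The arithmetic in Example 3.56 can be checked mechanically. The sketch below applies Definition 3.15 directly to a distribution stored as a dictionary from values to probabilities; the helper names `mean` and `variance` are illustrative choices of ours, not notation from the text.

```python
from fractions import Fraction
from math import sqrt

def mean(dist):
    """E(X) = sum of x * Pr(X = x) over the values x of X."""
    return sum(x * pr for x, pr in dist.items())

def variance(dist):
    """Var(X) = sum of (x - E(X))^2 * Pr(X = x), as in Definition 3.15."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * pr for x, pr in dist.items())

# Probability distribution from Example 3.56.
dist = {1: Fraction(1, 5), 3: Fraction(2, 5), 4: Fraction(1, 5), 6: Fraction(1, 5)}

mu = mean(dist)
var = variance(dist)
print(mu, var, round(sqrt(var), 6))  # 17/5 66/25 1.624808
```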

Our next result provides a second way by which we can compute Var(X).

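THEOREM 3.12  If X is a random variable defined on the outcomes of a sample space $\mathcal{S}$, then

\[
\operatorname{Var}(X) = E(X^2) - [E(X)]^2.
\]

Proof: From Definition 3.15 we know that

\[
\operatorname{Var}(X) = E(X - E(X))^2 = \sum_{x} (x - E(X))^2 \cdot \Pr(X = x).
\]

Expanding within the summation we have

\[
\begin{aligned}
\operatorname{Var}(X) &= \sum_{x} \left(x^2 - 2xE(X) + [E(X)]^2\right) \cdot \Pr(X = x) \\
&= \sum_{x} x^2 \Pr(X = x) - 2E(X) \sum_{x} x \cdot \Pr(X = x) + [E(X)]^2 \sum_{x} \Pr(X = x),
 \quad \text{because } E(X) \text{ is a constant} \\
&= E(X^2) - 2E(X)E(X) + [E(X)]^2,
 \quad \text{because } \sum_{x} x \cdot \Pr(X = x) = E(X) \text{ and } \sum_{x} \Pr(X = x) = 1 \\
&= E(X^2) - [E(X)]^2.
\end{aligned}
\]

Let us check the result for Var(X) in Example 3.56 by using Theorem 3.12.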

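EXAMPLE 3.57  The information in Example 3.56 provides the following:

      x      x²      Pr(X = x)
      1       1        1/5
      3       9        2/5
      4      16        1/5
      6      36        1/5

So $E(X^2) = \sum_{x} x^2 \cdot \Pr(X = x) = (1)\left(\frac{1}{5}\right) + (9)\left(\frac{2}{5}\right) + (16)\left(\frac{1}{5}\right) + (36)\left(\frac{1}{5}\right) = \frac{71}{5}$. Earlier in Example 3.56 we learned that E(X) = 17/5. Consequently, from Theorem 3.12 we have $\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = \frac{71}{5} - \left(\frac{17}{5}\right)^2 = \left(\frac{1}{25}\right)(355 - 289) = \frac{66}{25}$, as we found earlier.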

We'll use the formula of Theorem 3.12 a second time in the following.

EXAMPLE 3.58  In Example 3.53 we studied the random variable X, which counted the number of heads that result when a fair coin is tossed four times. Soon thereafter we learned that X was a binomial random variable with n = 4, p = 1/2, q = 1 − p = 1/2, and $\Pr(X = x) = \binom{4}{x}\left(\frac{1}{2}\right)^{x}\left(\frac{1}{2}\right)^{4-x}$, x = 0, 1, 2, 3, 4. Further, in part (d) of Example 3.55 we found that E(X) = 2 (= np, as we learned later in Theorem 3.11). To compute Var(X) we use the formula in Theorem 3.12, but first we consider the following.

      x      x²      Pr(X = x)
      0       0        1/16
      1       1        4/16
      2       4        6/16
      3       9        4/16
      4      16        1/16

Using these results we find that $E(X^2) = \sum_{x=0}^{4} x^2 \cdot \Pr(X = x) = 0 \cdot \frac{1}{16} + 1 \cdot \frac{4}{16} + 4 \cdot \frac{6}{16} + 9 \cdot \frac{4}{16} + 16 \cdot \frac{1}{16} = \frac{80}{16} = 5$. So $\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = 5 - (2)^2 = 1 = 4\left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = npq$, a result that is true in general. Further, for this random variable, the standard deviation $\sigma_X = 1$.
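
The observation that Var(X) = npq in Example 3.58 can be tested on other binomial random variables as well. The following sketch builds the binomial distribution exactly and applies the formula of Theorem 3.12; `binomial_pmf` and `variance_shortcut` are hypothetical helper names introduced only for this illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Return {x: Pr(X = x)} for a binomial random variable with parameters n, p."""
    q = 1 - p
    return {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

def variance_shortcut(dist):
    """Var(X) = E(X^2) - [E(X)]^2, the formula of Theorem 3.12."""
    ex = sum(x * pr for x, pr in dist.items())
    ex2 = sum(x * x * pr for x, pr in dist.items())
    return ex2 - ex**2

# Four tosses of a fair coin, as in Example 3.58.
n, p = 4, Fraction(1, 2)
print(variance_shortcut(binomial_pmf(n, p)), n * p * (1 - p))  # 1 1
```

Trying other values of n and p in the same way gives numerical evidence for the general statement, although of course it is not a proof.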

As we mentioned, the preceding example contains an instance of a more general result.
                             We state that result now in our next theorem and outline a proof for this theorem in the
                             Section Exercises.

THEOREM 3.13  Let X be the binomial random variable that counts the number of successes, each with probability p, among n independent Bernoulli trials. Then Var(X) = npq and $\sigma_X = \sqrt{npq}$, where q = 1 − p.

As a result of Theorems 3.11 and 3.13 we now find that our next example requires little calculation.

EXAMPLE 3.59  Due to top-notch recruiting, Coach Jenkins’ baseball team has probability 0.85 of winning each of the 12 baseball games it will play during the spring semester. (Here the outcome of each game is independent of the outcome of any previous game.)

Let X be the random variable that counts the number of games Coach Jenkins’ team wins during the spring semester. Then $\Pr(X = x) = \binom{12}{x}(0.85)^{x}(0.15)^{12-x}$, x = 0, 1, 2, ..., 12. Further, with n = 12 and p = 0.85, we readily see that

\[
E(X) = \sum_{x=0}^{12} x \binom{12}{x}(0.85)^{x}(0.15)^{12-x} = np = 12(0.85) = 10.2
\]

and

\[
\operatorname{Var}(X) = \sum_{x=0}^{12} (x - 10.2)^2 \binom{12}{x}(0.85)^{x}(0.15)^{12-x}
= \sum_{x=0}^{12} x^2 \binom{12}{x}(0.85)^{x}(0.15)^{12-x} - (10.2)^2 = npq = (12)(0.85)(0.15) = 1.53.
\]

A word of warning! The preceding example shows how easy it is to compute E(X)
                  and Var(X) for a binomial random variable X, once we know the values of n and p. But
                  remember, the formulas in Theorems 3.11 and 3.13 are valid only when the random variable
                  X is binomial.
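
To see why the warning matters, here is a small sketch built on a made-up situation that does not appear in the text: an urn holds 10 chips, 4 of them marked, and 3 chips are drawn without replacement. The draws are not independent, so the count X of marked chips is not binomial. All names and numbers in the code are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

# Hypothetical setup (not from the text): N chips, K of them marked, n drawn
# WITHOUT replacement; X counts the marked chips among those drawn.
N, K, n = 10, 4, 3

# Exact distribution of X, found by counting the equally likely selections.
dist = {x: Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n)) for x in range(n + 1)}

mean = sum(x * pr for x, pr in dist.items())
var = sum(x * x * pr for x, pr in dist.items()) - mean**2  # Theorem 3.12

p = Fraction(K, N)           # chance that any single draw is a marked chip
print(mean, n * p)           # 6/5 6/5
print(var, n * p * (1 - p))  # 14/25 18/25
```

Here the mean happens to agree with np, but the true variance 14/25 differs from npq = 18/25, so the formula of Theorem 3.13 cannot be applied to this random variable.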

Before we introduce the last idea for this section we shall consider an example in order
                  to motivate and illustrate the idea.

EXAMPLE 3.60  Referring back to Example 3.59, at this point we want to determine a lower bound for the probability that the random variable X is within k standard deviations $\sigma_X$ of the mean E(X), for k = 2, 3. When k = 2 we find that $\Pr(E(X) - 2\sigma_X \le X \le E(X) + 2\sigma_X) = \Pr(|X - E(X)| \le 2\sigma_X)$. From the calculations in Example 3.59 we know that E(X) = 10.2 and Var(X) = 1.53, so $\sigma_X = \sqrt{1.53} \approx 1.236932$. Consequently,

\[
\begin{aligned}
\Pr(|X - E(X)| \le 2\sigma_X) &= \Pr(10.2 - 2(1.236932) \le X \le 10.2 + 2(1.236932)) = \Pr(7.726136 \le X \le 12.673864) \\
&= \Pr(X = 8) + \Pr(X = 9) + \cdots + \Pr(X = 12) = \sum_{x=8}^{12} \binom{12}{x}(0.85)^{x}(0.15)^{12-x} \\
&= 0.068284 + 0.171976 + 0.292358 + 0.301218 + 0.142242 = 0.976078.
\end{aligned}
\]

Likewise, for k = 3, $\Pr(|X - E(X)| \le 3\sigma_X) = \Pr(6.489204 \le X \le 13.910796) = \Pr(X = 7) + \Pr(X = 8) + \cdots + \Pr(X = 12) = 0.019280 + 0.068284 + \cdots + 0.142242 = 0.995358$.

But where is this lower bound that we mentioned at the start of our discussion? Looking at the results for k = 2, 3 once more, we see that $\Pr(|X - E(X)| \le 2\sigma_X) = 0.976078 > \frac{3}{4} = 1 - \frac{1}{4}$ and $\Pr(|X - E(X)| \le 3\sigma_X) = 0.995358 > \frac{8}{9} = 1 - \frac{1}{9}$. So

\[
\Pr(|X - E(X)| \le k\sigma_X) \ge 1 - \frac{1}{k^2}, \quad \text{for } k = 2, 3.
\]

Further, although this lower bound is on the crude side, our next result will show that it is
                  true for any positive real number k. In addition, the result is true for any random variable
                  X, not just a binomial random variable like the one we have used here.
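
The comparison made in Example 3.60 is easy to reproduce numerically. The sketch below uses ordinary floating-point arithmetic; the function names `pmf` and `prob_within` are our own and are introduced only for this illustration.

```python
from math import comb, sqrt

# Binomial setting of Examples 3.59 and 3.60: 12 games, win probability 0.85.
n, p = 12, 0.85
q = 1 - p
mean = n * p               # E(X) = np
sigma = sqrt(n * p * q)    # standard deviation, sqrt(npq)

def pmf(x):
    """Pr(X = x) for the binomial random variable X."""
    return comb(n, x) * p**x * q**(n - x)

def prob_within(k):
    """Probability that X lies within k standard deviations of its mean."""
    return sum(pmf(x) for x in range(n + 1) if abs(x - mean) <= k * sigma)

for k in (2, 3):
    # exact probability, followed by the lower bound 1 - 1/k^2
    print(k, round(prob_within(k), 6), 1 - 1 / k**2)
```

Up to rounding, the computed probabilities should match the values 0.976078 and 0.995358 found in the example, and both comfortably exceed the bounds 3/4 and 8/9.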

THEOREM 3.14  Chebyshev’s Inequality. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let X be a random variable defined on the outcomes in $\mathcal{S}$. If E(X) is the mean of X and $\sigma_X$ its standard deviation, then for any k > 0,

\[
\Pr(E(X) - k\sigma_X \le X \le E(X) + k\sigma_X) = \Pr(|X - E(X)| \le k\sigma_X) \ge 1 - \frac{1}{k^2}.
\]

[Here, as in Example 3.60, X accounts for those x values where x = X(s) for some $s \in \mathcal{S}$ and $|x - E(X)| \le k\sigma_X$.]

Proof: The proof presented here is for X discrete.† However, the result is also true for continuous random variables.

†The proof presented here is valid for the case where the sample space is countably infinite, so long as all the summations converge.

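Let A, B be the following subsets of $\mathbf{R}$.

\[
A = \{x : |x - E(X)| > k\sigma_X\} \qquad B = \{x : |x - E(X)| \le k\sigma_X\}
\]

(Note that A, B are not necessarily events for they need not be subsets of $\mathcal{S}$. They are subsets of the set of real numbers determined by the random variable X.)

We know that

\[
\begin{aligned}
\operatorname{Var}(X) = \sigma_X^2 &= \sum_{x} (x - E(X))^2 \Pr(X = x) \\
&= \sum_{x \in A} (x - E(X))^2 \Pr(X = x) + \sum_{x \in B} (x - E(X))^2 \Pr(X = x) \\
&\ge \sum_{x \in A} (x - E(X))^2 \Pr(X = x), \quad \text{as } \sum_{x \in B} (x - E(X))^2 \Pr(X = x) \ge 0.
\end{aligned}
\]

For $x \in A$, $|x - E(X)| > k\sigma_X$ and so it follows that here $|x - E(X)|^2 > k^2\sigma_X^2$. Since $(x - E(X))^2 = |x - E(X)|^2$ we now have

\[
\sigma_X^2 \ge \sum_{x \in A} |x - E(X)|^2 \Pr(X = x) \ge k^2\sigma_X^2 \sum_{x \in A} \Pr(X = x),
\]

and

\[
\begin{aligned}
\sigma_X^2 \ge k^2\sigma_X^2 \sum_{x \in A} \Pr(X = x)
&\;\Rightarrow\; \sigma_X^2 \ge k^2\sigma_X^2 \Pr(|X - E(X)| > k\sigma_X) \\
&\;\Rightarrow\; \frac{1}{k^2} \ge \Pr(|X - E(X)| > k\sigma_X)
 \;\Rightarrow\; -\frac{1}{k^2} \le -\Pr(|X - E(X)| > k\sigma_X) \\
&\;\Rightarrow\; 1 - \frac{1}{k^2} \le 1 - \Pr(|X - E(X)| > k\sigma_X) \\
&\;\Rightarrow\; 1 - \frac{1}{k^2} \le \Pr(|X - E(X)| \le k\sigma_X).
\end{aligned}
\]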

Our last example for this section shows how one might apply Chebyshev’s Inequality.

EXAMPLE 3.61  Angelica is selling boxes of candy for her choir’s Christmas fund raiser. The pieces of candy are packed into each box so that the mean number of pieces is 125 with a standard deviation of 5 pieces. To find a lower bound on the probability that a box of Angelica’s candy contains between 118 and 132 pieces we proceed as follows.

Here the random variable X counts the number of pieces of candy in a box, with E(X) = 125 and $\sigma_X = 5$. Applying Chebyshev’s Inequality we have

\[
\begin{aligned}
\Pr(118 \le X \le 132) &= \Pr(118 - 125 \le X - 125 \le 132 - 125) \\
&= \Pr(-7 \le X - 125 \le 7) = \Pr(|X - 125| \le 7) \\
&= \Pr\left(|X - E(X)| \le \left(\tfrac{7}{5}\right)\sigma_X\right) \ge 1 - \frac{1}{(7/5)^2} = 1 - \frac{25}{49} = \frac{24}{49}.
\end{aligned}
\]

Consequently, the probability that a box of Angelica’s candy contains between 118 and 132 pieces is at least 24/49 ≈ 0.489796. (Note here that the value of k in Chebyshev’s Inequality is 7/5, which is not an integer.)
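
The steps of Example 3.61 can be packaged into a small routine. This is only a sketch under the assumption that the interval is symmetric about the mean, which is the situation Chebyshev’s Inequality addresses directly; the function name `chebyshev_lower_bound` is hypothetical.

```python
from fractions import Fraction

def chebyshev_lower_bound(mean, sigma, lo, hi):
    """Chebyshev lower bound 1 - 1/k^2 for Pr(lo <= X <= hi).

    Assumes the interval [lo, hi] is symmetric about the mean and wider than
    a single point; k is the half-width measured in standard deviations.
    """
    if hi - mean != mean - lo or hi <= mean:
        raise ValueError("interval must be symmetric about the mean")
    k = Fraction(hi - mean) / Fraction(sigma)
    # For k <= 1 the bound 1 - 1/k^2 is not positive and says nothing useful.
    return max(Fraction(0), 1 - 1 / k**2)

# Angelica's candy boxes: mean 125 pieces, standard deviation 5 pieces.
print(chebyshev_lower_bound(125, 5, 118, 132))  # 24/49
```

Working with `Fraction` keeps the value k = 7/5 exact, which mirrors the remark that k need not be an integer.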
EXERCISES 3.7

1. Let X be a random variable with the following probability distribution.

      x            0      1      2      3      4
      Pr(X = x)    1/8    1/4    1/4    1/4    1/8

Determine (a) Pr(X = 3); (b) Pr(X < 4); (c) Pr(X > 0); (d) Pr(1 < X < 3); (e) Pr(X = 2 | X < 3); and (f) Pr(X < 1 or X = 4).

2. The probability distribution for a random variable X is given by Pr(X = x) = (3x + 1)/22, x = 0, 1, 2, 3. Determine (a) Pr(X = 3); (b) Pr(X < 1); (c) Pr(1 < X < 3); (d) Pr(X > −2); and (e) Pr(X = 1 | X < 2).

3. A shipment of 120 graphics cards contains 10 that are defective. Serena selects five of these cards, without replacement, and inspects them to see which, if any, are defective. If the random variable X counts the number of defective graphics cards in Serena’s selection, determine (a) Pr(X = x), x = 0, 1, 2, ..., 5; (b) Pr(X = 4); (c) Pr(X > 4); and (d) Pr(X = 1 | X < 2).

4. Connie tosses a fair coin three times. If $X = X_1 - X_2$, where $X_1$ counts the number of heads that result and $X_2$ counts the number of tails that result, determine (a) the probability distributions for $X_1$, $X_2$, and X; and (b) the means $E(X_1)$, $E(X_2)$, and E(X).

5. Let X be the random variable where Pr(X = x) = 1/6 for x = 1, 2, 3, ..., 6. (Here X is a uniform discrete random variable.) Determine (a) Pr(X > 3); (b) Pr(2 < X < 5); (c) Pr(X = 4 | X > 3); (d) E(X); and (e) Var(X).

6. A computer dealer finds that the number of laptop computers her dealership sells each day is a random variable X where the probability distribution for X is given by

\[
\Pr(X = x) = \begin{cases} \dfrac{cx^2}{x!}, & x = 1, 2, 3, 4, 5 \\[1ex] 0, & \text{otherwise,} \end{cases}
\]

where c is a constant. Determine (a) the value of c; (b) Pr(X > 3); (c) Pr(X = 4 | X > 3); (d) E(X); and (e) Var(X).

7. A random variable X has probability distribution given by

\[
\Pr(X = x) = \begin{cases} c(6 - x), & x = 1, 2, 3, 4, 5 \\ 0, & \text{otherwise,} \end{cases}
\]

where c is a constant. Determine (a) the value of c; (b) Pr(X < 2); (c) E(X); and (d) Var(X).

8. Wayne tosses an unfair coin, one that is biased so that a head is three times as likely to occur as a tail. How many heads should Wayne expect to see if he tosses the coin 100 times?

9. Suppose that X is a binomial random variable where $\Pr(X = x) = \binom{n}{x} p^{x}(1 - p)^{n-x}$, x = 0, 1, 2, ..., n. If E(X) = 70 and Var(X) = 45.5, determine n, p.

10. A carnival game invites a player to select one card from a standard deck of 52 cards. If the card is a seven or a jack the player is given five dollars. For a king or an ace the player is given eight dollars. The other 36 cards result in the player losing. How much should one be willing to pay to play this game so that it is fair, that is, so that the expected value of the player’s net winnings is 0?

11. The route that Jackie follows to school each day includes eight stoplights. When she reaches each stoplight, the probability that the stoplight is red is 0.25 and it is assumed that the stoplights are spaced far enough apart so as to operate independently. If the random variable X counts the number of red stoplights Jackie encounters one particular day on her ride to school, determine (a) Pr(X = 0); (b) Pr(X = 3); (c) Pr(X > 6); (d) Pr(X > 6 | X > 4); (e) E(X); and (f) Var(X).

12. Suppose that a random variable X has mean E(X) = 17 and variance Var(X) = 9, but its probability distribution is unknown. Use Chebyshev’s Inequality to estimate a lower bound for (a) $\Pr(11 \le X \le 23)$; (b) $\Pr(10 \le X \le 24)$; and (c) $\Pr(8 \le X \le 26)$.

13. Suppose that a random variable X has mean E(X) = 15 and variance Var(X) = 4, but its probability distribution is unknown. Use Chebyshev’s Inequality to find the value of the constant c where $\Pr(|X - 15| \le c) \ge 0.96$.

14. Fred rolls a fair die 20 times. If X is the random variable that counts the number of 6’s that come up during the 20 rolls, determine E(X) and Var(X).

15. A carton contains 20 computer chips, four of which are defective. Isaac tests these chips, one at a time and without replacement, until he either finds a defective chip or has tested three chips. If the random variable X counts the number of chips Isaac tests, find (a) the probability distribution for X; (b) Pr(X < 2); (c) Pr(X = 1 | X < 2); (d) E(X); and (e) Var(X).

16. Suppose that X is a random variable defined on a sample space $\mathcal{S}$ and that a, b are constants. Show that (a) E(aX + b) = aE(X) + b and (b) Var(aX + b) = a²Var(X).

17. Let X be a binomial random variable with $\Pr(X = x) = \binom{n}{x} p^{x} q^{n-x}$, x = 0, 1, 2, ..., n, where n (> 2) is the number of Bernoulli trials, p is the probability of success for each trial, and q = 1 − p.
    a) Show that $E(X(X - 1)) = n^2p^2 - np^2$.
    b) Using the fact that $E(X(X - 1)) = E(X^2 - X) = E(X^2) - E(X)$ and that E(X) = np, show that Var(X) = npq.

18. In alpha testing a new software package, a software engineer finds that the number of defects per 100 lines of code is a random variable X with probability distribution:

      x            1      2      3      4
      Pr(X = x)    0.4    0.3    0.2    0.1

Find (a) Pr(X > 1); (b) Pr(X = 3 | X > 2); (c) E(X); and (d) Var(X).

19. In Mario Puzo’s novel The Godfather, at the wedding reception for his daughter Constanzia, Don Vito Corleone discusses with his godson Johnny Fontane how he will deal with the movie mogul Jack Woltz. And in this context he speaks the famous line

       “I’ll make him an offer he can’t refuse.”

If we let the random variable X count the number of letters and apostrophes in a randomly selected word (from the above quotation) and we assume that each of the eight words has the same probability of being selected, determine (a) the probability distribution for X; (b) E(X); and (c) Var(X).

20. An assembly comprises three electrical components that operate independently. The probabilities that these components function according to specifications are 0.95, 0.9, and 0.88. If the random variable X counts the number of components that function according to specifications, determine (a) the probability distribution for X; (b) Pr(X > 2 | X > 1); (c) E(X); and (d) Var(X).

21. An urn contains five chips numbered 1, 2, 3, 4, and 5. When two chips are drawn (without replacement) from the urn, the random variable X records the higher value. Find E(X) and $\sigma_X$.

3.8 Summary and Historical Review
                               In this chapter we introduced some of the fundamentals of set theory, together with certain
                               relationships to enumeration problems and probability theory.
                                  The algebra of set theory evolved during the nineteenth and early twentieth centuries.
                               In England, George Peacock (1791—1858) was a pioneer in mathematical reforms and was
                               among the first, in his Treatise on Algebra, to revolutionize the entire conception of algebra
                               and arithmetic. His ideas were further developed by Duncan Gregory (1813-1844), William
                               Rowan Hamilton (1805-1865), and Augustus DeMorgan (1806-1871), who attempted to
                               remove ambiguity from elementary algebra and cast it in the strict postulational form.
                               Not until 1854, however, when Boole published his Investigation of the Laws of Thought,
                               was an algebra dealing with sets and logic formalized and the work of Peacock and his
                               contemporaries extended.
                                   The presentation here is primarily concerned with finite sets. However, the investigation
                               of infinite sets and their cardinalities has occupied the minds of many mathematicians and
                               philosophers. (More about this can be found in Appendix 3. However, the reader may
                               want to learn more about functions —as presented in Chapter 5 — before looking into the
                               material in this appendix.) The intuitive approach to set theory was taken until the time of
                               the Russian-born mathematician Georg Cantor (1845-1918), who defined a set, in 1895,
                               in a way comparable to the “gut feeling” we mentioned at the start of Section 3.1. His
                               definition, however, was one of the obstacles he was never able to entirely remove from his
                               theory of sets.
                                    In the 1870s, when Cantor was researching trigonometric series and series of real num-
                               bers, he needed a device to compare the sizes of infinite sets of numbers. His treatment of
                               the infinite as an actuality, on the same level as the finite, was quite revolutionary. Some of
                               his work was rejected because it proved to be much more abstract than what many mathe-
                               maticians of his time were accustomed to. However, his work won wide enough acceptance
                               so that by 1890 the theory of sets, both finite and infinite, was considered a branch of
                               mathematics in its own right.
                                  By the turn of the century the theory was widely accepted, but in 1901 the paradox
                               now known as Russell’s paradox (which was discussed in Exercise 27 of Section 3.1)
                               showed that set theory, as originally proposed, was internally inconsistent. The difficulty
                               seemed to be in the unrestricted way in which sets could be defined; the idea of a set’s being a
                               member of itself was considered particularly suspect. In their work Principia Mathematica,
                               the British mathematicians Lord Bertrand Arthur William Russell (1872-1970) and Alfred
                               North Whitehead (1861-1947) developed a hierarchy in the theory of sets known as the
                               theory of types. This axiomatic set theory, among other twentieth-century formulations,
                               avoided the Russell paradox. In addition to his work in mathematics, Lord Russell wrote
                               books dealing with philosophy, physics, and his political views. His remarkable literary
                               talent was recognized in 1950 when he was awarded the Nobel prize for literature.

Georg Cantor (1845-1918)
                         Reproduced courtesy of The Granger Collection, New York

Lord Bertrand Arthur William Russell (1872-1970)

The discovery of Russell’s paradox, even though it could be remedied, had a profound
impact on the mathematical community, for many began to wonder if other contradictions
were still lurking. Then in 1931 the Austrian-born mathematician (and logician)
Kurt Gödel (1906-1978) formulated that “under a specified consistency condition, any
sufficiently strong formal axiomatic system must contain a proposition such that neither it
nor its negation is provable and that any consistency proof for the system must use ideas and
methods beyond those of the system itself.” And unfortunately, from this we learn that we
cannot establish, in a mathematically rigorous manner, that there are no contradictions
in mathematics. Yet despite “Gödel’s proof,” mathematical research continues on, in fact
to the point where the amount of research since 1931 has surpassed that in any other period
in history.
                     The use of the set membership symbol ∈ (a stylized form of the Greek letter epsilon)
                 was introduced in 1889 by the Italian mathematician Giuseppe Peano (1858-1932). The
                 symbol “ε” is an abbreviation for the Greek word “ἐστί” meaning “is.”
                     The Venn diagrams of Section 3.2 were introduced by the English logician John Venn
                 (1834-1923) in 1881. In his book Symbolic Logic, Venn clarified ideas previously devel-
                 oped by his countryman George Boole (1815-1864). Furthermore, Venn contributed to the
                 development of probability theory — as described in the widely read textbook he wrote on
                 this subject. The Gray code, which we used in Section 3.1 to store the subsets of a finite set
                 as binary strings, was developed in the 1940s by Frank Gray at the AT&T Bell Laborato-
                 ries. Originally, such codes were used to minimize the effect of errors in the transmission
                 of digital signals.
                     If we wish to summarize the importance of the role of set theory in the development of
                 twentieth-century mathematics, the following quote attributed to the German mathematician
                 David Hilbert (1862-1943) is worth pondering: “No one shall expel us from the paradise
                 which Cantor has created for us.”
                      In Section 3.1 we mentioned the array of numbers known as Pascal’s triangle. We could
                 have introduced this array in Chapter 1 with the binomial theorem, but we waited until we
                 had some combinatorial identities that we needed to verify how the triangle is constructed.
                 The array appears in the work of the Chinese algebraist Chu Shi-kie (1303), but its first
                 appearance in Europe was not until the sixteenth century, on the title page of a book by
                 Petrus Apianus (1495-1552). Niccolo Tartaglia (1499-1559) used the triangle in computing
                 powers of (x + y). Because of his work on the properties and applications of this triangle,
                 the array has been named in honor of the French mathematician Blaise Pascal (1623-1662).
                     Although probability theory originated with games of chance and enumeration problems,
                 we included it here because set theory has evolved as the exact medium needed to state
                 and solve problems in this important contemporary area of applied mathematics. In the
                 decade following 1660, probability entered European thought as a way of understanding
                 stable frequencies in random processes. Ideas, which exemplify this consideration, were put
                 forth by Blaise Pascal, and these led to the first systematic treatise on probability, written in
                 1657 by Christian Huygens (1629-1695). In 1812 Pierre-Simon de Laplace (1749-1827)
                 collected all the ideas developed on probability theory at that time, starting with the
                 definition in which each individual outcome is equally likely, and published them in his
                 Analytic Theory of Probability. Among other ideas, this text includes the Central Limit
                 Theorem, a fundamental result at the heart of hypothesis testing (in statistics). Along
                 with Pierre-Simon de Laplace, Thomas Bayes (1702-1761) also showed how to determine
                 probabilities by examining certain empirical data. Bayes’ Theorem honors the name of this
                 English Presbyterian minister and mathematician. Chebyshev’s Inequality (of Section 3.7)
                 is named for the Russian mathematician Pafnuty Lvovich Chebyshev (1821-1894), who
                 may be better remembered for his work in number theory and interest in mechanics. Finally,
                 the axiomatic approach to probability was first given in 1933 by the Russian mathemati-
                 cian Andrei Nikolayevich Kolmogorov (1903-1987) in his monograph Grundbegriffe der
                 Wahrscheinlichkeitsrechnung (Foundations of the Theory of Probability).

                     More on the history and development of set theory can be found in Chapter 26 of
                 C. B. Boyer [1]. Formal developments of set theory, including results on infinite sets, can
                 be found in H. B. Enderton [3], P. R. Halmos [4], J. M. Henle [5], and P. C. Suppes [8]. An
                 interesting history of the origins of probability and statistical ideas, up to the Newtonian
                 era, can be found in F. N. David [2]. A more contemporary coverage is given in the text
                 by V. J. Katz [6]. Chapters 1 and 2 of J. J. Kinney [7] are an excellent source for those
                 interested in learning more about discrete probability.

Andrei Nikolayevich Kolmogorov (1903-1987)                             Thomas Bayes (1702-1761)

REFERENCES
1. Boyer, Carl B. History of Mathematics. New York: Wiley, 1968.
2. David, Florence Nightingale. Games, Gods, and Gambling. New York: Hafner, 1962.
3. Enderton, Herbert B. Elements of Set Theory. New York: Academic Press, 1977.
4. Halmos, Paul R. Naive Set Theory. New York: Van Nostrand, 1960.
5. Henle, James M. An Outline of Set Theory. New York: Springer-Verlag, 1986.
6. Katz, Victor J. A History of Mathematics (An Introduction). New York: Harper Collins, 1993.
7. Kinney, John J. Probability: An Introduction with Statistical Applications. New York: Wiley, 1997.
8. Suppes, Patrick C. Axiomatic Set Theory. New York: Van Nostrand, 1960.

SUPPLEMENTARY EXERCISES

1. Let A, B, C ⊆ 𝒰. Prove that (A − B) ⊆ C if and only if (A − C) ⊆ B.

2. Give a combinatorial argument to show that for integers [the conditions and the identity are illegible in the source]

3. Let A, B, C ⊆ 𝒰. Prove or disprove (with a counterexample) each of the following:
   a) A − C = B − C ⇒ A = B
   b) [(A ∩ C = B ∩ C) ∧ (A − C = B − C)] ⇒ A = B
   c) [(A ∪ C = B ∪ C) ∧ (A − C = B − C)] ⇒ A = B

4. a) For positive integers m, n, r, with r < min{m, n}, show that

      $\binom{m+n}{r} = \sum_{k=0}^{r} \binom{m}{k}\binom{n}{r-k}.$

   b) For n a positive integer, show that

      $\binom{2n}{n} = \sum_{k=0}^{n} \binom{n}{k}^{2}.$

5. a) In how many ways can a teacher divide a group of seven students into two teams each containing at least one student? two students?
   b) Answer part (a) upon replacing seven with a positive integer n > 4.

6. Determine whether each of the following statements is true or false. For each false statement, give a counterexample.
   a) If A and B are infinite sets, then A ∩ B is infinite.
   b) If B is infinite and A ⊆ B, then A is infinite.
   c) If A ⊆ B with B finite, then A is finite.
   d) If A ⊆ B with A finite, then B is finite.

7. A set A has 128 subsets of even cardinality. (a) How many subsets of A have odd cardinality? (b) What is |A|?

8. Let A = {1, 2, 3, ..., 15}.
   a) How many subsets of A contain all of the odd integers in A?
   b) How many subsets of A contain exactly three odd integers?
   c) How many eight-element subsets of A contain exactly three odd integers?
   d) Write a computer program (or develop an algorithm) to generate a random eight-element subset of A and have it print out how many of the eight elements are odd.

9. Let A, B, C ⊆ 𝒰. Prove that (A ∩ B) ∪ C = A ∩ (B ∪ C) if and only if C ⊆ A.

10. Let 𝒰 be a given universe with A, B ⊆ 𝒰, |A ∩ B| = 3, |A ∪ B| = 8, and |𝒰| = 12.
   a) How many subsets C ⊆ 𝒰 satisfy A ∩ B ⊆ C ⊆ A ∪ B? How many of these subsets C contain an even number of elements?
   b) How many subsets D ⊆ 𝒰 satisfy $\overline{A \cup B} \subseteq D \subseteq \overline{A} \cup \overline{B}$? How many of these subsets D contain an even number of elements?

11. Let 𝒰 = R and let the index set I = Q⁺. For each q ∈ Q⁺, let A_q = [0, 2q] and B_q = (0, 3q). Determine
   a) A_[?]                b) A_[?] Δ B_[?]        (subscripts in parts (a) and (b) are illegible in the source)
   c) $\bigcup_{q \in I} A_q$       d) $\bigcap_{q \in I} B_q$

12. For a universe 𝒰 and sets A, B ⊆ 𝒰, prove that
   a) A Δ B = B Δ A
   b) $A \,\Delta\, \overline{A} = \mathcal{U}$
   c) $A \,\Delta\, \mathcal{U} = \overline{A}$
   d) A Δ ∅ = A, so ∅ is the identity for Δ, as well as for ∪

13. Consider the membership table (Table 3.7). If we are given the condition that A ⊆ B, then we need consider only those rows of the table for which this is true: rows 1, 2, and 4, as indicated by the arrows. For these rows, the columns for B and A ∪ B are exactly the same, so this membership table shows that A ⊆ B ⇒ A ∪ B = B.

   Table 3.7

          A | B | A ∪ B
      →   0 | 0 |   0
      →   0 | 1 |   1
          1 | 0 |   1
      →   1 | 1 |   1

   Use membership tables to verify each of the following:
   a) A ⊆ B ⇒ A ∩ B = A
   b) [(A ∩ B = A) ∧ (B ∪ C = C)] ⇒ A ∪ B ∪ C = C
   c) $C \subseteq B \subseteq A \Rightarrow (A \cap \overline{B}) \cup (B \cap \overline{C}) = A \cap \overline{C}$
   d) A Δ B = C ⇒ A Δ C = B and B Δ C = A

14. State the dual of each theorem in Exercise 13. (Here you will want to use the result of Example 3.19 in conjunction with Theorem 3.5.)

15. a) Determine the number of linear arrangements of m 1’s and r 0’s with no adjacent 1’s. (State any needed condition(s) for m, r.)
   b) If 𝒰 = {1, 2, 3, ..., n}, how many sets A ⊆ 𝒰 are such that |A| = k with A containing no consecutive integers? [State any needed condition(s) for n, k.]

16. If the letters in the word BOOLEAN are arranged at random, what is the probability that the two O’s remain together in the arrangement?

17. At a high school science fair, 34 students received awards for scientific projects. Fourteen awards were given for projects in biology, 13 in chemistry, and 21 in physics. If three students received awards in all three subject areas, how many received awards for exactly (a) one subject area? (b) two subject areas?

18. Fifty students, each with 75¢, visited the arcade of Example 3.27. Seventeen of the students played each of the three computer games, and 37 of them played at least two of them. No student played any other game at the arcade, nor did any student play a given game more than once. Each game costs 25¢ to play, and the total proceeds from the student visit were $24.25. How many of these students preferred to watch and played none of the games?

19. In how many ways can 15 laboratory assistants be assigned to work on one, two, or three different experiments so that each experiment has at least one person spending some time on it?

20. Professor Diane gave her chemistry class a test consisting of three questions. There are 21 students in her class, and every student answered at least one question. Five students did not answer the first question, seven failed to answer the second question, and six did not answer the third question. If nine students answered all three questions, how many answered exactly one question?

21. Let 𝒰 be a given universe with A, B ⊆ 𝒰, A ∩ B = ∅, |A| = 12, and |B| = 10. If seven elements are selected from A ∪ B, what is the probability the selection contains four elements from A and three from B?

22. For a finite set A of integers, let σ(A) denote the sum of the elements of A. Then if 𝒰 is a finite universe taken from Z⁺, $\sum_{A \in \mathcal{P}(\mathcal{U})} \sigma(A)$ denotes the sum of all elements of all subsets of 𝒰. Determine $\sum_{A \in \mathcal{P}(\mathcal{U})} \sigma(A)$ for
   a) 𝒰 = {1, 2, 3}
   b) 𝒰 = {1, 2, 3, 4}
   c) 𝒰 = {1, 2, 3, 4, 5}
   d) 𝒰 = {1, 2, 3, ..., n}
   e) 𝒰 = {a₁, a₂, a₃, ..., aₙ}, where s = a₁ + a₂ + a₃ + ··· + aₙ

23. a) In chess, the king can move one position in any direction. Assuming that the king is moved only in a forward manner (one position up, to the right, or diagonally northeast), along how many different paths can a king be moved from the lower-left corner position to the upper-right corner position on the standard 8 × 8 chessboard?
   b) For the paths in part (a), what is the probability that a path contains (i) exactly two diagonal moves? (ii) exactly two diagonal moves that are consecutive? (iii) an even number of diagonal moves?

24. Let A, B ⊆ R, where A = {x | x² − 7x = −12} and B = {x | x² − x = 6}. Determine A ∪ B and A ∩ B.

25. Let A, B ⊆ R, where A = {x | x² − 7x < −12} and B = {x | x² − x < 6}. Determine A ∪ B and A ∩ B.

26. Four torpedoes, whose probabilities of destroying an enemy ship are 0.75, 0.80, 0.85, and 0.90, are fired at such a vessel. Assuming the torpedoes operate independently, what is the probability the enemy ship is destroyed?

27. Travis tosses a fair coin twice. Then he tosses a biased coin, one where the probability of a head is 3/4, four times. What is the probability Travis’s six tosses result in five heads and one tail?

28. Let 𝒮 be the sample space for an experiment ℰ, with events A, B ⊆ 𝒮. Prove that

      $\Pr(A \mid B) \ge \dfrac{\Pr(A) + \Pr(B) - 1}{\Pr(B)}.$

29. Let A, B, C be independent events taken from a sample space 𝒮. Prove that the events A and B ∪ C are independent.

30. What is the minimum number of times we must toss a fair coin so that the probability that we get at least two heads is at least 0.95?

31. A large jet aircraft has two wheels per landing gear for added safety. The tires are rated so that even with a “hard landing” the probability of any single tire blowing out is only 0.10. (a) What is the probability that a landing gear (with two tires) will survive even a hard landing with at least one good tire? (b) In order for the plane to land safely, all three landing gears (the nose and both wing landing gears) must have at least one good tire. What is the probability that the jet will be able to land safely even on a hard landing?

32. Let 𝒮 be the sample space for an experiment ℰ and let A, B be events (that is, A, B ⊆ 𝒮). Prove that Pr(A ∩ B) ≥ Pr(A) + Pr(B) − 1. (This result is known as Bonferroni’s Inequality.)

33. The exit door at the end of a hallway is open half of the time. On a table by the entrance to this hallway is a box containing 10 keys, but only one of these keys opens the exit door at the end of the hallway. Upon entering the hallway Marlo selects two of the keys from the box. What is the probability she will be able to leave the hallway via the exit door, without returning to the box for more keys?

34. Dustin tosses a fair coin eight times. Given that his first and last outcomes are the same, what is the probability he tossed five heads and three tails?

35. The probability Coach Sears’ basketball team wins any given game is 0.8, regardless of any prior win or loss. If her team plays five games, what is the probability it wins more games than it loses?

36. Suppose that the number of boxes of cereal packaged each day at a certain packaging plant is a random variable (call it X) with E(X) = 20,000 boxes and Var(X) = 40,000 boxes². Use Chebyshev’s Inequality to find a lower bound on the probability that the plant will package between 19,000 and 21,000 boxes of cereal on a particular day.

37. Find the probability of getting one head (exactly) two times when three fair coins are tossed four times.

38. Devon has a bag containing 22 poker chips: eight red, eight white, and six blue. Aileen reaches in and withdraws three of the chips, without replacement. Find the probability that Aileen has selected (a) no blue chips; (b) one chip of each color; or (c) at least two red chips.

39. Let X be a random variable with probability distribution

      $\Pr(X = x) = \begin{cases} c(x^{2} + 4), & x = 0, 1, 2, 3, 4 \\ 0, & \text{otherwise,} \end{cases}$

   where c is a constant. Determine (a) the value of c; (b) Pr(X > 1); (c) Pr(X = 3 | X > 2); (d) E(X); and (e) Var(X).

40. A dozen urns each contain four red marbles and seven green ones. (All 132 marbles are of the same size.) If a dozen students each select a different urn and then draw (with replacement) five marbles, what is the probability that at least one student draws at least one red marble?

41. Maureen draws five cards from a standard deck: the 6 of diamonds, 7 of diamonds, 8 of diamonds, jack of hearts, and king of spades. She discards the jack and king and then draws two cards from the remaining 47. What is the probability Maureen
finishes with (a) a straight flush; (b) a flush (but not a straight flush); and (c) a straight (but not a straight flush)?

42. In the game of pinochle the deck consists of 48 cards: two each of the 9, 10, jack, queen, king, and ace for each of the four suits. There are four players and each is dealt 12 cards. What is the probability a given player is dealt four kings (one of each suit), four queens (one of each suit), and four other cards none of which is a king or queen? (Such a hand is referred to as a bare roundhouse.)

43. A grab bag contains one chip with the number 1, two chips each with the number 2, three chips each with the number 3, ..., and n chips each with the number n, where n ∈ Z⁺. All chips are of the same size, those numbered 1 to m are red, and those numbered m + 1 to n are blue, where m ∈ Z⁺ and m < n. If Casey draws one chip, what is the probability it is the chip with 1 on it, given that the chip is red?

44. A fair die is rolled three times and the random variable X records the number of different outcomes that result. For example, if two 5’s and one 4 are rolled, then X records two different outcomes. Determine (a) the probability distribution for X; (b) E(X); and (c) Var(X).

45. When a coin is tossed three times, for the outcome HHT we say that two runs have occurred, namely HH and T. Likewise, for the outcome THT we find three runs: T, H, and T. (The notion of a run was first introduced in Example 1.41.) Now suppose a biased coin, with Pr(H) = 3/4, is tossed three times and the random variable X counts the number of runs that result. Determine (a) the probability distribution for X; (b) E(X); and (c) σ_X.

4 Properties of the Integers: Mathematical Induction

              Having known about the integers since our first encounters with arithmetic, in this chapter
                   we examine a special property exhibited by the subset of positive integers. This property
              will enable us to establish certain mathematical formulas and theorems by using a technique
              called mathematical induction. This method of proof will play a key role in many of the
              results we shall obtain in the later chapters of this text. Furthermore, this chapter will provide
              us with an introduction to five sets of numbers that are very important in the study of discrete
              mathematics and combinatorics — namely, the triangular numbers, the harmonic numbers,
              the Fibonacci numbers, the Lucas numbers, and the Eulerian numbers.
                  When x, y ∈ Z, we know that x + y, xy, x − y ∈ Z. Thus we say that the set Z is
              closed under (the binary operations of) addition, multiplication, and subtraction. Turning
              to division, however, we find, for example, that 2, 3 ∈ Z but that the rational number 2/3 is
              not a member of Z. So the set Z of all integers is not closed under the binary operation
              of nonzero division. To cope with this situation, we shall introduce a somewhat restricted
              form of division for Z and shall concentrate on special elements of Z⁺ called primes. These
              primes turn out to be the “building blocks” of the integers, and they provide our first example
              of a representation theorem — in this case the Fundamental Theorem of Arithmetic.

4.1 The Well-Ordering Principle: Mathematical Induction
              Given any two distinct integers x, y, we know that we must have either x < y or y < x.
              However, this is also true if, instead of being integers, x and y are rational numbers or real
              numbers. What makes Z special in this situation?
                 Suppose we try to express the subset Z⁺ of Z, using the inequality symbols > and ≥.
              We find that we can define the set of positive elements of Z as

                 Z⁺ = {x ∈ Z | x > 0} = {x ∈ Z | x ≥ 1}.


When we try to do likewise for the rational and real numbers, however, we find that

   Q⁺ = {x ∈ Q | x > 0}     and     R⁺ = {x ∈ R | x > 0},

but we cannot represent Q⁺ or R⁺ using ≥ as we did for Z⁺.
    The set Z⁺ is different from the sets Q⁺ and R⁺ in that every nonempty subset X of
Z⁺ contains an integer a such that a ≤ x, for all x ∈ X; that is, X contains a least (or
smallest) element. This is not so for either Q⁺ or R⁺. The sets themselves do not contain least
elements. There is no smallest positive rational number or smallest positive real number. If
q is a positive rational number, then since 0 < q/2 < q, we would have the smaller positive
rational number q/2.
    These observations lead us to the following property of the set Z⁺ ⊂ Z.

The Well-Ordering Principle: Every nonempty subset of Z⁺ contains a smallest
element. (We often express this by saying that Z⁺ is well ordered.)
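
The contrast between Z⁺ and Q⁺ can be illustrated with a short Python sketch. The function names `least_element` and `halve` are hypothetical, chosen only for this illustration. The first searches 1, 2, 3, ... upward, a search that must stop whenever the subset of Z⁺ described by the test `belongs` is nonempty. The second shows why no such search can succeed in Q⁺: any positive rational q can be replaced by the smaller positive rational q/2.

```python
from fractions import Fraction
from itertools import count


def least_element(belongs):
    """Return the least n in Z+ with belongs(n) true.

    The loop terminates exactly when the subset of Z+ described by
    `belongs` is nonempty; for an empty subset it would run forever.
    """
    for n in count(1):
        if belongs(n):
            return n


def halve(q):
    """Given a positive rational q, return the smaller positive rational q/2."""
    if q <= 0:
        raise ValueError("q must be positive")
    return q / 2


# X = {n in Z+ : n*n > 50 and n is odd} is nonempty, so it has a least element.
least = least_element(lambda n: n * n > 50 and n % 2 == 1)

# Starting from any positive rational, we can keep producing smaller ones.
q = Fraction(1, 3)
smaller = halve(q)
```

The sketch does not prove the principle, which is taken here as a property of Z⁺; it only shows how the property behaves on one subset.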

This principle serves to distinguish Z⁺ from Q⁺ and R⁺. But does it lead anywhere that
                           is mathematically interesting or useful? The answer is a resounding “Yes!” It is the basis
                           of a proof technique known as mathematical induction. This technique will often help us to
                           prove a general mathematical statement involving positive integers when certain instances
                           of that statement suggest a general pattern.
                                We now establish the basis for this induction technique.

THEOREM 4.1                The Principle of Mathematical Induction. Let S(n) denote an open mathematical statement
                           (or set of such open statements) that involves one or more occurrences of the variable n,
                           which represents a positive integer.

                                a) If S(1) is true; and
                                b) If whenever S(k) is true (for some particular, but arbitrarily chosen, k ∈ Z⁺), then
                                   S(k + 1) is true;
                           then S(n) is true for all n ∈ Z⁺.
                           Proof: Let S(n) be such an open statement satisfying conditions (a) and (b), and let F =
                           {t ∈ Z⁺ | S(t) is false}. We wish to prove that F = ∅, so to obtain a contradiction we assume
                           that F ≠ ∅. Then by the Well-Ordering Principle, F has a least element m. Since S(1)
                           is true, it follows that m ≠ 1, so m > 1, and consequently m − 1 ∈ Z⁺. With m − 1 ∉ F,
                           we have S(m − 1) true. So by condition (b) it follows that S((m − 1) + 1) = S(m) is true,
                           contradicting m ∈ F. This contradiction arose from the assumption that F ≠ ∅.
                           Consequently, F = ∅.
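
The argument in this proof can be traced on a statement that is false for some n. The following Python sketch is only an illustration: the names `least_counterexample` and `diagnose` and the search bound `limit` are assumptions made here, not part of the theorem. It finds the least element m of F within a finite range and reports which hypothesis of the theorem must fail: the basis step when m = 1, or the inductive step at k = m − 1 when m > 1.

```python
def least_counterexample(S, limit):
    """Return the least t in {1, ..., limit} with S(t) false, or None."""
    for t in range(1, limit + 1):
        if not S(t):
            return t
    return None


def diagnose(S, limit):
    """Report which hypothesis of Theorem 4.1 fails for S, if any is seen."""
    m = least_counterexample(S, limit)
    if m is None:
        return "no counterexample up to limit"
    if m == 1:
        return "basis step fails: S(1) is false"
    # m > 1 is the least element of F, so S(m - 1) is true and S(m) is false.
    return f"inductive step fails at k = {m - 1}"


# S(n): 2**n >= n**2 holds for n = 1, 2 but fails at n = 3.
report = diagnose(lambda n: 2 ** n >= n ** 2, 100)
```

For the statement 2ⁿ ≥ n², the least counterexample is m = 3, so the report names k = 2 as the place where S(k) is true but S(k + 1) is false. A statement that passes this finite search is not thereby proved; the search only mirrors the structure of the proof.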

We have now seen how the Well-Ordering Principle is used in the proof of the Principle of
                           Mathematical Induction. It is also true that the Principle of Mathematical Induction is useful
                           if one wants to prove the Well-Ordering Principle. However, we shall not concern ourselves
                           with that fact right now. In this section our major goal will center on understanding and
                           using the Principle of Mathematical Induction. (But in the exercises for Section 4.2 we shall
                           examine how the Principle of Mathematical Induction is used to prove the Well-Ordering
                           Principle.)

In the statement of Theorem 4.1 the condition in part (a) is referred to as the basis step,
while that in part (b) is called the inductive step.
    The choice of 1 in the first condition of Theorem 4.1 is not mandatory. All that is needed
is for the open statement S(n) to be true for some first element n₀ ∈ Z so that the induction
process has a starting place. We need the truth of S(n₀) for our basis step. The integer n₀
could be 5 just as well as 1. It could even be zero or negative because the set Z⁺ in union
with {0} or any finite set of negative integers is well ordered. (When we do an induction
proof and start with n₀ < 0, we are considering the set of all consecutive negative integers
≥ n₀ in union with {0} and Z⁺.)
   Under these circumstances, we may express the Principle of Mathematical Induction,
using quantifiers, as

   [S(n₀) ∧ [∀k ≥ n₀ [S(k) ⇒ S(k + 1)]]] ⇒ ∀n ≥ n₀ S(n).
   We may get a somewhat better understanding of why this method of proof is valid by
using our intuition in conjunction with the situation presented in Fig. 4.1.


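Figure 4.1  [Three panels of dominos standing on end: (a) the first four dominos, labeled n₀, n₀ + 1, n₀ + 2, n₀ + 3; (b) the kth and (k + 1)st dominos; (c) the dominos labeled n₀, n₀ + 1, n₀ + 2, n₀ + 3.]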

In part (a) of the figure we see the first four of an infinite (ordered) arrangement of
dominos, each standing on end. The spacing between any two consecutive dominos is
always the same, and it is such that if any one domino (say the kth) is pushed over to
the right, then it will knock over the next ((k + 1)st) domino. This process is suggested
in Fig. 4.1(b). Our intuition leads us to feel that this process will continue, the (k + 1)st
domino toppling and knocking over (to the right) the (k + 2)nd domino, and so on. Part (c)
of the figure indicates how the truth of S(n₀) provides the push (to the right) to the first
domino (at n₀). This provides the basis step and sets the process in motion. The truth of S(k)
forcing the truth of S(k + 1) gives us the inductive step and continues the toppling process.
We then infer the fact that S(n) is true for all n ≥ n₀ as we imagine all the successive
dominos toppling (to the right).

We shall now demonstrate several results that call for the use of Theorem 4.1.

Forall     Zt, S77,
                                            ne        §=142434---4n=                       oe
      EXAMPLE 4.1
                            Proof: Forn = | the open statement
                                                                                                    n(n+   1)
                                                     Sm):      SCi=142434---40=                        2
                                                               i=l

becomes S(1): $o}_, i = 1 = (1) 4 1)/2. So S(1) is true and we have our basis step —
                            and a starting point from which to begin the induction. Assuming the result true for n = k
                            (for some k € Z*), we want to establish our inductive step by showing how the truth of
                            S(k) “forces” us to accept the truth of S(k + 1). [The assumption of the truth of S(k) is our
                             induction hypothesis.\ To establish the truth of S(k + 1), we need to show that

\[ \sum_{i=1}^{k+1} i = \frac{(k+1)(k+2)}{2}. \]

We proceed as follows.

\[ \sum_{i=1}^{k+1} i = 1 + 2 + \cdots + k + (k+1) = \left( \sum_{i=1}^{k} i \right) + (k+1) = \frac{k(k+1)}{2} + (k+1), \]

for we are assuming the truth of S(k). But

\[ \frac{k(k+1)}{2} + (k+1) = \frac{k(k+1)}{2} + \frac{2(k+1)}{2} = \frac{(k+1)(k+2)}{2}, \]

establishing the inductive step [condition (b)] of the theorem.
Consequently, by the Principle of Mathematical Induction, S(n) is true for all $n \in \mathbf{Z}^+$.
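The formula just proved can also be spot-checked numerically. The following is a minimal sketch in Python; the function names `sum_first_n` and `closed_form` are illustrative choices and do not come from the text. A finite check of this kind is evidence only, and it does not replace the induction argument.

```python
def sum_first_n(n: int) -> int:
    """Add 1 + 2 + ... + n term by term."""
    return sum(range(1, n + 1))


def closed_form(n: int) -> int:
    """Right-hand side of S(n): n(n + 1)/2 (always an integer)."""
    return n * (n + 1) // 2


# Compare both sides of S(n) for n = 1, ..., 200.
assert all(sum_first_n(n) == closed_form(n) for n in range(1, 201))
```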

Now that we have obtained the summation formula for $\sum_{i=1}^{n} i$ in two ways (see Example 1.40), we shall digress from our main topic and consider two examples that use this summation formula.

EXAMPLE 4.2
A wheel of fortune has the numbers from 1 to 36 painted on it in a random manner. Show that regardless of how the numbers are situated, there are three consecutive (on the wheel) numbers whose total is 55 or more.
Let $x_1$ be any number on the wheel. Counting clockwise from $x_1$, label the other numbers $x_2, x_3, \ldots, x_{36}$. For the result to be false, we must have $x_1 + x_2 + x_3 < 55$, $x_2 + x_3 + x_4 < 55$, $\ldots$, $x_{34} + x_{35} + x_{36} < 55$, $x_{35} + x_{36} + x_1 < 55$, and $x_{36} + x_1 + x_2 < 55$. In these 36 inequalities, each of the terms $x_1, x_2, \ldots, x_{36}$ appears (exactly) three times, so each of the integers 1, 2, ..., 36 appears (exactly) three times. Adding all 36 inequalities, we find that $3 \sum_{i=1}^{36} x_i = 3 \sum_{i=1}^{36} i < 36(55) = 1980$. But $\sum_{i=1}^{36} i = (36)(37)/2 = 666$, and this gives us the contradiction that $1998 = 3(666) < 1980$.
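The claim can be illustrated experimentally. The sketch below shuffles the numbers 1 to 36 around a circular wheel many times and confirms that some three consecutive numbers always total at least 55. The helper name `max_triple_sum`, the seed, and the number of trials are arbitrary assumptions made for this illustration; the random trials only sample arrangements, whereas the counting argument above covers all of them.

```python
import random


def max_triple_sum(wheel: list[int]) -> int:
    """Largest total of three consecutive numbers on a circular wheel."""
    m = len(wheel)
    return max(
        wheel[i] + wheel[(i + 1) % m] + wheel[(i + 2) % m] for i in range(m)
    )


rng = random.Random(0)  # fixed (arbitrary) seed so that runs are repeatable
numbers = list(range(1, 37))
for _ in range(1000):
    rng.shuffle(numbers)
    # Every arrangement must contain a triple totalling 55 or more.
    assert max_triple_sum(numbers) >= 55
```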

EXAMPLE 4.3
Among the 900 three-digit integers (from 100 to 999) those such as 131, 222, 303, 717, 848, and 969, where the integer is the same whether it is read from left to right or from right to left, are called palindromes. Without actually determining all of these three-digit palindromes, we would like to determine their sum.
The typical palindrome under study here has the form $aba = 100a + 10b + a = 101a + 10b$, where $1 \le a \le 9$ and $0 \le b \le 9$. With nine choices for a and ten for b, it follows from the rule of product that there are 90 such three-digit palindromes. Their sum is

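\[
\begin{aligned}
\sum_{a=1}^{9} \left( \sum_{b=0}^{9} aba \right)
  &= \sum_{a=1}^{9} \sum_{b=0}^{9} (101a + 10b) \\
  &= \sum_{a=1}^{9} \left[ \sum_{b=0}^{9} 101a + \sum_{b=0}^{9} 10b \right]
   = \sum_{a=1}^{9} \left[ 10(101a) + 10 \sum_{b=1}^{9} b \right] \\
  &= \sum_{a=1}^{9} \left[ 1010a + \frac{10(9 \cdot 10)}{2} \right]
   = \sum_{a=1}^{9} (1010a + 450) \\
  &= 1010 \sum_{a=1}^{9} a + 9(450) \\
  &= \frac{1010(9 \cdot 10)}{2} + 4050 = 49{,}500.
\end{aligned}
\]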
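Although the point of the example is to avoid listing the palindromes, a direct enumeration makes a quick cross-check of the total. The variable names in this short Python sketch are illustrative only.

```python
# Direct enumeration: three-digit integers that read the same in both directions.
palindromes = [n for n in range(100, 1000) if str(n) == str(n)[::-1]]
total_direct = sum(palindromes)

# The same total from the form aba = 101a + 10b with 1 <= a <= 9 and 0 <= b <= 9.
total_formula = sum(101 * a + 10 * b for a in range(1, 10) for b in range(10))

assert len(palindromes) == 90
assert total_direct == total_formula == 49500
```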

The next summation formula takes us from first powers to squares.

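EXAMPLE 4.4
Prove that for each $n \in \mathbf{Z}^+$,

\[ \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}. \]

Proof: Here we are dealing with the open statement

\[ S(n)\colon\ \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}. \]

Basis Step: We start with the statement S(1) and find that

\[ \sum_{i=1}^{1} i^2 = 1^2 = \frac{1(1+1)(2(1)+1)}{6}, \]

so S(1) is true.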
Inductive Step: Now we assume the truth of S(k), for some (particular) $k \in \mathbf{Z}^+$; that is, we assume that

\[ \sum_{i=1}^{k} i^2 = \frac{k(k+1)(2k+1)}{6} \]

is a true statement (when n is replaced by k). From this assumption we want to deduce the truth of

\[ S(k+1)\colon\ \sum_{i=1}^{k+1} i^2 = \frac{(k+1)((k+1)+1)(2(k+1)+1)}{6} = \frac{(k+1)(k+2)(2k+3)}{6}. \]

Using the induction hypothesis S(k), we find that

\[
\begin{aligned}
\sum_{i=1}^{k+1} i^2
  &= 1^2 + 2^2 + \cdots + k^2 + (k+1)^2 = \sum_{i=1}^{k} i^2 + (k+1)^2 \\
  &= \frac{k(k+1)(2k+1)}{6} + (k+1)^2 \\
  &= (k+1) \left[ \frac{k(2k+1)}{6} + (k+1) \right]
   = (k+1) \left[ \frac{2k^2 + 7k + 6}{6} \right] \\
  &= \frac{(k+1)(k+2)(2k+3)}{6},
\end{aligned}
\]

and the general result follows by the Principle of Mathematical Induction.

The formulas from Examples 4.1 and 4.4 prove handy in deriving our next result.

EXAMPLE 4.5
Figure 4.2 provides the first four entries of the sequence of triangular numbers. We see that $t_1 = 1$, $t_2 = 3$, $t_3 = 6$, $t_4 = 10$, and, in general, $t_i = 1 + 2 + \cdots + i = i(i+1)/2$, for each $i \in \mathbf{Z}^+$. For a fixed $n \in \mathbf{Z}^+$ we want a formula for the sum of the first n triangular numbers, that is, $t_1 + t_2 + \cdots + t_n = \sum_{i=1}^{n} t_i$. When $n = 2$ we have $t_1 + t_2 = 4$. For $n = 3$ the sum is 10. Considering n fixed (but arbitrary) we find that

\[
\begin{aligned}
\sum_{i=1}^{n} t_i
  &= \sum_{i=1}^{n} \frac{i(i+1)}{2}
   = \frac{1}{2} \sum_{i=1}^{n} (i^2 + i)
   = \frac{1}{2} \sum_{i=1}^{n} i^2 + \frac{1}{2} \sum_{i=1}^{n} i \\
  &= \frac{1}{2} \left[ \frac{n(n+1)(2n+1)}{6} \right] + \frac{1}{2} \left[ \frac{n(n+1)}{2} \right]
   = n(n+1) \left[ \frac{2n+1}{12} + \frac{1}{4} \right] \\
  &= \frac{n(n+1)(n+2)}{6}.
\end{aligned}
\]

Consequently, if we wish to know the sum of the first 100 triangular numbers, we have

\[ t_1 + t_2 + \cdots + t_{100} = \frac{100(101)(102)}{6} = 171{,}700. \]
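[Figure: triangular arrays of dots for the first four triangular numbers: $t_1 = 1 = \frac{1 \cdot 2}{2}$; $t_2 = 1 + 2 = 3 = \frac{2 \cdot 3}{2}$; $t_3 = 1 + 2 + 3 = 6 = \frac{3 \cdot 4}{2}$; $t_4 = 1 + 2 + 3 + 4 = 10 = \frac{4 \cdot 5}{2}$.]
Figure 4.2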
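The closed form for the sum of the first n triangular numbers can be compared with a direct summation. In this Python sketch the names `triangular`, `sum_triangular`, and `sum_triangular_formula` are hypothetical helpers introduced only for illustration.

```python
def triangular(i: int) -> int:
    """The i-th triangular number t_i = i(i + 1)/2."""
    return i * (i + 1) // 2


def sum_triangular(n: int) -> int:
    """t_1 + t_2 + ... + t_n, added term by term."""
    return sum(triangular(i) for i in range(1, n + 1))


def sum_triangular_formula(n: int) -> int:
    """Closed form n(n + 1)(n + 2)/6 derived in Example 4.5."""
    return n * (n + 1) * (n + 2) // 6


assert sum_triangular(100) == sum_triangular_formula(100) == 171700
```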

Before we present any more results, let us note how we started the proofs in Examples 4.1
                            and 4.4. In both cases we simply replaced the variable n by 1 and verified the truth of some
                            rather easy equalities. Considering how the inductive step in each of these proofs was

definitely more complicated to establish, we might question the need for bothering with
                these basis steps. So let us examine the following example.

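EXAMPLE 4.6
If $n \in \mathbf{Z}^+$, establish the validity of the open statement

\[ S(n)\colon\ \sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n = \frac{n^2 + n + 2}{2}. \]

This time we shall go directly to the inductive step. Assuming the truth of the statement

\[ S(k)\colon\ \sum_{i=1}^{k} i = 1 + 2 + 3 + \cdots + k = \frac{k^2 + k + 2}{2} \]

for some (particular) $k \in \mathbf{Z}^+$, we want to infer the truth of the statement

\[ S(k+1)\colon\ \sum_{i=1}^{k+1} i = 1 + 2 + 3 + \cdots + k + (k+1) = \frac{(k+1)^2 + (k+1) + 2}{2} = \frac{k^2 + 3k + 4}{2}. \]

As we did previously, we use the induction hypothesis and calculate as follows:

\[
\begin{aligned}
\sum_{i=1}^{k+1} i
  &= 1 + 2 + \cdots + k + (k+1) = \left( \sum_{i=1}^{k} i \right) + (k+1) \\
  &= \frac{k^2 + k + 2}{2} + (k+1) \\
  &= \frac{k^2 + k + 2}{2} + \frac{2k + 2}{2} = \frac{k^2 + 3k + 4}{2}.
\end{aligned}
\]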
Hence, for each $k \in \mathbf{Z}^+$, it follows that $S(k) \Rightarrow S(k+1)$. But before we decide to accept the statement $\forall n\ S(n)$ as a true statement, let us reconsider Example 4.1. From that example we learned that $\sum_{i=1}^{n} i = n(n+1)/2$, for all $n \in \mathbf{Z}^+$. Therefore, we can use these two results (from Example 4.1 and the one already "established" here) to conclude that for all $n \in \mathbf{Z}^+$,

\[ \frac{n(n+1)}{2} = \sum_{i=1}^{n} i = \frac{n^2 + n + 2}{2}, \]

which implies that $n(n+1) = n^2 + n + 2$ and $0 = 2$. (Something is wrong somewhere!)
If $n = 1$, then $\sum_{i=1}^{1} i = 1$, but $(n^2 + n + 2)/2 = (1 + 1 + 2)/2 = 2$. So S(1) is not true. But we may feel that this result just indicates that we have the wrong starting point. Perhaps S(n) is true for all $n \ge 7$, or all $n \ge 137$. Using the preceding argument, however, we know that for any starting point $n_0 \in \mathbf{Z}^+$, if $S(n_0)$ were true, then

\[ \frac{n_0^2 + n_0 + 2}{2} = \sum_{i=1}^{n_0} i = 1 + 2 + 3 + \cdots + n_0. \]

From the result in Example 4.1 we have $\sum_{i=1}^{n_0} i = n_0(n_0 + 1)/2$, so it follows once again that $0 = 2$, and we have no possible starting point.
                    This example should indicate to the reader the need to establish the basis step —no
                matter how easy it may be to verify it.
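The situation in this example can be made concrete with a short computation. In the Python sketch below, `lhs` and `rhs` are illustrative names for the two sides of the false statement S(n). The first check confirms that the inductive step really does go through for every k tested, and the second confirms that no starting point exists in the tested range.

```python
def lhs(n: int) -> int:
    """Left-hand side of S(n): 1 + 2 + ... + n."""
    return sum(range(1, n + 1))


def rhs(n: int) -> int:
    """Right-hand side of the (false) S(n): (n^2 + n + 2)/2.

    n^2 + n is always even, so the integer division is exact.
    """
    return (n * n + n + 2) // 2


# Inductive step: adding (k + 1) to the claimed value for k gives the claimed value for k + 1.
assert all(rhs(k) + (k + 1) == rhs(k + 1) for k in range(1, 1001))

# Basis step: S(n) fails for every candidate starting point that was tried.
assert all(lhs(n) != rhs(n) for n in range(1, 1001))
```

The two sides always differ by exactly 1, which is the numerical form of the contradiction 0 = 2 obtained above.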

Now consider the following pseudocode procedures. The procedure in Fig. 4.3 uses a for loop to accumulate the sum of the squares. The second procedure (Fig. 4.4) demonstrates how the result of Example 4.4 can be used in place of such a loop. In both procedures the input is a positive integer n and the output is $\sum_{i=1}^{n} i^2$. However, whereas the pseudocode within the for loop of the procedure in Fig. 4.3 entails a total of n additions and n multiplications (not to mention the $n - 1$ additions for incrementing the counter variable i), the procedure in Fig. 4.4 requires only two additions, three multiplications, and one (integer) division. And this total number of additions, multiplications, and (integer) divisions is still 6 as the value of n increases. Consequently, the procedure in Fig. 4.4 is considered more efficient. (This idea of a more efficient procedure will be examined further in Sections 5.7 and 5.8.)

procedure SumOfSquares1 (n: positive integer)
begin
   sum := 0
   for i := 1 to n do
      sum := sum + i^2
end

Figure 4.3

procedure SumOfSquares2 (n: positive integer)
begin
   sum := n * (n + 1) * (2 * n + 1) / 6
end

Figure 4.4
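For readers who want to run the two procedures, here is one possible Python transcription. The function names are our own, and returning the value (rather than leaving it in a variable `sum`) is an assumption made so that the results can be compared.

```python
def sum_of_squares_loop(n: int) -> int:
    """Counterpart of Fig. 4.3: accumulate 1^2 + 2^2 + ... + n^2 in a loop."""
    total = 0
    for i in range(1, n + 1):
        total = total + i * i  # one addition and one multiplication per pass
    return total


def sum_of_squares_formula(n: int) -> int:
    """Counterpart of Fig. 4.4: the closed form from Example 4.4."""
    # Two additions, three multiplications, one integer division, whatever n is.
    return n * (n + 1) * (2 * n + 1) // 6


assert sum_of_squares_loop(10) == sum_of_squares_formula(10) == 385
```

The loop version performs work proportional to n, while the formula version performs the same six arithmetic operations for every n.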

Looking back at our first two applications of mathematical induction (in Examples 4.1
                             and 4.4), we might wonder whether this principle applies only to the verification of known
                             summation formulas. The next seven examples show that mathematical induction is a vital
                             tool in many other circumstances as well.

EXAMPLE 4.7
Let us consider the sums of consecutive odd positive integers.

   1) 1               = 1     (= $1^2$)
   2) 1 + 3           = 4     (= $2^2$)
   3) 1 + 3 + 5       = 9     (= $3^2$)
   4) 1 + 3 + 5 + 7   = 16    (= $4^2$)

From these first four cases we conjecture the following result: The sum of the first n consecutive odd positive integers is $n^2$; that is, for all $n \in \mathbf{Z}^+$,

\[ S(n)\colon\ \sum_{i=1}^{n} (2i - 1) = n^2. \]

Now that we have developed what we feel is a true summation formula, we use the Principle of Mathematical Induction to verify its truth for all $n \ge 1$.
From the preceding calculations, we see that S(1) is true [as are S(2), S(3), and S(4)], and so we have our basis step. For the inductive step we assume the truth of S(k) for some $k$ ($\ge 1$) and have

\[ \sum_{i=1}^{k} (2i - 1) = k^2. \]

We now deduce the truth of S(k + 1): $\sum_{i=1}^{k+1} (2i - 1) = (k+1)^2$. Since we have assumed the truth of S(k), our induction hypothesis, we may now write

\[
\begin{aligned}
\sum_{i=1}^{k+1} (2i - 1)
  &= \sum_{i=1}^{k} (2i - 1) + [2(k+1) - 1] = k^2 + [2(k+1) - 1] \\
  &= k^2 + 2k + 1 = (k+1)^2.
\end{aligned}
\]

Consequently, the result S(n) is true for all $n \ge 1$, by the Principle of Mathematical Induction.
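The conjecture-then-prove pattern of this example is easy to imitate on a computer: tabulate a few cases, guess the pattern, and leave the proof to induction. A small Python sketch (the name `sum_first_odds` is illustrative):

```python
def sum_first_odds(n: int) -> int:
    """Sum of the first n odd positive integers: 1 + 3 + ... + (2n - 1)."""
    return sum(2 * i - 1 for i in range(1, n + 1))


# The first four cases listed in the example.
assert [sum_first_odds(n) for n in range(1, 5)] == [1, 4, 9, 16]

# Further cases agree with the conjecture S(n); only the induction proof covers all n.
assert all(sum_first_odds(n) == n * n for n in range(1, 501))
```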

Now it is time to investigate some results that are not summation formulas.

EXAMPLE 4.8
In Table 4.1, we have listed in adjacent columns the values of $4n$ and $n^2 - 7$ for the positive integers n, where $1 \le n \le 8$. From the table, we see that $(n^2 - 7) < 4n$ for n = 1, 2, 3, 4, 5; but when n = 6, 7, 8, we have $4n < (n^2 - 7)$. These last three observations lead us to conjecture: For all $n \ge 6$, $4n < (n^2 - 7)$.

Table 4.1

   n     4n    n^2 - 7   |   n     4n    n^2 - 7
   1      4      -6      |   5     20      18
   2      8      -3      |   6     24      29
   3     12       2      |   7     28      42
   4     16       9      |   8     32      57

Once again, the Principle of Mathematical Induction is the proof technique we need to verify our conjecture. Let S(n) denote the open statement: $4n < (n^2 - 7)$. Then Table 4.1 confirms that S(6) is true [as are S(7) and S(8)], and we have our basis step. (At last we have an example wherein the starting point is an integer $n_0 \ne 1$.)
In this example, the induction hypothesis is S(k): $4k < (k^2 - 7)$, where $k \in \mathbf{Z}^+$ and $k \ge 6$. In order to establish the inductive step, we need to obtain the truth of S(k + 1) from that of S(k). That is, from $4k < (k^2 - 7)$ we must conclude that $4(k+1) < [(k+1)^2 - 7]$. Here are the necessary steps:

\[ 4k < (k^2 - 7) \Rightarrow 4k + 4 < (k^2 - 7) + 4 < (k^2 - 7) + (2k + 1) \]

(because for $k \ge 6$, we find $2k + 1 \ge 13 > 4$), and

\[ 4k + 4 < (k^2 - 7) + (2k + 1) \Rightarrow 4(k+1) < (k^2 + 2k + 1) - 7 = (k+1)^2 - 7. \]

Therefore, by the Principle of Mathematical Induction, S(n) is true for all $n \ge 6$.
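Table 4.1 and the conjecture it suggests can be regenerated with a few lines of Python. The names `s` and `rows`, and the upper limit used in the last check, are arbitrary choices for this sketch.

```python
def s(n: int) -> bool:
    """The open statement S(n): 4n < n^2 - 7."""
    return 4 * n < n * n - 7


# Rows (n, 4n, n^2 - 7) of Table 4.1 for 1 <= n <= 8.
rows = [(n, 4 * n, n * n - 7) for n in range(1, 9)]

# S(n) is false for n = 1, ..., 5, so 6 is the first usable starting point.
assert not any(s(n) for n in range(1, 6))

# S(n) holds for every n tested from 6 upward (the induction proof covers all n >= 6).
assert all(s(n) for n in range(6, 10001))
```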

EXAMPLE 4.9
Among the many interesting sequences of numbers encountered in discrete mathematics and combinatorics, one finds the harmonic numbers $H_1, H_2, H_3, \ldots$, where

\[ H_1 = 1, \qquad H_2 = 1 + \frac{1}{2}, \qquad H_3 = 1 + \frac{1}{2} + \frac{1}{3}, \qquad \ldots, \]

and, in general, $H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$, for each $n \in \mathbf{Z}^+$.
The following property of the harmonic numbers provides one more opportunity for us to apply the Principle of Mathematical Induction.

For all $n \in \mathbf{Z}^+$, $\sum_{j=1}^{n} H_j = (n+1)H_n - n$.

Proof: As we have done in the earlier examples (that is, Examples 4.1, 4.4, and 4.7), we verify the basis step at $n = 1$ for the open statement S(n): $\sum_{j=1}^{n} H_j = (n+1)H_n - n$. This result follows readily from

\[ \sum_{j=1}^{1} H_j = H_1 = 1 = 2 \cdot 1 - 1 = (1+1)H_1 - 1. \]

To verify the inductive step, we assume the truth of S(k), that is,

\[ \sum_{j=1}^{k} H_j = (k+1)H_k - k. \]

This assumption then leads us to the following:

\[
\begin{aligned}
\sum_{j=1}^{k+1} H_j
  &= \sum_{j=1}^{k} H_j + H_{k+1} = [(k+1)H_k - k] + H_{k+1} \\
  &= (k+1)H_k - k + H_{k+1} \\
  &= (k+1)[H_{k+1} - 1/(k+1)] - k + H_{k+1} \\
  &= (k+2)H_{k+1} - 1 - k \\
  &= (k+2)H_{k+1} - (k+1).
\end{aligned}
\]

Consequently, we now know from the Principle of Mathematical Induction that S(n) is true for all positive integers n.
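Because harmonic numbers are fractions, a numerical check of this identity is best done in exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction`; the helper names `harmonic` and `identity_holds` are our own.

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def identity_holds(n: int) -> bool:
    """Check H_1 + H_2 + ... + H_n == (n + 1) H_n - n exactly."""
    left = sum((harmonic(j) for j in range(1, n + 1)), Fraction(0))
    return left == (n + 1) * harmonic(n) - n


assert harmonic(3) == Fraction(11, 6)
assert all(identity_holds(n) for n in range(1, 41))
```

Using floating-point numbers here could make the two sides differ by rounding error, which is why exact fractions were chosen.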

EXAMPLE 4.10
For all $n \ge 0$ let $A_n \subset \mathbf{R}$, where $|A_n| = 2^n$ and the elements of $A_n$ are listed in ascending order. If $r \in \mathbf{R}$, prove that in order to determine whether $r \in A_n$ (by the procedure developed below), we must compare r with no more than $n + 1$ elements in $A_n$.
When $n = 0$, $A_0 = \{a\}$ and only one comparison is needed. So the result is true for $n = 0$ (and we have our basis step). For $n = 1$, $A_1 = \{a_1, a_2\}$ with $a_1 < a_2$. In order to determine whether $r \in A_1$, at most two comparisons must be made. Hence the result follows when $n = 1$. Now if $n = 2$, we write $A_2 = \{b_1, b_2, c_1, c_2\} = B_1 \cup C_1$, where $b_1 < b_2 < c_1 < c_2$, $B_1 = \{b_1, b_2\}$, and $C_1 = \{c_1, c_2\}$. Comparing r with $b_2$, we determine which of the two possibilities, (i) $r \in B_1$ or (ii) $r \in C_1$, can occur. Since $|B_1| = |C_1| = 2$, either one of the two possibilities requires at most two more comparisons (from the prior case where $n = 1$). Consequently, we can determine whether $r \in A_2$ by making no more than $2 + 1 = n + 1$ comparisons.
We now argue in general. Assume the result true for some $k \ge 0$ and consider the case for $A_{k+1}$, where $|A_{k+1}| = 2^{k+1}$. In order to establish our inductive step, let $A_{k+1} = B_k \cup C_k$, where $|B_k| = |C_k| = 2^k$, and the elements of $B_k$, $C_k$ are in ascending order with the largest element x in $B_k$ smaller than the least element in $C_k$. Let $r \in \mathbf{R}$. To determine whether $r \in A_{k+1}$, we consider whether $r \in B_k$ or $r \in C_k$.

  a) First we compare r and x. (One comparison)
  b) If $r \le x$, then because $|B_k| = 2^k$, it follows by the induction hypothesis that we can determine whether $r \in B_k$ by making no more than $k + 1$ additional comparisons.
  c) If $r > x$, we do likewise with the elements in $C_k$. We make at most $k + 1$ additional comparisons to see whether $r \in C_k$.

In any event, at most $(k + 1) + 1$ comparisons are made.
The general result now follows by the Principle of Mathematical Induction.
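The halving procedure in steps (a) through (c) translates directly into a recursive function. The following Python sketch is one possible rendering: the name `is_member`, the use of a sorted list for $A_n$, and the convention that the final test against a single remaining element counts as one comparison are all assumptions made for illustration.

```python
def is_member(r: float, a: list[float]) -> tuple[bool, int]:
    """Decide whether r is in the sorted list a, where len(a) is a power of 2.

    Returns (found, comparisons), counting one comparison per element of a
    that r is compared with.
    """
    if len(a) == 1:
        return (r == a[0], 1)  # basis: one element, one comparison
    half = len(a) // 2
    x = a[half - 1]  # largest element of the lower half B
    if r <= x:
        found, count = is_member(r, a[:half])  # continue in B
    else:
        found, count = is_member(r, a[half:])  # continue in C
    return (found, count + 1)  # add the comparison of r with x


# A set with 2^4 = 16 elements needs at most 4 + 1 = 5 comparisons.
evens = list(range(0, 32, 2))
assert is_member(10, evens) == (True, 5)
assert is_member(11, evens) == (False, 5)
```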

EXAMPLE 4.11
One of our first concerns when we evaluate the quality of a computer program is whether the program does what it is supposed to do. Just as we cannot prove a theorem by checking
               specific cases, so we cannot establish the correctness of a program simply by testing various
               sets of data. (Furthermore, doing this would be quite difficult if our program were to become
               a part of a larger software package wherein, perhaps, a data set is internally generated.) Since
               software development places a great deal of emphasis on structured programming, this has
               brought about the need for program verification. Here the programmer or the programming
               team must prove that the program being developed is correct regardless of the data set
               supplied. The effort invested at this stage considerably reduces the time that must be spent
               in debugging the program (or software package). One of the methods that can play a major
               role in such program verification is mathematical induction. Let us see how.
The pseudocode program segment shown in Fig. 4.5 is supposed to produce the answer $x(y^n)$ for real variables x, y with n a nonnegative integer. (The values for these three variables are assigned earlier in the program.) We shall verify the correctness of this program segment by mathematical induction for the open statement

S(n): For all $x, y \in \mathbf{R}$, if the program reaches the top of the while loop with $n \in \mathbf{N}$, after the loop is bypassed (for $n = 0$) or the two loop instructions are executed $n$ ($> 0$) times, then the value of the real variable answer is $x(y^n)$.

while n ≠ 0 do
begin
   x := x * y
   n := n - 1
end
answer := x

Figure 4.5

The flowchart for this program segment is shown in Fig. 4.6. Referring to it will help us
               as we develop our proof.

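[Flowchart: initialize the real variables x, y and the nonnegative integer variable n; at the top of the while loop, the test on n leads either to the loop instructions (x := x * y, n := n - 1) and back to the top of the loop, or along the No branch to answer := x, after which the program continues with the next executable statement following the assignment statement for the real variable answer.]
Figure 4.6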

First consider S(0), the statement for the case where $n = 0$. Here the program reaches the top of the while loop, but since $n = 0$, it follows the No branch in the flowchart and assigns the value $x = x(1) = x(y^0)$ to the real variable answer. Consequently, the statement S(0) is true and the basis step of our induction argument is established.
Now we assume the truth of S(k), for some nonnegative integer k. This provides us with the induction hypothesis.

S(k): For all $x, y \in \mathbf{R}$, if the program reaches the top of the while loop with $k \in \mathbf{N}$, after the loop is bypassed (for $k = 0$) or the two loop instructions are executed $k$ ($> 0$) times, then the value of the real variable answer is $x(y^k)$.

Continuing with the inductive step of the proof, when dealing with the statement S(k + 1), we note that because $k + 1 \ge 1$, the program will not simply follow the No branch and bypass the instructions in the while loop. Those two instructions (in the while loop) will be executed at least once. When the program reaches the top of the while loop for the first time, $n = k + 1 > 0$, so the loop instructions are executed and the program returns to the top of the while loop where now we find that

   • The value of y is unchanged.
   • The value of x is $x_1 = x(y^1) = xy$.
   • The value of n is $(k + 1) - 1 = k$.

But now, by our induction hypothesis (applied to the real numbers $x_1$, y), we know that after the while loop for $x_1$, y and $n = k$ is bypassed (for $k = 0$) or the two loop instructions are executed $k$ ($> 0$) times, then the value assigned to the real variable answer is

\[ x_1(y^k) = (xy)(y^k) = x(y^{k+1}). \]

So by the Principle of Mathematical Induction, S(n) is true for all $n \ge 0$ and the correctness of the program segment is established.
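The verified program segment can also be run. Below is a hedged Python transcription of Fig. 4.5; the wrapper function `power_segment` and the guard against negative n are our additions (the text simply assumes n is a nonnegative integer). Exact types such as `int` or `Fraction` are used in the checks so that the comparison with $x(y^n)$ is not affected by floating-point rounding.

```python
from fractions import Fraction


def power_segment(x, y, n: int):
    """Python counterpart of the Fig. 4.5 segment; returns the value of answer."""
    if n < 0:
        # Not in the original pseudocode: without this guard the loop below
        # would never terminate for negative n.
        raise ValueError("n must be a nonnegative integer")
    while n != 0:
        x = x * y
        n = n - 1
    answer = x
    return answer


# n = 0: the loop is bypassed and answer is x = x * y^0 (the basis step S(0)).
assert power_segment(7, 5, 0) == 7

# n > 0: the loop body runs n times and answer is x * y^n.
assert power_segment(3, 2, 5) == 3 * 2 ** 5
assert power_segment(Fraction(1, 2), Fraction(2, 3), 3) == Fraction(4, 27)
```

Passing tests like these illustrate the statement S(n) for particular inputs; as the example stresses, only the induction argument establishes it for all of them.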

EXAMPLE 4.12
Recall (from Examples 1.37 and 3.11) that for a given $n \in \mathbf{Z}^+$, a composition of n is an ordered sum of positive-integer summands summing to n. In Fig. 4.7 we find the compositions of 1, 2, 3, and 4. We see that

  a) 1 has $1 = 2^0 = 2^{1-1}$ composition, 2 has $2 = 2^1 = 2^{2-1}$ compositions, 3 has $4 = 2^2 = 2^{3-1}$ compositions, and 4 has $8 = 2^3 = 2^{4-1}$ compositions; and
  b) the eight compositions of 4 arise from the four compositions of 3 in two ways: (i) Compositions (1′)–(4′) result by increasing the last summand (in each corresponding composition of 3) by 1; (ii) Each of compositions (1″)–(4″) is obtained by appending "+1" to the corresponding composition of 3.

   (n = 1)        1                  (n = 4)   (1′)   4
                                               (2′)   1 + 3
   (n = 2)        2                            (3′)   2 + 2
                  1 + 1                        (4′)   1 + 1 + 2

   (n = 3)  (1)   3                            (1″)   3 + 1
            (2)   1 + 2                        (2″)   1 + 2 + 1
            (3)   2 + 1                        (3″)   2 + 1 + 1
            (4)   1 + 1 + 1                    (4″)   1 + 1 + 1 + 1

Figure 4.7

The observations in part (a) suggest that for all $n \in \mathbf{Z}^+$, S(n): n has $2^{n-1}$ compositions. The result [in part (a)] for n = 1 provides our basis step, S(1). So now let us assume the result true for some (fixed) $k \in \mathbf{Z}^+$, namely, S(k): k has $2^{k-1}$ compositions. At this point consider S(k + 1). One can develop the compositions of k + 1 from those of k as in part (b) above (where k = 3). For $k \ge 1$, we find that the compositions of k + 1 fall into two distinct cases:

1) The compositions of k + 1, where the last summand is an integer t > 1: Here this last summand t is replaced by t − 1, and this type of replacement provides a correspondence between all of the compositions of k and all those compositions of k + 1, where the last summand exceeds 1.
2) The compositions of k + 1, where the last summand is 1: In this case we delete “+1” from the right side of this type of composition of k + 1. Once again we get a correspondence between all the compositions of k and all those compositions of k + 1, where the last summand is 1.

Therefore, the number of compositions of k + 1 is twice the number for k. Consequently, it follows from the induction hypothesis that the number of compositions of k + 1 is $2(2^{k-1}) = 2^k$. The Principle of Mathematical Induction now tells us that for all $n \in \mathbf{Z}^+$, S(n): n has $2^{n-1}$ compositions (as we learned earlier in Examples 1.37 and 3.11).
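
The two cases in this argument can be turned into a short program. The following Python sketch is only an illustration (the function name `compositions` is our own choice, not part of the text): it builds the compositions of n from those of n − 1 by the two operations described above and checks the count $2^{n-1}$ for small n.

```python
def compositions(n):
    """Return all compositions of n (n >= 1) as lists of positive integers."""
    if n == 1:
        return [[1]]                      # basis step: 1 has the single composition 1
    previous = compositions(n - 1)        # the compositions of k = n - 1
    # Case 1: increase the last summand by 1 (the new last summand exceeds 1).
    grown = [c[:-1] + [c[-1] + 1] for c in previous]
    # Case 2: append "+1" (the new last summand is 1).
    appended = [c + [1] for c in previous]
    return grown + appended


# Check, for small n, that the list has no repeats and has 2^(n-1) entries.
for n in range(1, 9):
    comps = compositions(n)
    assert all(sum(c) == n for c in comps)
    assert len({tuple(c) for c in comps}) == len(comps) == 2 ** (n - 1)

print(compositions(4))
```

For n = 4 the list comes out in the same order as in Fig. 4.7: first the four compositions obtained by increasing the last summand, then the four obtained by appending 1.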

EXAMPLE 4.13   We learn from the equation 14 = 3 + 3 + 8 that we can express 14 using only 3’s and 8’s as summands. But what may prove to be surprising is that for all $n \ge 14$,

    S(n): n can be written as a sum of 3’s and/or 8’s (with no regard to order).

As we start to verify S(n) for all $n \ge 14$, we realize that the given introductory sentence shows us that the basis step S(14) is true. For the inductive step we assume the truth of S(k) for some $k \in \mathbf{Z}^+$, where $k \ge 14$, and then consider what can happen for S(k + 1). If there is at least one 8 in the sum (of 3’s and/or 8’s) that equals k, then we can replace this 8 by three 3’s and obtain k + 1 as a sum of 3’s and/or 8’s. But suppose that no 8 appears as a summand of k. Then the only summand used is a 3, and, since $k \ge 14$, we must have at least five 3’s as summands. And now if we replace five of these 3’s by two 8’s, we obtain the sum k + 1, where the only summands are 3’s and/or 8’s. Consequently, we have shown how S(k) ⇒ S(k + 1) and so the result follows for all $n \ge 14$ by the Principle of Mathematical Induction.
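
The inductive step of this example is constructive, so it can be traced by a program. The sketch below is an illustration only; the names `next_representation` and `representation` are hypothetical. It starts from 14 = 3 + 3 + 8 and applies the two replacement rules to pass from k to k + 1.

```python
def next_representation(threes, eights):
    """Given k = 3*threes + 8*eights with k >= 14, return the counts for k + 1."""
    if eights >= 1:
        return threes + 3, eights - 1     # replace one 8 by three 3's (8 becomes 9)
    # No 8 is present; since k >= 14 there must be at least five 3's.
    assert threes >= 5
    return threes - 5, eights + 2         # replace five 3's by two 8's (15 becomes 16)


def representation(n):
    """Return (threes, eights) with 3*threes + 8*eights == n, for n >= 14."""
    if n < 14:
        raise ValueError("the argument only covers n >= 14")
    threes, eights = 2, 1                 # basis step: 14 = 3 + 3 + 8
    for _ in range(14, n):                # apply the inductive step n - 14 times
        threes, eights = next_representation(threes, eights)
    return threes, eights


for n in range(14, 200):
    t, e = representation(n)
    assert t >= 0 and e >= 0 and 3 * t + 8 * e == n
```

The loop mirrors the way the induction reaches S(n) from S(14) one step at a time.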

Now that we have seen several applications of the Principle of Mathematical Induction, we shall close this section by introducing another form of mathematical induction. This second form is sometimes referred to as the Alternative Form of the Principle of Mathematical Induction or the Principle of Strong Mathematical Induction.
Once again we shall consider a statement of the form $\forall n \ge n_0\ S(n)$, where $n_0 \in \mathbf{Z}^+$, and we shall establish both a basis step and an inductive step. However, this time the basis step may require proving more than just the first case, where $n = n_0$. And in the inductive step we shall assume the truth of all the statements $S(n_0), S(n_0 + 1), \ldots, S(k - 1)$, and $S(k)$, in order to establish the truth of the statement S(k + 1). We formally present this second Principle of Mathematical Induction in the following theorem.

THEOREM 4.2   The Principle of Mathematical Induction: Alternative Form. Let S(n) denote an open mathematical statement (or set of such open statements) that involves one or more occurrences of the variable n, which represents a positive integer. Also let $n_0, n_1 \in \mathbf{Z}^+$ with $n_0 \le n_1$.

a) If $S(n_0), S(n_0 + 1), S(n_0 + 2), \ldots, S(n_1 - 1)$, and $S(n_1)$ are true; and
b) If whenever $S(n_0), S(n_0 + 1), \ldots, S(k - 1)$, and $S(k)$ are true for some (particular but arbitrarily chosen) $k \in \mathbf{Z}^+$, where $k \ge n_1$, then the statement S(k + 1) is also true;

then S(n) is true for all $n \ge n_0$.

As in Theorem 4.1, condition (a) is called the basis step and condition (b) is called the
                           inductive step.
                               The proof of Theorem 4.2 is similar to that of Theorem 4.1 and will be requested in the
                           Section Exercises. We shall also learn in the exercises for Section 4.2 that the two forms
                           of mathematical induction (given in Theorems 4.1 and 4.2) are equivalent, for each can be
                           shown to be a valid proof technique when we assume the truth of the other.
Before we give any examples where Theorem 4.2 is applied, let us mention, as we did for Theorem 4.1, that $n_0$ need not actually be a positive integer: it may, in reality, be 0 or even possibly a negative integer. And now that we have taken care of that point once again, let us see how we might apply this new proof technique.
                               Our first example should be familiar. We shall simply apply Theorem 4.2 in order to
                           obtain the result in Example 4.13 in a second way.

EXAMPLE 4.14   The following calculations indicate that it is possible to write (without regard to order) the integers 14, 15, 16 using only 3’s and/or 8’s as summands:

    14 = 3 + 3 + 8        15 = 3 + 3 + 3 + 3 + 3        16 = 8 + 8

On the basis of these three results, we make the conjecture

For every $n \in \mathbf{Z}^+$ where $n \ge 14$,

    S(n): n can be written as a sum of 3’s and/or 8’s.

Proof: It is apparent that the statements S(14), S(15), and S(16) are true, and this establishes our basis step. (Here $n_0 = 14$ and $n_1 = 16$.)
For the inductive step we assume the truth of the statements

    S(14), S(15), ..., S(k − 2), S(k − 1), and S(k)

for some $k \in \mathbf{Z}^+$, where $k \ge 16$. [The assumption of the truth of these (k − 14) + 1 statements constitutes our induction hypothesis.] And now if n = k + 1, then $n \ge 17$ and k + 1 = (k − 2) + 3. But since $14 \le k - 2 \le k$, from the truth of S(k − 2) we know that (k − 2) can be written as a sum of 3’s and/or 8’s; so (k + 1) = (k − 2) + 3 can also be written in this form. Consequently, S(n) is true for all $n \ge 14$ by the alternative form of the Principle of Mathematical Induction.
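
The structure of this proof (three base cases, then a step that reaches back three places) corresponds to a recursive procedure. The following Python sketch is an illustration only; `BASE_CASES` and `summands` are names chosen here, not taken from the text.

```python
# Basis step: explicit sums for n = 14, 15, 16.
BASE_CASES = {14: [3, 3, 8], 15: [3, 3, 3, 3, 3], 16: [8, 8]}


def summands(n):
    """Return a list of 3's and 8's adding up to n, for n >= 14."""
    if n < 14:
        raise ValueError("the argument only covers n >= 14")
    if n in BASE_CASES:
        return list(BASE_CASES[n])
    # Inductive step: n = (n - 3) + 3, and n - 3 >= 14 is already covered.
    return summands(n - 3) + [3]


for n in range(14, 300):
    s = summands(n)
    assert sum(s) == n and set(s) <= {3, 8}

print(summands(20))   # 20 = 17 + 3 = (14 + 3) + 3
```

Each call uses only one earlier result, namely the one for n − 3, which is why three consecutive base cases are needed before the recursion can take over.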

In Example 4.14 we saw how the truth of S(k + 1) was deduced by using the truth of the one prior result S(k − 2). Our last example presents a situation wherein the truth of more than one prior result is needed.

EXAMPLE 4.15   Let us consider the integer sequence $a_0, a_1, a_2, a_3, \ldots$, where

    $a_0 = 1$, $a_1 = 2$, $a_2 = 3$, and
    $a_n = a_{n-1} + a_{n-2} + a_{n-3}$, for all $n \in \mathbf{Z}^+$ where $n \ge 3$.

(Then, for instance, we find that $a_3 = a_2 + a_1 + a_0 = 3 + 2 + 1 = 6$; $a_4 = a_3 + a_2 + a_1 = 6 + 3 + 2 = 11$; and $a_5 = a_4 + a_3 + a_2 = 11 + 6 + 3 = 20$.)
We claim that the entries in this sequence are such that $a_n \le 3^n$ for all $n \in \mathbf{N}$; that is, $\forall n \in \mathbf{N}\ S'(n)$, where $S'(n)$ is the open statement: $a_n \le 3^n$.
For the basis step, we observe that

i) $a_0 = 1 = 3^0 \le 3^0$;
ii) $a_1 = 2 \le 3 = 3^1$; and
iii) $a_2 = 3 \le 9 = 3^2$.

Consequently, we know that S’(0), S’(1), and S’(2) are true statements.
So now we turn our attention to the inductive step where we assume the truth of the statements S’(0), S’(1), S’(2), ..., S’(k − 1), S’(k), for some $k \in \mathbf{Z}^+$ where $k \ge 2$. For the case where $n = k + 1 \ge 3$ we see that

    $a_{k+1} = a_k + a_{k-1} + a_{k-2}$
    $\le 3^k + 3^{k-1} + 3^{k-2} \le 3^k + 3^k + 3^k = 3(3^k) = 3^{k+1}$,

so [S’(k − 2) ∧ S’(k − 1) ∧ S’(k)] ⇒ S’(k + 1).

Therefore it follows from the alternative form of the Principle of Mathematical Induction that $a_n \le 3^n$ for all $n \in \mathbf{N}$.
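
A numerical check cannot replace the proof, but it is a useful sanity test of the claim. The Python sketch below is an illustration only (the function name `sequence_terms` is hypothetical): it computes the first terms of the sequence from the recursive rule and compares each with $3^n$.

```python
def sequence_terms(count):
    """Return [a_0, ..., a_(count-1)] for a_0 = 1, a_1 = 2, a_2 = 3 and
    a_n = a_(n-1) + a_(n-2) + a_(n-3) when n >= 3."""
    terms = [1, 2, 3]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2] + terms[-3])
    return terms[:count]


terms = sequence_terms(30)
assert terms[:6] == [1, 2, 3, 6, 11, 20]                  # values quoted in the example
assert all(a <= 3 ** n for n, a in enumerate(terms))      # a_n <= 3^n for n = 0, ..., 29
```

The check covers only finitely many n; the induction argument is what settles the claim for every n in N.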

Before we close this section, let us take a second look at the preceding two results. In both Example 4.14 and Example 4.15 we established the basis step by verifying the truth of three statements: S(14), S(15), and S(16) in Example 4.14; and S’(0), S’(1), and S’(2) in Example 4.15. However, to obtain the truth of S(k + 1) in Example 4.14, we actually used only one of the (k − 14) + 1 statements in the induction hypothesis, namely, the statement S(k − 2). For Example 4.15 we used three of the k + 1 statements in the induction hypothesis: in this case, the statements S’(k − 2), S’(k − 1), and S’(k).

EXERCISES 4.1

1. Prove each of the following for all $n \ge 1$ by the Principle of Mathematical Induction.
   a) $1^2 + 3^2 + 5^2 + \cdots + (2n-1)^2 = \frac{n(2n-1)(2n+1)}{3}$
   b) $1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5 + \cdots + n(n+2) = \frac{n(n+1)(2n+7)}{6}$
   c) $\sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{n}{n+1}$
   d) $\sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4} = \left(\sum_{i=1}^{n} i\right)^2$

2. Establish each of the following for all $n \ge 1$ by the Principle of Mathematical Induction.
   a) $\sum_{i=1}^{n} 2^{i-1} = \sum_{i=0}^{n-1} 2^i = 2^n - 1$
   b) $\sum_{i=1}^{n} i(2^i) = 2 + (n-1)2^{n+1}$
   c) $\sum_{i=1}^{n} (i)(i!) = (n+1)! - 1$

3. a) Note how $\sum_{i=1}^{n} i^3 + (n+1)^3 = \sum_{i=0}^{n} (i+1)^3 = \sum_{i=0}^{n} (i^3 + 3i^2 + 3i + 1)$. Use this result to obtain a formula for $\sum_{i=1}^{n} i^2$. (Compare with the formula given in Example 4.4.)
   b) Use the idea presented in part (a) to find a formula for $\sum_{i=1}^{n} i^3$ and one for $\sum_{i=1}^{n} i^4$. [Compare the result for $\sum_{i=1}^{n} i^3$ with the formula in part (d) of Exercise 1 for this section.]

4. A wheel of fortune has the integers from 1 to 25 placed on it in a random manner. Show that regardless of how the numbers are positioned on the wheel, there are three adjacent numbers whose sum is at least 39.

5. Consider the following program segment (written in pseudocode):

       for i := 1 to 123 do
         for j := 1 to i do
           print i*j

   a) How many times is the print statement of the third line executed?
   b) Replace i in the second line by $i^2$, and answer the question in part (a).

6. a) For the four-digit integers (from 1000 to 9999) how many are palindromes and what is their sum?
   b) Write a computer program to check the answer for the sum in part (a).

7. A lumberjack has 4n + 110 logs in a pile consisting of n layers. Each layer has two more logs than the layer directly above it. If the top layer has six logs, how many layers are there?

8. Determine the positive integer n for which

       $\sum_{i=1}^{2n} i = \sum_{i=1}^{n} i^2$.

9. Evaluate each of the following:
   a) $\sum_{i=11}^{33} i$        b) $\sum_{i=11}^{33} i^2$

10. Determine $\sum_{i=51}^{100} t_i$, where $t_i$ denotes the ith triangular number, for $51 \le i \le 100$.

11. a) Derive a formula for $\sum_{i=1}^{n} t_{2i}$, where $t_{2i}$ denotes the 2ith triangular number for $1 \le i \le n$.
    b) Determine $\sum_{i=1}^{100} t_{2i}$.
    c) Write a computer program to check the result in part (b).

12. a) Prove that $(\cos\theta + i\sin\theta)^2 = \cos 2\theta + i\sin 2\theta$, where $i \in \mathbf{C}$ and $i^2 = -1$.
    b) Using induction, prove that for all $n \in \mathbf{Z}^+$,

        $(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$.

    (This result is known as DeMoivre’s Theorem.)
    c) Verify that $1 + i = \sqrt{2}(\cos 45° + i\sin 45°)$, and compute $(1 + i)^{100}$.

13. a) Consider an 8 × 8 chessboard. It contains sixty-four 1 × 1 squares and one 8 × 8 square. How many 2 × 2

squares does it contain? How many 3 × 3 squares? How many squares in total?
    b) Now consider an n × n chessboard for some fixed $n \in \mathbf{Z}^+$. For $1 \le k \le n$, how many k × k squares are contained in this chessboard? How many squares in total?

14. Prove that for all $n \in \mathbf{Z}^+$, $n > 3 \Rightarrow 2^n < n!$.

15. Prove that for all $n \in \mathbf{Z}^+$, $n > 4 \Rightarrow n^2 < 2^n$.

16. a) For n = 3 let $X_3 = \{1, 2, 3\}$. Now consider the sum

        $s_3 = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{1 \cdot 2} + \frac{1}{1 \cdot 3} + \frac{1}{2 \cdot 3} + \frac{1}{1 \cdot 2 \cdot 3} = \sum_{\emptyset \ne A \subseteq X_3} \frac{1}{p_A}$,

    where $p_A$ denotes the product of all elements in a nonempty subset A of $X_3$. Note that the sum is taken over all the nonempty subsets of $X_3$. Evaluate this sum.
    b) Repeat the calculation in part (a) for $s_2$ (where n = 2 and $X_2 = \{1, 2\}$) and $s_4$ (where n = 4 and $X_4 = \{1, 2, 3, 4\}$).
    c) Conjecture the general result suggested by the calculations from parts (a) and (b). Prove your conjecture using the Principle of Mathematical Induction.

17. For $n \in \mathbf{Z}^+$, let $H_n$ denote the nth harmonic number (as defined in Example 4.9).
    a) For all $n \in \mathbf{N}$ prove that $1 + (n/2) \le H_{2^n}$.
    b) Prove that for all $n \in \mathbf{Z}^+$,

        $\sum_{j=1}^{n} jH_j = \left[\frac{n(n+1)}{2}\right]H_{n+1} - \left[\frac{n(n+1)}{4}\right]$.

18. Consider the following four equations:

    1) 1 = 1
    2) 2 + 3 + 4 = 1 + 8
    3) 5 + 6 + 7 + 8 + 9 = 8 + 27
    4) 10 + 11 + 12 + 13 + 14 + 15 + 16 = 27 + 64

    Conjecture the general formula suggested by these four equations, and prove your conjecture.

19. For $n \in \mathbf{Z}^+$, let S(n) be the open statement

        $\sum_{i=1}^{n} i = \frac{(n + (1/2))^2}{2}$.

    Show that the truth of S(k) implies the truth of S(k + 1) for all $k \in \mathbf{Z}^+$. Is S(n) true for all $n \in \mathbf{Z}^+$?

20. Let $S_1$ and $S_2$ be two sets where $|S_1| = m$, $|S_2| = r$, for $m, r \in \mathbf{Z}^+$, and the elements in each of $S_1$, $S_2$ are in ascending order. It can be shown that the elements in $S_1$ and $S_2$ can be merged into ascending order by making no more than m + r − 1 comparisons. (See Lemma 12.1.) Use this result to establish the following.
    For $n \ge 0$, let S be a set with $|S| = 2^n$. Prove that the number of comparisons needed to place the elements of S in ascending order is bounded above by $n \cdot 2^n$.

21. During the execution of a certain program segment (written in pseudocode), the user assigns to the integer variables x and n any (possibly different) positive integers. The segment shown in Fig. 4.8 immediately follows these assignments. If the program reaches the top of the while loop, state and prove (by mathematical induction) what the value assigned to answer will be after the two loop instructions are executed n (> 0) times.

        while n ≠ 0 do
          begin
            x := x * n
            n := n − 1
          end
        answer := x

    Figure 4.8

22. In the program segment shown in Fig. 4.9, x, y, and answer are real variables, and n is an integer variable. Prior to execution of this while loop, the user supplies real values for x and y and a nonnegative integer value for n. Prove (by mathematical induction) that for all $x, y \in \mathbf{R}$, if the program reaches the top of the while loop with $n \in \mathbf{N}$, after the loop is bypassed (for n = 0) or the two loop instructions are executed n (> 0) times, then the value assigned to answer is x + ny.

        while n ≠ 0 do
          begin
            x := x + y
            n := n − 1
          end
        answer := x

    Figure 4.9

23. a) Let $n \in \mathbf{Z}^+$, where $n \ne 1, 3$. Prove that n can be expressed as a sum of 2’s and/or 5’s.
    b) For all $n \in \mathbf{Z}^+$ show that if $n \ge 24$, then n can be written as a sum of 5’s and/or 7’s.

24. A sequence of numbers $a_1, a_2, a_3, \ldots$ is defined by

        $a_1 = 1$,   $a_2 = 2$,   $a_n = a_{n-1} + a_{n-2}$, $n \ge 3$.

    a) Determine the values of $a_3$, $a_4$, $a_5$, $a_6$, and $a_7$.
    b) Prove that for all $n \ge 1$, $a_n < (7/4)^n$.

25. For a fixed $n \in \mathbf{Z}^+$, let X be the random variable where $Pr(X = x) = \frac{1}{n}$, x = 1, 2, 3, ..., n. (Here X is called a uniform discrete random variable.) Determine E(X) and Var(X).

26. Let $a_0$ be a fixed constant and, for $n \ge 1$, let $a_n = \sum_{i=0}^{n-1} \binom{n-1}{i} a_i a_{(n-1)-i}$.
    a) Show that $a_1 = a_0^2$ and that $a_2 = 2a_0^3$.
    b) Determine $a_3$ and $a_4$ in terms of $a_0$.

    c) Conjecture a formula for $a_n$ in terms of $a_0$ when $n \ge 0$. Prove your conjecture using the Principle of Mathematical Induction.

27. Verify Theorem 4.2.

28. a) Of the $2^{5-1} = 2^4 = 16$ compositions of 5, determine how many start with (i) 1; (ii) 2; (iii) 3; (iv) 4; and (v) 5.
    b) Provide a combinatorial proof for the result in part (a) of Exercise 2.

4.2   Recursive Definitions

Let us start this section by considering the integer sequence $b_0, b_1, b_2, b_3, \ldots$, where $b_n = 2n$ for all $n \in \mathbf{N}$. Here we find that $b_0 = 2 \cdot 0 = 0$, $b_1 = 2 \cdot 1 = 2$, $b_2 = 2 \cdot 2 = 4$, and $b_3 = 2 \cdot 3 = 6$. If, for instance, we need to determine $b_6$, we simply calculate $b_6 = 2 \cdot 6 = 12$, without the need to calculate the value of $b_n$ for any other $n \in \mathbf{N}$. We can perform such calculations because we have an explicit formula, namely $b_n = 2n$, that tells us how $b_n$ is determined from n (alone).
In Example 4.15 of the preceding section, however, we considered the integer sequence $a_0, a_1, a_2, a_3, \ldots$, where

    $a_0 = 1$, $a_1 = 2$, $a_2 = 3$, and
    $a_n = a_{n-1} + a_{n-2} + a_{n-3}$, for all $n \in \mathbf{Z}^+$ where $n \ge 3$.

Here we do not have an explicit formula that defines each $a_n$ in terms of n for all $n \in \mathbf{N}$. If we want the value of $a_6$, for example, we need to know the values of $a_5$, $a_4$, and $a_3$. And these values (of $a_5$, $a_4$, and $a_3$) require that we also know the values of $a_2$, $a_1$, and $a_0$. Unlike the rather easy situation where we determined $b_6 = 2 \cdot 6 = 12$, in order to calculate $a_6$, here we might find ourselves writing

    $a_6 = a_5 + a_4 + a_3$
        $= (a_4 + a_3 + a_2) + (a_3 + a_2 + a_1) + (a_2 + a_1 + a_0)$
        $= [(a_3 + a_2 + a_1) + (a_2 + a_1 + a_0) + a_2] + [(a_2 + a_1 + a_0) + a_2 + a_1] + (a_2 + a_1 + a_0)$
        $= [[(a_2 + a_1 + a_0) + a_2 + a_1] + (a_2 + a_1 + a_0) + a_2] + [(a_2 + a_1 + a_0) + a_2 + a_1] + (a_2 + a_1 + a_0)$
        $= [[(3 + 2 + 1) + 3 + 2] + (3 + 2 + 1) + 3] + [(3 + 2 + 1) + 3 + 2] + (3 + 2 + 1)$
        $= 37$.
Or, in a somewhat easier manner, we could have gone in the opposite direction with these considerations:

    $a_3 = a_2 + a_1 + a_0 = 3 + 2 + 1 = 6$
    $a_4 = a_3 + a_2 + a_1 = 6 + 3 + 2 = 11$
    $a_5 = a_4 + a_3 + a_2 = 11 + 6 + 3 = 20$
    $a_6 = a_5 + a_4 + a_3 = 20 + 11 + 6 = 37$.
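
The two ways of arriving at $a_6$ correspond to two familiar programming styles. The Python sketch below is an illustration only (both function names are our own): `a_top_down` unwinds the definition from n down to the base values, as in the long expansion, while `a_bottom_up` works upward from $a_0$, $a_1$, $a_2$, as in the shorter calculation.

```python
def a_top_down(n):
    """Unwind a_n = a_(n-1) + a_(n-2) + a_(n-3) down to a_0 = 1, a_1 = 2, a_2 = 3."""
    if n <= 2:
        return n + 1                      # base values: a_0 = 1, a_1 = 2, a_2 = 3
    return a_top_down(n - 1) + a_top_down(n - 2) + a_top_down(n - 3)


def a_bottom_up(n):
    """Build a_0, a_1, a_2, ... in order and stop at a_n."""
    x, y, z = 1, 2, 3                     # currently (a_0, a_1, a_2)
    for _ in range(n):
        x, y, z = y, z, x + y + z         # shift the window one place to the right
    return x


assert a_top_down(6) == a_bottom_up(6) == 37
```

The top-down version recomputes the same earlier terms many times, which is exactly the repetition visible in the bracketed expansion; the bottom-up version uses each earlier term once.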

No matter how we arrive at $a_6$, we realize that the two integer sequences, $b_0, b_1, b_2, b_3, \ldots$ and $a_0, a_1, a_2, a_3, \ldots$, are more than just numerically different. The integers $b_0, b_1, b_2, b_3, \ldots$ can be very readily listed as 0, 2, 4, 6, ..., and for any $n \in \mathbf{N}$ we have the explicit formula $b_n = 2n$. On the other hand, we might find it rather difficult (if not impossible) to determine such an explicit formula for the integers $a_0, a_1, a_2, a_3, \ldots$.

What is happening here for a sequence of integers can also occur for other mathematical concepts, such as sets and binary operations [as well as functions (in Chapter 5), languages (in Chapter 6), and relations (in Chapter 7)]. Sometimes it is difficult to define a mathematical concept in an explicit manner. But, as for the sequence $a_0, a_1, a_2, a_3, \ldots$, we may be able to define what we need in terms of similar prior results. (We shall examine what we mean by this in several examples in this section.) When we do so we say that the concept is defined recursively, using the method, or process, of recursion. In this way we obtain the concept we are interested in studying by means of a recursive definition. Hence, although we do not have an explicit formula here for the sequence $a_0, a_1, a_2, a_3, \ldots$, we do have a way of defining the integers $a_n$, for $n \in \mathbf{N}$, by recursion. The assignments

    $a_0 = 1$,   $a_1 = 2$,   $a_2 = 3$

provide a base for the recursion.
The equation

    $a_n = a_{n-1} + a_{n-2} + a_{n-3}$, for $n \in \mathbf{Z}^+$ where $n \ge 3$,        (*)

provides the recursive process; it indicates how to obtain new entries in the sequence from those prior results we already know (or can calculate). [Note: The integers computed from Eq. (*) may also be computed from the equation $a_{n+3} = a_{n+2} + a_{n+1} + a_n$, for $n \in \mathbf{N}$.]

We now use the concept of the recursive definition to settle something that was mentioned in three footnotes in Sections 2.1 and 2.3. After studying Section 2.2 we knew (from the laws of logic) that for any statements $p_1$, $p_2$, and $p_3$, we had

    $p_1 \land (p_2 \land p_3) \iff (p_1 \land p_2) \land p_3$,

and, consequently, we could write $p_1 \land p_2 \land p_3$ without any chance of ambiguity. This is because the truth value for the conjunction of three statements does not depend on the way parentheses might be introduced to direct the order of forming the conjunctions of pairs of (given or resultant) statements. But we were concerned about what meaning we should attach to an expression such as $p_1 \land p_2 \land p_3 \land p_4$. The following example now settles that issue.

EXAMPLE 4.16   The logical connective $\land$ was defined (in Section 2.1) for only two statements at a time. How, then, does one deal with an expression such as $p_1 \land p_2 \land p_3 \land p_4$, where $p_1$, $p_2$, $p_3$, and $p_4$ are statements? In order to answer this question we introduce the following recursive definition, wherein the concept at a certain [(n + 1)st] stage is developed from the comparable concept at an earlier [nth] stage.
Given any statements $p_1, p_2, \ldots, p_n, p_{n+1}$, we define

1) the conjunction of $p_1$, $p_2$ by $p_1 \land p_2$ (as we did in Section 2.1), and
2) the conjunction of $p_1, p_2, \ldots, p_n, p_{n+1}$, for $n \ge 2$, by

    $p_1 \land p_2 \land \cdots \land p_n \land p_{n+1} \iff (p_1 \land p_2 \land \cdots \land p_n) \land p_{n+1}$.

[The result in (1) establishes the base for the recursion, while the logical equivalence in (2) is used to provide the recursive process. Note that the statement on the right-hand side of the logical equivalence in (2) is the conjunction of two statements: $p_{n+1}$ and the previously determined statement $(p_1 \land p_2 \land \cdots \land p_n)$.]
Therefore, we define the conjunction of $p_1$, $p_2$, $p_3$, $p_4$ by

    $p_1 \land p_2 \land p_3 \land p_4 \iff (p_1 \land p_2 \land p_3) \land p_4$.

Then, by the associative law of $\land$, we find that

    $(p_1 \land p_2 \land p_3) \land p_4 \iff [(p_1 \land p_2) \land p_3] \land p_4$
        $\iff (p_1 \land p_2) \land (p_3 \land p_4)$
        $\iff p_1 \land [p_2 \land (p_3 \land p_4)]$
        $\iff p_1 \land [(p_2 \land p_3) \land p_4]$
        $\iff p_1 \land (p_2 \land p_3 \land p_4)$.

These logical equivalences show that the truth value for the conjunction of four statements is also independent of the way parentheses might be introduced to indicate how to associate the given statements.
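   Using the above definition, we now extend our results to the following “Generalized
Associative Law for $\land$.”
   Let $n \in \mathbf{Z}^+$ where $n \ge 3$, and let $r \in \mathbf{Z}^+$ with $1 \le r < n$. Then

S(n):  For any statements $p_1, p_2, \ldots, p_r, p_{r+1}, \ldots, p_n$,

\[ (p_1 \land p_2 \land \cdots \land p_r) \land (p_{r+1} \land \cdots \land p_n) \iff p_1 \land p_2 \land \cdots \land p_r \land p_{r+1} \land \cdots \land p_n. \]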
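Proof: The truth of the statement S(3) follows from the associative law for $\land$, and this
establishes the basis step for our inductive proof. For the inductive step we assume that
S(k) is true for some $k \ge 3$ and all $1 \le r < k$. That is, we assume the truth of

S(k):  \[ (p_1 \land p_2 \land \cdots \land p_r) \land (p_{r+1} \land \cdots \land p_k) \iff p_1 \land p_2 \land \cdots \land p_r \land p_{r+1} \land \cdots \land p_k. \]

Then we show that $S(k) \Rightarrow S(k+1)$. When we consider k + 1 statements, then we must
account for all $1 \le r < k + 1$.

   1) If r = k, then

\[ (p_1 \land p_2 \land \cdots \land p_k) \land p_{k+1} \iff p_1 \land p_2 \land \cdots \land p_k \land p_{k+1}, \]

      from our recursive definition.
   2) For $1 \le r < k$, we have

\begin{align*}
(p_1 \land p_2 \land \cdots \land p_r) \land (p_{r+1} \land \cdots \land p_k \land p_{k+1})
 &\iff (p_1 \land p_2 \land \cdots \land p_r) \land [(p_{r+1} \land \cdots \land p_k) \land p_{k+1}] \\
 &\iff [(p_1 \land p_2 \land \cdots \land p_r) \land (p_{r+1} \land \cdots \land p_k)] \land p_{k+1} \\
 &\iff (p_1 \land p_2 \land \cdots \land p_r \land p_{r+1} \land \cdots \land p_k) \land p_{k+1} \\
 &\iff p_1 \land p_2 \land \cdots \land p_r \land p_{r+1} \land \cdots \land p_k \land p_{k+1}.
\end{align*}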

So it follows by the Principle of Mathematical Induction (Theorem 4.1) that the open
statement S(n) is true for all $n \in \mathbf{Z}^+$ where $n \ge 3$.
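
The generalized associative law can also be checked by brute force for small n. The following is a minimal Python sketch, not part of the text: the names `conj` and `law_holds` are hypothetical, truth values stand in for statements, and the exhaustive check over truth assignments illustrates the law without replacing the inductive proof.

```python
from itertools import product

def conj(values):
    """Conjunction of two or more truth values, built as in the recursive definition."""
    if len(values) == 2:                      # base: p1 AND p2
        return values[0] and values[1]
    # recursive process: (p1 AND ... AND pn) AND p_{n+1}
    return conj(values[:-1]) and values[-1]

def law_holds(n):
    """Check S(n): every split point r with 1 <= r < n yields the same truth value."""
    for values in product([False, True], repeat=n):
        whole = conj(values)
        for r in range(1, n):
            # a block with a single statement is just that statement
            left = values[0] if r == 1 else conj(values[:r])
            right = values[r] if n - r == 1 else conj(values[r:])
            if (left and right) != whole:
                return False
    return True

print(all(law_holds(n) for n in range(3, 8)))
```

The check runs over all 2^n truth assignments, so it is only practical for small n.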

Our next example provides us with a second opportunity to generalize an associative
               law — but this time we shall deal with sets instead of statements.

EXAMPLE 4.17
In Definition 3.10 we extended the binary operations of $\cup$ and $\cap$ to an arbitrary (finite or
infinite) number of subsets from a given universe $\mathcal{U}$. However, these definitions do not rely
on the binary nature of the operations involved, and they do not provide a systematic way
of determining the union or intersection of any finite number of sets.
   To overcome this difficulty, we consider the sets $A_1, A_2, \ldots, A_n, A_{n+1}$, where $A_i \subseteq \mathcal{U}$
for all $1 \le i \le n + 1$, and we define their union recursively as follows:

   1) The union of $A_1$, $A_2$ is $A_1 \cup A_2$. (This is the base for our recursive definition.)
   2) The union of $A_1, A_2, \ldots, A_n, A_{n+1}$, for $n \ge 2$, is given by

\[ A_1 \cup A_2 \cup \cdots \cup A_n \cup A_{n+1} = (A_1 \cup A_2 \cup \cdots \cup A_n) \cup A_{n+1}, \]

      where the set on the right-hand side of the set equality is the union of two sets,
      namely, $A_1 \cup A_2 \cup \cdots \cup A_n$ and $A_{n+1}$. (Here we have the recursive process needed
      to complete our recursive definition.)
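   From this definition we obtain the following “Generalized Associative Law for $\cup$.” If
$n, r \in \mathbf{Z}^+$ with $n \ge 3$ and $1 \le r < n$, then

S(n):  \[ (A_1 \cup A_2 \cup \cdots \cup A_r) \cup (A_{r+1} \cup \cdots \cup A_n) = A_1 \cup A_2 \cup \cdots \cup A_r \cup A_{r+1} \cup \cdots \cup A_n, \]

where $A_i \subseteq \mathcal{U}$ for all $1 \le i \le n$.
Proof: The truth of S(n) for n = 3 follows from the associative law of $\cup$, thereby providing
the basis step needed for this inductive proof. Assuming the truth of S(k) for some $k \in \mathbf{Z}^+$,
where $k \ge 3$ and $1 \le r < k$, we shall now establish our inductive step by showing that
$S(k) \Rightarrow S(k+1)$. When dealing with k + 1 ($\ge 4$) sets we need to consider all $1 \le r < k + 1$.
We find that

   1) For r = k we have

\[ (A_1 \cup A_2 \cup \cdots \cup A_k) \cup A_{k+1} = A_1 \cup A_2 \cup \cdots \cup A_k \cup A_{k+1}. \]

      This follows from the given recursive definition.
   2) If $1 \le r < k$, then

\begin{align*}
(A_1 \cup A_2 \cup \cdots \cup A_r) \cup (A_{r+1} \cup \cdots \cup A_k \cup A_{k+1})
 &= (A_1 \cup A_2 \cup \cdots \cup A_r) \cup [(A_{r+1} \cup \cdots \cup A_k) \cup A_{k+1}] \\
 &= [(A_1 \cup A_2 \cup \cdots \cup A_r) \cup (A_{r+1} \cup \cdots \cup A_k)] \cup A_{k+1} \\
 &= (A_1 \cup A_2 \cup \cdots \cup A_r \cup A_{r+1} \cup \cdots \cup A_k) \cup A_{k+1} \\
 &= A_1 \cup A_2 \cup \cdots \cup A_r \cup A_{r+1} \cup \cdots \cup A_k \cup A_{k+1}.
\end{align*}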

So it follows by the Principle of Mathematical Induction that S(n) is true for all integers
$n \ge 3$.
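
As an illustration only, the recursive definition of a finite union translates directly into a short Python sketch. The function names `union_all` and `split_union` and the sample family of sets are assumptions made for this example, and the check covers a single family, so it illustrates the law rather than proving it.

```python
def union_all(sets):
    """Union of two or more sets, following the recursive definition."""
    if len(sets) == 2:                        # base: A1 U A2
        return sets[0] | sets[1]
    # recursive process: (A1 U ... U An) U A_{n+1}
    return union_all(sets[:-1]) | sets[-1]

def split_union(sets, r):
    """(A1 U ... U Ar) U (A_{r+1} U ... U An) for a split point 1 <= r < n."""
    left = sets[0] if r == 1 else union_all(sets[:r])
    right = sets[r] if len(sets) - r == 1 else union_all(sets[r:])
    return left | right

# a hypothetical family of five sets
family = [{1, 2}, {2, 3}, {5}, {1, 8}, set()]
print(all(split_union(family, r) == union_all(family)
          for r in range(1, len(family))))
```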

Similar to the result in Example 4.17, the intersection of the n + 1 sets $A_1, A_2, \ldots, A_n, A_{n+1}$
(each taken from the same universe $\mathcal{U}$) is defined recursively by:
   1) The intersection of $A_1$, $A_2$ is $A_1 \cap A_2$.
   2) For $n \ge 2$, the intersection of $A_1, A_2, \ldots, A_n, A_{n+1}$ is given by

\[ A_1 \cap A_2 \cap \cdots \cap A_n \cap A_{n+1} = (A_1 \cap A_2 \cap \cdots \cap A_n) \cap A_{n+1}, \]

      the intersection of the two sets $A_1 \cap A_2 \cap \cdots \cap A_n$ and $A_{n+1}$.

We find that the recursive definitions for the union and intersection of any finite number of
                             sets provide the means by which we can extend the DeMorgan Laws of Set Theory. We shall
                             establish (by using mathematical induction) one of these extensions in the next example
                             and request a proof of the other extension in the Section Exercises.

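EXAMPLE 4.18
Let $n \in \mathbf{Z}^+$ where $n \ge 2$, and let $A_1, A_2, \ldots, A_n \subseteq \mathcal{U}$ for each $1 \le i \le n$. Then

\[ \overline{A_1 \cap A_2 \cap \cdots \cap A_n} = \overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_n}. \]

Proof: The basis step of this proof is given for n = 2. It follows from the fact that
$\overline{A_1 \cap A_2} = \overline{A_1} \cup \overline{A_2}$, by the second of DeMorgan’s Laws (listed in the Laws of Set Theory
in Section 3.2).
   Assuming the truth of the result for some k, where $k \ge 2$, we have

\[ \overline{A_1 \cap A_2 \cap \cdots \cap A_k} = \overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_k}. \]

And when we consider k + 1 ($\ge 3$) sets, the induction hypothesis is used to obtain the third
set equality in the following:

\begin{align*}
\overline{A_1 \cap A_2 \cap \cdots \cap A_k \cap A_{k+1}}
 &= \overline{(A_1 \cap A_2 \cap \cdots \cap A_k) \cap A_{k+1}} \\
 &= \overline{(A_1 \cap A_2 \cap \cdots \cap A_k)} \cup \overline{A_{k+1}} \\
 &= (\overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_k}) \cup \overline{A_{k+1}} \\
 &= \overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_k} \cup \overline{A_{k+1}}.
\end{align*}

This then establishes the inductive step in our proof and so we obtain this generalized
DeMorgan Law for all $n \ge 2$ by the Principle of Mathematical Induction.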
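
A small numerical check of the generalized DeMorgan Law can be sketched in Python. The universe `UNIVERSE`, the helper names, and the three sample sets below are all hypothetical choices for illustration; the left fold performed by `reduce` mirrors the recursive definitions of finite intersection and union.

```python
from functools import reduce

UNIVERSE = frozenset(range(10))               # hypothetical universe for this example

def complement(a):
    """Complement of a relative to UNIVERSE."""
    return UNIVERSE - a

def intersect_all(sets):
    """Left fold: ((A1 n A2) n A3) n ... , as in the recursive definition."""
    return reduce(lambda acc, s: acc & s, sets)

def union_all(sets):
    """Left fold: ((A1 U A2) U A3) U ... , as in the recursive definition."""
    return reduce(lambda acc, s: acc | s, sets)

sets = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 5, 7})]
lhs = complement(intersect_all(sets))             # complement of the intersection
rhs = union_all([complement(s) for s in sets])    # union of the complements
print(lhs == rhs)
```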

Now that we have seen the two recursive definitions (in Examples 4.16 and 4.17), as
                              we continue to investigate situations where this type of definition arises, we shall generally
                              refrain from labeling the base and recursive parts. Likewise, we may not always designate
                              the basis and inductive steps in a proof by mathematical induction.

As we look back at Examples 4.16 and 4.17, the recursive definitions in these two
examples should seem similar to us. For if we interchange the statement $p_i$ with the set $A_i$,
for all $1 \le i \le n + 1$, and if we interchange each occurrence of $\land$ with $\cup$ and replace $\iff$
with =, then we can obtain the recursive definition in Example 4.17 from the one given in
Example 4.16.

In a similar way one can recursively define the sum and product of n real numbers,
where $n \in \mathbf{Z}^+$ and $n \ge 2$. Then we can obtain (by the Principle of Mathematical Induction)
generalized associative laws for the addition and multiplication of real numbers. (In the
Section Exercises the reader will be requested to do this.) We want to be aware of such
generalized associative laws because we have been using them and will continue to use them.
The reader may be surprised to learn that we have already used the generalized associative
law of addition. In each of Examples 4.1 and 4.4, for instance, the generalized associative
law of addition was used to establish the inductive step (in the proof by mathematical
induction). Furthermore, now that we are more aware of it, the generalized associative law
of addition can be used (usually, in an implicit manner) in recursive definitions, for now
there will be no chance for ambiguity if one wants to add four or more summands. For
example, we could define the sequence of harmonic numbers $H_1, H_2, H_3, \ldots$ by

   1) $H_1 = 1$; and
   2) For $n \ge 1$, $H_{n+1} = H_n + \left(\frac{1}{n+1}\right)$.

Turning from addition to multiplication, we may use the generalized associative law of
multiplication to provide a recursive definition of n!. In this case we write
   1) $0! = 1$; and
   2) For $n \ge 0$, $(n + 1)! = (n + 1)(n!)$.

(This was suggested in the paragraph following Definition 1.1 in Section 1.2.) Also, the
integer sequence $b_0, b_1, b_2, b_3, \ldots$, given explicitly (at the start of this section) by the
formula $b_n = 2n$, $n \in \mathbf{N}$, can now be defined recursively by

   1) $b_0 = 0$; and
   2) For $n \ge 0$, $b_{n+1} = b_n + 2$.
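
The three recursive definitions just given (harmonic numbers, factorials, and the sequence $b_n = 2n$) can be written out as a minimal Python sketch. The function names are hypothetical, and `Fraction` is used so that the harmonic numbers stay exact rational numbers.

```python
from fractions import Fraction

def harmonic(n):
    """H_1 = 1; H_{n+1} = H_n + 1/(n+1) for n >= 1."""
    if n == 1:
        return Fraction(1)
    return harmonic(n - 1) + Fraction(1, n)

def factorial(n):
    """0! = 1; (n+1)! = (n+1)(n!) for n >= 0."""
    if n == 0:
        return 1
    return n * factorial(n - 1)

def b(n):
    """b_0 = 0; b_{n+1} = b_n + 2 for n >= 0, which agrees with b_n = 2n."""
    if n == 0:
        return 0
    return b(n - 1) + 2

print(harmonic(4))    # 25/12
print(factorial(5))   # 120
print(b(7))           # 14
```

Each function has a base case and a recursive case, matching parts (1) and (2) of the corresponding definition.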

When we investigate the sequences in our next two examples, we shall once again find
recursive definitions. In addition we shall establish results where the generalized associative
law of addition will be used, although in an implicit manner.

EXAMPLE 4.19
In Section 4.1 we introduced the sequence of rational numbers called the harmonic numbers.
Now we introduce an integer sequence that is prominent in combinatorics and graph theory
(and that we shall study further in Chapters 10, 11, and 12). The Fibonacci numbers may
be defined recursively by
   1) $F_0 = 0$, $F_1 = 1$; and
   2) $F_n = F_{n-1} + F_{n-2}$, for $n \in \mathbf{Z}^+$ with $n \ge 2$.

Hence, from the recursive part of this definition, it follows that

   $F_2 = F_1 + F_0 = 1 + 0 = 1$        $F_4 = F_3 + F_2 = 2 + 1 = 3$
   $F_3 = F_2 + F_1 = 1 + 1 = 2$        $F_5 = F_4 + F_3 = 3 + 2 = 5$.

We also find that $F_6 = 8$, $F_7 = 13$, $F_8 = 21$, $F_9 = 34$, $F_{10} = 55$, $F_{11} = 89$, and $F_{12} = 144$.
   The recursive definition of the Fibonacci numbers can be used (in conjunction with the
Principle of Mathematical Induction) to establish many of the interesting properties that
these numbers exhibit. We investigate one of these properties now.

Let us consider the following five results that deal with sums of squares of the Fibonacci
numbers.

   1) $F_0^2 + F_1^2 = 0^2 + 1^2 = 1 = 1 \times 1$
   2) $F_0^2 + F_1^2 + F_2^2 = 0^2 + 1^2 + 1^2 = 2 = 1 \times 2$
   3) $F_0^2 + F_1^2 + F_2^2 + F_3^2 = 0^2 + 1^2 + 1^2 + 2^2 = 6 = 2 \times 3$
   4) $F_0^2 + F_1^2 + F_2^2 + F_3^2 + F_4^2 = 0^2 + 1^2 + 1^2 + 2^2 + 3^2 = 15 = 3 \times 5$
   5) $F_0^2 + F_1^2 + F_2^2 + F_3^2 + F_4^2 + F_5^2 = 0^2 + 1^2 + 1^2 + 2^2 + 3^2 + 5^2 = 40 = 5 \times 8$

From what is suggested in these calculations, we conjecture that

\[ \forall n \in \mathbf{Z}^+ \quad \sum_{i=0}^{n} F_i^2 = F_n \times F_{n+1}. \]

Proof: For n = 1, the result in Eq. (1), namely, $F_0^2 + F_1^2 = 1 \times 1$, shows us that the
conjecture is true in this first case.
   Assuming the truth of the conjecture for some $k \ge 1$, we obtain the induction hypothesis:

\[ \sum_{i=0}^{k} F_i^2 = F_k \times F_{k+1}. \]

Turning now to the case where n = k + 1 ($\ge 2$) we find that

\[ \sum_{i=0}^{k+1} F_i^2 = \sum_{i=0}^{k} F_i^2 + F_{k+1}^2 = (F_k \times F_{k+1}) + F_{k+1}^2 = F_{k+1} \times (F_k + F_{k+1}) = F_{k+1} \times F_{k+2}. \]

Hence the truth of the case for n = k + 1 follows from the case for n = k. So the given
conjecture is true for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical Induction. (The reader
may wish to note that the prior calculation uses the generalized associative law of addition.
Furthermore we employ the recursive definition of the Fibonacci numbers; it allows us to
replace $F_k + F_{k+1}$ by $F_{k+2}$.)
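
The identity for sums of squares of Fibonacci numbers is easy to test numerically. The sketch below is illustrative only (the helper names `fib` and `sum_of_squares` are hypothetical); it computes $F_n$ iteratively from the recursive definition and compares both sides for the first five cases.

```python
def fib(n):
    """F_0 = 0, F_1 = 1, F_n = F_{n-1} + F_{n-2}, computed iteratively."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def sum_of_squares(n):
    """F_0^2 + F_1^2 + ... + F_n^2."""
    return sum(fib(i) ** 2 for i in range(n + 1))

for n in range(1, 6):
    # columns: n, sum of squares, F_n * F_{n+1}
    print(n, sum_of_squares(n), fib(n) * fib(n + 1))
```

For n = 1, ..., 5 the last two columns are 1, 2, 6, 15, 40, matching results (1) through (5).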

EXAMPLE 4.20
Closely related to the Fibonacci numbers is the sequence known as the Lucas numbers. This
sequence is defined recursively by
   1) $L_0 = 2$, $L_1 = 1$; and
   2) $L_n = L_{n-1} + L_{n-2}$, for $n \in \mathbf{Z}^+$ with $n \ge 2$.
The first eight Lucas numbers are given in Table 4.2.

Table 4.2

   n     |  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7
   L_n   |  2 |  1 |  3 |  4 |  7 | 11 | 18 | 29

Although they are not as prominent as the Fibonacci numbers, the Lucas numbers also
possess many interesting properties. One of the interrelations between the Fibonacci and
Lucas numbers is illustrated in the fact that

\[ \forall n \in \mathbf{Z}^+ \quad L_n = F_{n-1} + F_{n+1}. \]

Proof: Here we need to consider what happens when n = 1 and n = 2. We find that

\[ L_1 = 1 = 0 + 1 = F_0 + F_2 = F_{1-1} + F_{1+1}, \quad \text{and} \]
\[ L_2 = 3 = 1 + 2 = F_1 + F_3 = F_{2-1} + F_{2+1}, \]

so the result is true in these first two cases.

Next we assume that $L_n = F_{n-1} + F_{n+1}$ for the integers $n = 1, 2, 3, \ldots, k - 1, k$,
where $k \ge 2$, and then we consider the Lucas number $L_{k+1}$. It turns out that

\begin{align*}
L_{k+1} = L_k + L_{k-1} &= (F_{k-1} + F_{k+1}) + (F_{k-2} + F_k) \tag{*} \\
 &= (F_{k-1} + F_{k-2}) + (F_{k+1} + F_k) = F_k + F_{k+2} = F_{(k+1)-1} + F_{(k+1)+1}.
\end{align*}

Therefore, it follows from the alternative form of the Principle of Mathematical Induction
that $L_n = F_{n-1} + F_{n+1}$ for all $n \in \mathbf{Z}^+$. [The reader should observe how we used the recursive
definitions for both the Fibonacci numbers and the Lucas numbers in the calculations at
(*).]
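
The relation between the Lucas and Fibonacci numbers can be spot-checked with a short Python sketch. The helper `sequence` is a hypothetical name introduced here; it builds any sequence that satisfies the shared recurrence (each term is the sum of the two before it) from two given initial terms.

```python
def sequence(first, second, count):
    """First `count` terms of a sequence where each term is the sum of the previous two."""
    terms = [first, second]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

fib = sequence(0, 1, 32)      # F_0, F_1, ..., F_31
lucas = sequence(2, 1, 31)    # L_0, L_1, ..., L_30

print(lucas[:8])              # the eight entries of Table 4.2
print(all(lucas[n] == fib[n - 1] + fib[n + 1] for n in range(1, 31)))
```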

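EXAMPLE 4.21
In Section 1.3 we introduced the binomial coefficients $\binom{n}{r}$ for $n, r \in \mathbf{N}$, where $n \ge r \ge 0$.
Corollary 1.1 in that section revealed that $\sum_{r=0}^{n} \binom{n}{r} = \sum_{r=0}^{n} C(n, r) = 2^n$, the total
number of subsets for a set of size n. With the help of the result in Example 3.12 we can
now define these binomial coefficients recursively by

\[ \binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}, \quad 0 \le r \le n + 1, \]

\[ \binom{0}{0} = 1, \qquad \binom{n}{r} = 0, \quad r > n, \qquad \binom{n}{r} = 0, \quad r < 0. \]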
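
A recursive definition of the binomial coefficients based on Pascal's identity, $\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$, can be sketched in Python as follows. This is an illustration under stated assumptions: the boundary values used are $\binom{0}{0} = 1$ and $\binom{n}{r} = 0$ whenever $r < 0$ or $r > n$, the function name `binom` is hypothetical, and the results are compared against the standard library's `math.comb`.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def binom(n, r):
    """Binomial coefficient from Pascal's identity with zero boundary values."""
    if r < 0 or r > n:        # assumed boundary values: outside the triangle
        return 0
    if n == 0:                # only r == 0 reaches here: C(0, 0) = 1
        return 1
    return binom(n - 1, r) + binom(n - 1, r - 1)

print([binom(5, r) for r in range(6)])
print(all(binom(n, r) == comb(n, r) for n in range(12) for r in range(n + 1)))
print(all(sum(binom(n, r) for r in range(n + 1)) == 2 ** n for n in range(12)))
```

The last line checks the row sums against $2^n$, the count of subsets of a set of size n.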
At this time we present a second set of numbers, each of which is also dependent on two
integers. For $m, k \in \mathbf{N}$, the Eulerian numbers $a_{m,k}$ are defined recursively by

\[ a_{m,k} = (m - k)\,a_{m-1,k-1} + (k + 1)\,a_{m-1,k}, \quad 0 \le k \le m - 1, \tag{*} \]

\[ a_{0,0} = 1, \qquad a_{m,k} = 0, \quad k \ge m, \qquad a_{m,k} = 0, \quad k < 0. \]

(In Exercise 18 of the Section Exercises we shall examine a situation that shows how this
recursive definition may arise.) The values for $a_{m,k}$, where $1 \le m \le 5$ and $0 \le k \le m - 1$,
are given as follows:

             k = 0     1     2     3     4        Row Sum
   (m = 1)       1                                  1 = 1!
   (m = 2)       1     1                            2 = 2!
   (m = 3)       1     4     1                      6 = 3!
   (m = 4)       1    11    11     1               24 = 4!
   (m = 5)       1    26    66    26     1        120 = 5!

These results suggest that for a fixed $m \in \mathbf{Z}^+$, $\sum_{k=0}^{m-1} a_{m,k} = m!$, the number of permutations
of m objects taken m at a time. We see that the result is true for $1 \le m \le 5$. Assuming the
result true for some fixed m ($\ge 1$), upon using the recursive definition at (*), we find that

\begin{align*}
\sum_{k=0}^{m} a_{m+1,k} &= \sum_{k=0}^{m} [(m + 1 - k)\,a_{m,k-1} + (k + 1)\,a_{m,k}] \\
 &= [(m + 1)a_{m,-1} + a_{m,0}] + [m\,a_{m,0} + 2a_{m,1}] + [(m - 1)a_{m,1} + 3a_{m,2}] + \cdots \\
 &\quad + [3a_{m,m-3} + (m - 1)a_{m,m-2}] + [2a_{m,m-2} + m\,a_{m,m-1}] \\
 &\quad + [a_{m,m-1} + (m + 1)a_{m,m}].
\end{align*}

Since $a_{m,-1} = 0 = a_{m,m}$ we can write

\begin{align*}
\sum_{k=0}^{m} a_{m+1,k} &= [a_{m,0} + m\,a_{m,0}] + [2a_{m,1} + (m - 1)a_{m,1}] + \cdots \\
 &\quad + [(m - 1)a_{m,m-2} + 2a_{m,m-2}] + [m\,a_{m,m-1} + a_{m,m-1}] \\
 &= (m + 1) \sum_{k=0}^{m-1} a_{m,k} = (m + 1)\,m! = (m + 1)!.
\end{align*}

Consequently, the result is true for all $m \ge 1$, by the Principle of Mathematical Induction.
(We’ll see the Eulerian numbers again in Section 9.2.)
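
The recursive definition at (*) can be run directly to reproduce the rows of Eulerian numbers and their row sums. The following Python sketch is illustrative; the function name `eulerian` is hypothetical, and it assumes the boundary values $a_{0,0} = 1$ (tested first) and $a_{m,k} = 0$ for $k < 0$ or $k \ge m$.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def eulerian(m, k):
    """a_{m,k} = (m - k) a_{m-1,k-1} + (k + 1) a_{m-1,k} for 0 <= k <= m - 1."""
    if m == 0 and k == 0:     # assumed base value a_{0,0} = 1, checked first
        return 1
    if k < 0 or k >= m:       # assumed boundary values
        return 0
    return (m - k) * eulerian(m - 1, k - 1) + (k + 1) * eulerian(m - 1, k)

for m in range(1, 6):
    row = [eulerian(m, k) for k in range(m)]
    # columns: m, the row a_{m,0..m-1}, and whether the row sum equals m!
    print(m, row, sum(row) == factorial(m))
```

Memoization with `lru_cache` keeps the doubly recursive calls from being recomputed.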

In closing this section we shall introduce the idea of a recursively defined set X. Here we
start with an initial collection of elements that are in X, and this provides the base of the
recursion. Then we provide a rule or list of rules that tell us how to find new elements in
X from other elements already known to be in X. This rule (or list of rules) constitutes the
recursive process. But now (and this part is new) we are also given an implicit restriction,
that is, a statement to the effect that no element can be found in the set X except for those
that were given in the initial collection or those that were formed using the prescribed rule(s)
provided in the recursive process.
                                 We demonstrate the ideas given here in the following example.

EXAMPLE 4.22
Define the set X recursively by
   1) $1 \in X$; and
   2) For each $a \in X$, $a + 2 \in X$.

Then we claim that X consists (precisely) of all positive odd integers.
Proof: If we let Y denote the set of all positive odd integers, that is, $Y = \{2n + 1 \mid n \in \mathbf{N}\}$,
then we want to show that Y = X. This means, as we learned in Section 3.1, that we must
verify both $Y \subseteq X$ and $X \subseteq Y$.
   In order to establish that $Y \subseteq X$, we must prove that every positive odd integer is in X.
This will be accomplished through the Principle of Mathematical Induction. We start by
considering the open statement

S(n):  $2n + 1 \in X$,

which is defined for the universe $\mathbf{N}$. The basis step, that is, S(0), is true here because
$1 = 2(0) + 1 \in X$ by part (1) of the recursive definition of X. For the inductive step we
assume the truth of S(k) for some $k \ge 0$; this tells us 2k + 1 is an element in X. With
$2k + 1 \in X$ it then follows by part (2) of the recursive definition of X that $(2k + 1) + 2 =
(2k + 2) + 1 = 2(k + 1) + 1 \in X$, so S(k + 1) is also true. Consequently, S(n) is true (by
the Principle of Mathematical Induction) for all $n \in \mathbf{N}$ and we have $Y \subseteq X$.
   For the proof of the opposite inclusion (namely, $X \subseteq Y$) we use the recursive definition
of X. First we consider part (1) of the definition. Since 1 (= 2 · 0 + 1) is a positive
odd integer, we have $1 \in Y$. To complete the proof, we must verify that any integer in X
that results from part (2) of the recursive definition is also in Y. This is done by showing
that $a + 2 \in Y$ whenever the element a in X is also an element in Y. For if $a \in Y$,
then a = 2r + 1, where $r \in \mathbf{N}$, this by the definition of a positive odd integer. Thus
$a + 2 = (2r + 1) + 2 = (2r + 2) + 1 = 2(r + 1) + 1$, where $r + 1 \in \mathbf{N}$ (actually, $\mathbf{Z}^+$),
and so a + 2 is a positive odd integer. This places a + 2 in Y and now shows that $X \subseteq Y$.
   From the preceding two inclusions, that is, $Y \subseteq X$ and $X \subseteq Y$, it follows that X = Y.
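
A recursively defined set can be explored by starting from the base elements and applying the rule until nothing new appears. Since X here is infinite, the minimal Python sketch below collects only the elements up to a hypothetical bound `limit`; the function name `generate` is likewise an assumption made for this illustration.

```python
def generate(limit):
    """Elements of X not exceeding `limit`, where 1 is in X and a in X implies a + 2 in X."""
    found = {1}               # base of the recursion: 1 is in X
    frontier = [1]            # elements whose successors have not been produced yet
    while frontier:
        a = frontier.pop()
        successor = a + 2     # recursive process: a in X implies a + 2 in X
        if successor <= limit and successor not in found:
            found.add(successor)
            frontier.append(successor)
    return found              # implicit restriction: nothing else is ever added

odd_up_to_15 = {n for n in range(1, 16) if n % 2 == 1}
print(generate(15) == odd_up_to_15)
```

Returning only what the rules produced corresponds to the implicit restriction in the definition.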

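1. The integer sequence $a_1, a_2, a_3, \ldots$, defined explicitly by the formula $a_n = 5n$ for $n \in \mathbf{Z}^+$,
can also be defined recursively by
        1) $a_1 = 5$; and
        2) $a_{n+1} = a_n + 5$, for $n \ge 1$.
For the integer sequence $b_1, b_2, b_3, \ldots$, where $b_n = n(n + 2)$ for all $n \in \mathbf{Z}^+$, we can also
provide the recursive definition:
        1) $b_1 = 3$; and
        2) $b_{n+1} = b_n + 2n + 3$, for $n \ge 1$.
Give a recursive definition for each of the following integer sequences $c_1, c_2, c_3, \ldots$, where for
all $n \in \mathbf{Z}^+$ we have
        a) $c_n = 7n$                 b) $c_n = 7^n$
        c) $c_n = 3n + 7$             d) $c_n = 7$
        e) $c_n = n^2$                f) $c_n = 2 - (-1)^n$

2. a) Give a recursive definition for the disjunction of statements $p_1, p_2, \ldots, p_n, p_{n+1}$, $n \ge 1$.
        b) Show that if $n, r \in \mathbf{Z}^+$, with $n \ge 3$ and $1 \le r < n$, then
        \[ (p_1 \lor p_2 \lor \cdots \lor p_r) \lor (p_{r+1} \lor \cdots \lor p_n) \iff p_1 \lor p_2 \lor \cdots \lor p_r \lor p_{r+1} \lor \cdots \lor p_n. \]

3. Use the result of Example 4.16 to prove that if $p, q_1, q_2, \ldots, q_n$ are statements and $n \ge 2$, then
        \[ p \lor (q_1 \land q_2 \land \cdots \land q_n) \iff (p \lor q_1) \land (p \lor q_2) \land \cdots \land (p \lor q_n). \]

4. For $n \in \mathbf{Z}^+$, $n \ge 2$, prove that for any statements $p_1, p_2, \ldots, p_n$,
        a) $\lnot(p_1 \lor p_2 \lor \cdots \lor p_n) \iff \lnot p_1 \land \lnot p_2 \land \cdots \land \lnot p_n$.
        b) $\lnot(p_1 \land p_2 \land \cdots \land p_n) \iff \lnot p_1 \lor \lnot p_2 \lor \cdots \lor \lnot p_n$.

5. a) Give a recursive definition for the intersection of the sets $A_1, A_2, \ldots, A_n, A_{n+1} \subseteq \mathcal{U}$, $n \ge 1$.
        b) Use the result in part (a) to show that for all $n, r \in \mathbf{Z}^+$ with $n \ge 3$ and $1 \le r < n$,
        \[ (A_1 \cap A_2 \cap \cdots \cap A_r) \cap (A_{r+1} \cap \cdots \cap A_n) = A_1 \cap A_2 \cap \cdots \cap A_r \cap A_{r+1} \cap \cdots \cap A_n. \]

6. For $n \ge 2$ and any sets $A_1, A_2, \ldots, A_n \subseteq \mathcal{U}$, prove that
        \[ \overline{A_1 \cup A_2 \cup \cdots \cup A_n} = \overline{A_1} \cap \overline{A_2} \cap \cdots \cap \overline{A_n}. \]

7. Use the result of Example 4.17 to show that if sets $A, B_1, B_2, \ldots, B_n \subseteq \mathcal{U}$ and $n \ge 2$, then
        \[ A \cap (B_1 \cup B_2 \cup \cdots \cup B_n) = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_n). \]

8. a) Develop a recursive definition for the addition of n real numbers $x_1, x_2, \ldots, x_n$, where $n \ge 2$.
        b) For all real numbers $x_1$, $x_2$, and $x_3$, the associative law of addition states that
        $x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3$. Prove that if $n, r \in \mathbf{Z}^+$, where $n \ge 3$ and $1 \le r < n$, then
        \[ (x_1 + x_2 + \cdots + x_r) + (x_{r+1} + \cdots + x_n) = x_1 + x_2 + \cdots + x_r + x_{r+1} + \cdots + x_n. \]

9. a) Develop a recursive definition for the multiplication of n real numbers $x_1, x_2, \ldots, x_n$, where $n \ge 2$.
        b) For all real numbers $x_1$, $x_2$, and $x_3$, the associative law of multiplication states that
        $x_1(x_2 x_3) = (x_1 x_2)x_3$. Prove that if $n, r \in \mathbf{Z}^+$, where $n \ge 3$ and $1 \le r < n$, then
        \[ (x_1 x_2 \cdots x_r)(x_{r+1} \cdots x_n) = x_1 x_2 \cdots x_r x_{r+1} \cdots x_n. \]

10. For all $x \in \mathbf{R}$,
        \[ |x| = \sqrt{x^2} = \begin{cases} x, & \text{if } x \ge 0 \\ -x, & \text{if } x < 0 \end{cases} \quad \text{and} \quad -|x| \le x \le |x|. \]
Consequently, $|x + y|^2 = (x + y)^2 = x^2 + 2xy + y^2 \le x^2 + 2|x||y| + y^2 = |x|^2 + 2|x||y| + |y|^2 =
(|x| + |y|)^2$, and $|x + y|^2 \le (|x| + |y|)^2 \Rightarrow |x + y| \le |x| + |y|$, for all $x, y \in \mathbf{R}$.
        Prove that if $n \in \mathbf{Z}^+$, $n \ge 2$, and $x_1, x_2, \ldots, x_n \in \mathbf{R}$, then
        \[ |x_1 + x_2 + \cdots + x_n| \le |x_1| + |x_2| + \cdots + |x_n|. \]

11. Define the integer sequence $a_0, a_1, a_2, a_3, \ldots$, recursively by
        1) $a_0 = 1$, $a_1 = 1$, $a_2 = 1$; and
        2) For $n \ge 3$, $a_n = a_{n-1} + a_{n-3}$.
Prove that $a_{n+2} \ge (\sqrt{2})^n$ for all $n \ge 0$.

12. For $n \ge 0$ let $F_n$ denote the nth Fibonacci number. Prove that
        \[ F_0 + F_1 + F_2 + \cdots + F_n = \sum_{i=0}^{n} F_i = F_{n+2} - 1. \]

13. Prove that for any positive integer n,
        \[ \sum_{i=1}^{n} \frac{F_{i-1}}{2^i} = 1 - \frac{F_{n+2}}{2^n}. \]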

14. As in Example 4.20 let $L_0, L_1, L_2, \ldots$ denote the Lucas numbers, where (1) $L_0 = 2$, $L_1 = 1$; and
(2) $L_{n+2} = L_{n+1} + L_n$, for $n \ge 0$. When $n \ge 1$, prove that
          $$L_1^2 + L_2^2 + L_3^2 + \cdots + L_n^2 = L_n L_{n+1} - 2.$$

15. If $n \in \mathbf{N}$, prove that $5F_{n+2} = L_{n+4} - L_n$.

16. Give a recursive definition for the set of all
          a) positive even integers
          b) nonnegative even integers

17. One of the most common uses for the recursive definition of sets is to define the well-formed
formulae in various mathematical systems. For example, in the study of logic we can define the
well-formed formulae as follows:
          1) Each primitive statement $p$, the tautology $T_0$, and the contradiction $F_0$ are
             well-formed formulae; and
          2) If $p$, $q$ are well-formed formulae, then so are
                 i) $(\neg p)$     ii) $(p \lor q)$     iii) $(p \land q)$     iv) $(p \to q)$     v) $(p \leftrightarrow q)$
Using this recursive definition, we find that for the primitive statements $p$, $q$, $r$, the compound
statement $((p \land (\neg q)) \to (r \lor T_0))$ is a well-formed formula. We can derive this well-formed
formula as follows:

Steps                                              Reasons
1) $p, q, r, T_0$                                  Part (1) of the definition
2) $(\neg q)$                                      Step (1) and part (2i) of the definition
3) $(p \land (\neg q))$                            Steps (1) and (2) and part (2iii) of the definition
4) $(r \lor T_0)$                                  Step (1) and part (2ii) of the definition
5) $((p \land (\neg q)) \to (r \lor T_0))$         Steps (3) and (4) and part (2iv) of the definition

For the primitive statements $p$, $q$, $r$, and $s$, provide derivations showing that each of the
following is a well-formed formula.
          a) $((p \lor q) \to (T_0 \land (\neg r)))$
          b) $(((\neg p) \leftrightarrow q) \to (r \land (s \lor F_0)))$

18. Consider the permutations of 1, 2, 3, 4. The permutation 1432, for instance, is said to have one
ascent, namely 14 (since $1 < 4$). This same permutation also has two descents, namely 43 (since
$4 > 3$) and 32 (since $3 > 2$). The permutation 1423, on the other hand, has two ascents, at 14 and
23, and the one descent 42.
          a) How many permutations of 1, 2, 3 have $k$ ascents, for $k = 0, 1, 2$?
          b) How many permutations of 1, 2, 3, 4 have $k$ ascents, for $k = 0, 1, 2, 3$?
          c) If a permutation of 1, 2, 3, 4, 5, 6, 7 has four ascents, how many descents does it have?
          d) Suppose a permutation of $1, 2, 3, \ldots, m$ has $k$ ascents, for $0 \le k \le m - 1$. How many
          descents does the permutation have?
          e) Consider the permutation $p = 12436587$. This permutation of $1, 2, 3, \ldots, 8$ has four
          ascents. In how many of the nine locations (at the start, end, or between two numbers) in $p$
          can we place 9 so that the result is a permutation of $1, 2, 3, \ldots, 8, 9$ with (i) four
          ascents; (ii) five ascents?
          f) Let $\pi_{m,k}$ denote the number of permutations of $1, 2, 3, \ldots, m$ with $k$ ascents. Note
          how $\pi_{4,2} = 11 = 2(4) + 3(1) = (4 - 2)\pi_{3,1} + (2 + 1)\pi_{3,2}$. How is $\pi_{m,k}$ related to
          $\pi_{m-1,k-1}$ and $\pi_{m-1,k}$?

19. a) For $k \in \mathbf{Z}^+$ verify that $k^2 = \binom{k}{2} + \binom{k+1}{2}$.
          b) Fix $n$ in $\mathbf{Z}^+$. Since the result in part (a) is true for all $k = 1, 2, 3, \ldots, n$,
          summing the $n$ equations
          $$1^2 = \binom{1}{2} + \binom{2}{2}, \quad 2^2 = \binom{2}{2} + \binom{3}{2}, \quad \ldots, \quad n^2 = \binom{n}{2} + \binom{n+1}{2}$$
          we have
          $$\sum_{k=1}^{n} k^2 = \sum_{k=1}^{n} \binom{k}{2} + \sum_{k=1}^{n} \binom{k+1}{2} = \binom{n+1}{3} + \binom{n+2}{3}.$$
          [The last equality follows from Exercise 26 for Section 3.1 because
          $\sum_{k=1}^{n} \binom{k}{2} = \binom{1}{2} + \binom{2}{2} + \binom{3}{2} + \cdots + \binom{n}{2} = 0 + \binom{2}{2} + \binom{3}{2} + \cdots + \binom{n}{2} = \binom{n+1}{3}$ and
          $\sum_{k=1}^{n} \binom{k+1}{2} = \binom{2}{2} + \binom{3}{2} + \binom{4}{2} + \cdots + \binom{n+1}{2} = \binom{n+2}{3}$.] Show that
          $$\binom{n+1}{3} + \binom{n+2}{3} = \frac{n(n+1)(2n+1)}{6}.$$
          c) For $k \in \mathbf{Z}^+$ verify that $k^3 = \binom{k}{3} + 4\binom{k+1}{3} + \binom{k+2}{3}$.
          d) Use part (c) and the results from Exercise 26 for Section 3.1 to show that
          $$\sum_{k=1}^{n} k^3 = \binom{n+1}{4} + 4\binom{n+2}{4} + \binom{n+3}{4} = \frac{n^2(n+1)^2}{4}.$$
          e) Find $a, b, c, d \in \mathbf{Z}^+$ so that for any $k \in \mathbf{Z}^+$,
          $k^4 = a\binom{k}{4} + b\binom{k+1}{4} + c\binom{k+2}{4} + d\binom{k+3}{4}$.

20. a) For $n \ge 2$, if $p_1, p_2, p_3, \ldots, p_n, p_{n+1}$ are statements, prove that
          $$[(p_1 \to p_2) \land (p_2 \to p_3) \land \cdots \land (p_n \to p_{n+1})] \Rightarrow [(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to p_{n+1}].$$
          b) Prove that Theorem 4.2 implies Theorem 4.1.
          c) Use Theorem 4.1 to establish the following: If $\emptyset \ne S \subseteq \mathbf{Z}^+$, so that $n \in S$
          for some $n \in \mathbf{Z}^+$, then $S$ contains a least element.
          d) Show that Theorem 4.1 implies Theorem 4.2.
                                                                      4.3 The Division Algorithm: Prime Numbers      221

4.3
The Division Algorithm: Prime Numbers
Although the set $\mathbf{Z}$ is not closed under nonzero division, in many instances one integer
(exactly) divides another. For example, 2 divides 6 and 7 divides 21. Here the division is
exact and there is no remainder. Thus 2 dividing 6 implies the existence of a quotient,
namely 3, such that $6 = 2 \cdot 3$. We formalize this idea as follows.

Definition 4.1     If $a, b \in \mathbf{Z}$ and $b \ne 0$, we say that $b$ divides $a$, and we write $b \mid a$, if there is an
integer $n$ such that $a = bn$. When this occurs we say that $b$ is a divisor of $a$, or $a$ is a multiple of $b$.

With this definition we are able to speak of division inside of $\mathbf{Z}$ without going to $\mathbf{Q}$.
Furthermore, when $ab = 0$ for $a, b \in \mathbf{Z}$, then either $a = 0$ or $b = 0$, and we say that $\mathbf{Z}$ has
no proper divisors of 0. This property enables us to cancel as in $2x = 2y \Rightarrow x = y$, for
$x, y \in \mathbf{Z}$, because $2x = 2y \Rightarrow 2(x - y) = 0 \Rightarrow 2 = 0$ or $x - y = 0 \Rightarrow x = y$. (Note that at
no time did we mention multiplying both sides of the equation $2x = 2y$ by $\frac{1}{2}$. The number
$\frac{1}{2}$ is outside the system $\mathbf{Z}$.)
We now summarize some properties of this division operation. Whenever we divide by
an integer $a$, we assume that $a \ne 0$.

THEOREM 4.3           For all $a, b, c \in \mathbf{Z}$:
                           a) $1 \mid a$ and $a \mid 0$.
                           b) $[(a \mid b) \land (b \mid a)] \Rightarrow a = \pm b$.
                           c) $[(a \mid b) \land (b \mid c)] \Rightarrow a \mid c$.
                           d) $a \mid b \Rightarrow a \mid bx$ for all $x \in \mathbf{Z}$.
                           e) If $x = y + z$, for some $x, y, z \in \mathbf{Z}$, and $a$ divides two of the three integers
                              $x$, $y$, and $z$, then $a$ divides the remaining integer.
                           f) $[(a \mid b) \land (a \mid c)] \Rightarrow a \mid (bx + cy)$, for all $x, y \in \mathbf{Z}$. (The expression
                              $bx + cy$ is called a linear combination of $b$, $c$.)
                           g) For $1 \le i \le n$, let $c_i \in \mathbf{Z}$. If $a$ divides each $c_i$, then
                              $a \mid (c_1 x_1 + c_2 x_2 + \cdots + c_n x_n)$, where $x_i \in \mathbf{Z}$ for all $1 \le i \le n$.
                      Proof: We prove part (f) and leave the remaining parts for the reader.
                            If $a \mid b$ and $a \mid c$, then $b = am$ and $c = an$, for some $m, n \in \mathbf{Z}$. So $bx + cy = (am)x +
                      (an)y = a(mx + ny)$, by the Associative Law of Multiplication and the Distributive Law
                      of Multiplication over Addition (the elements in $\mathbf{Z}$ satisfy both of these laws). Since
                      $bx + cy = a(mx + ny)$, with $mx + ny \in \mathbf{Z}$, it follows that $a \mid (bx + cy)$.

We find part (g) of the theorem useful when we consider the following question.

EXAMPLE 4.23
Do there exist integers $x$, $y$, $z$ (positive, negative, or zero) so that $6x + 9y + 15z = 107$?
Suppose that such integers did exist. Then since $3 \mid 6$, $3 \mid 9$, and $3 \mid 15$, it would follow from
part (g) of Theorem 4.3 that 3 is a divisor of $6x + 9y + 15z$ and, consequently, 3 is a divisor
of 107, but this is not so. Hence there do not exist such integers $x$, $y$, $z$.
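
The argument of Example 4.23 can be checked mechanically. The following is a minimal Python sketch, not part of the text: the function name `blocking_divisor` and the search window of -20 to 20 are assumptions chosen for illustration. It looks for a common divisor of the coefficients that fails to divide the right-hand side, and then confirms by brute force that no solution exists in the window.

```python
from functools import reduce
from itertools import product
from math import gcd


def blocking_divisor(coefficients, target):
    """Return a common divisor d of all coefficients with d not dividing target.

    By Theorem 4.3(g), such a d rules out any integer solution of
    c_1*x_1 + ... + c_n*x_n = target. Returns None if the gcd divides target.
    """
    d = reduce(gcd, coefficients)
    return d if target % d != 0 else None


# 3 divides 6, 9, and 15 but not 107, so 6x + 9y + 15z = 107 has no solution.
obstruction = blocking_divisor([6, 9, 15], 107)

# Brute-force confirmation over a small (hypothetical) search window.
window = range(-20, 21)
found = any(6 * x + 9 * y + 15 * z == 107 for x, y, z in product(window, repeat=3))
print(obstruction, found)
```

The brute-force search only samples finitely many triples; it is the divisibility argument that settles the question for all integers.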

Several parts of Theorem 4.3 help us in the following example.

EXAMPLE 4.24
Let $a, b \in \mathbf{Z}$ so that $2a + 3b$ is a multiple of 17. (For example, we could have $a = 7$ and
$b = 1$; $a = 4$, $b = 3$ also works.) Prove that 17 divides $9a + 5b$.
Proof: We observe that $17 \mid (2a + 3b) \Rightarrow 17 \mid (-4)(2a + 3b)$, by part (d) of Theorem 4.3.
Also, since $17 \mid 17$, it follows from part (f) of the theorem that $17 \mid (17a + 17b)$. Hence,
$17 \mid [(17a + 17b) + (-4)(2a + 3b)]$, by part (e) of the theorem. Consequently, as
$[(17a + 17b) + (-4)(2a + 3b)] = [(17 - 8)a + (17 - 12)b] = 9a + 5b$, we have $17 \mid (9a + 5b)$.
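
As a sanity check on Example 4.24, here is a short Python sketch (the function name `check_example` and the range bound of 50 are illustrative assumptions). For every pair in the range with $17 \mid (2a + 3b)$, it confirms the identity used in the proof and the conclusion $17 \mid (9a + 5b)$.

```python
def check_example(limit=50):
    """Verify 17 | (9a + 5b) for all |a|, |b| <= limit with 17 | (2a + 3b)."""
    for a in range(-limit, limit + 1):
        for b in range(-limit, limit + 1):
            if (2 * a + 3 * b) % 17 != 0:
                continue
            # Identity from the proof: 9a + 5b = (17a + 17b) + (-4)(2a + 3b).
            if 9 * a + 5 * b != (17 * a + 17 * b) - 4 * (2 * a + 3 * b):
                return False
            if (9 * a + 5 * b) % 17 != 0:
                return False
    return True


print(check_example())
```

A finite check of this kind does not replace the proof; it only guards against slips in the arithmetic.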

Using this binary operation of integer division we find ourselves in the area of mathematics
called number theory, which examines the properties of integers and other sets of
numbers. Once considered an area of strictly pure (abstract) mathematics, number theory is
now an essential applicable tool, especially in dealing with computer and Internet security.
But for now, as we continue to examine the set $\mathbf{Z}^+$ further, we notice that for all $n \in \mathbf{Z}^+$
where $n > 1$, the integer $n$ has at least two positive divisors, namely, 1 and $n$ itself. Some
integers, such as 2, 3, 5, 7, 11, 13, and 17, have exactly two positive divisors. These integers
are called primes. All other positive integers (greater than 1 and not prime) are called
composite. An immediate connection between prime and composite integers is expressed
in the following lemma.

LEMMA 4.1                     If $n \in \mathbf{Z}^+$ and $n$ is composite, then there is a prime $p$ such that $p \mid n$.
                             Proof: If not, let $S$ be the set of all composite integers that have no prime divisor(s).
                             If $S \ne \emptyset$, then by the Well-Ordering Principle, $S$ has a least element $m$. But if $m$ is
                             composite, then $m = m_1 m_2$, where $m_1, m_2 \in \mathbf{Z}^+$ with $1 < m_1 < m$ and $1 < m_2 < m$.
                             Since $m_1 \notin S$, $m_1$ is prime or divisible by a prime, so there exists a prime $p$ such that
                             $p \mid m_1$. Since $m = m_1 m_2$, it now follows from part (d) of Theorem 4.3 that $p \mid m$,
                             contradicting $m \in S$, and so $S = \emptyset$.
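
Lemma 4.1 guarantees that a prime divisor exists; one simple way to find one is trial division. The Python sketch below is illustrative only (the name `smallest_prime_divisor` is an assumption, and trial division is not a method the text prescribes). It relies on the fact that the smallest divisor of $n$ greater than 1 must itself be prime.

```python
def smallest_prime_divisor(n):
    """Return the smallest prime p with p | n, for an integer n > 1."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    d = 2
    # A composite n has a divisor d with d * d <= n, so the search can stop there.
    while d * d <= n:
        if n % d == 0:
            return d  # smallest divisor > 1, hence prime
        d += 1
    return n  # no divisor found, so n itself is prime


print(smallest_prime_divisor(91), smallest_prime_divisor(97))
```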

Now why did we call the preceding result a lemma instead of a theorem? After all, it had
                             to be proved like all other theorems in the book so far. The reason is that although a lemma
                             is itself a theorem, its major role is to help prove other theorems.

In listing the primes we are inclined to believe that there are infinitely many such num-
                             bers. We now verify that this is true.

THEOREM 4.4                  (Euclid) There are infinitely many primes.
                             Proof: If not, let $p_1, p_2, \ldots, p_k$ be the finite list of all primes, and let
                             $B = p_1 p_2 \cdots p_k + 1$. Since $B > p_i$ for all $1 \le i \le k$, $B$ cannot be a prime. Hence $B$ is
                             composite. So by Lemma 4.1 there is a prime $p_j$, where $1 \le j \le k$ and $p_j \mid B$. Since
                             $p_j \mid B$ and $p_j \mid p_1 p_2 \cdots p_k$, by Theorem 4.3(e) it follows that $p_j \mid 1$. This
                             contradiction arises from the assumption that there are only finitely many primes;
                             the result follows.
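
Euclid's construction can be tried on any finite list of primes. The Python sketch below is an illustration under assumptions (the function name `prime_outside` is hypothetical): it forms $B = p_1 p_2 \cdots p_k + 1$ and returns the smallest prime divisor of $B$, which cannot appear in the given list because each $p_i$ leaves remainder 1 when dividing $B$.

```python
from math import prod


def prime_outside(primes):
    """Given a finite list of primes, return a prime that is not in the list."""
    B = prod(primes) + 1
    d = 2
    while d * d <= B:
        if B % d == 0:
            return d  # smallest divisor > 1 of B, hence prime
        d += 1
    return B  # B itself is prime


print(prime_outside([2, 3, 5]))                # B = 31, which is prime
print(prime_outside([2, 3, 5, 7, 11, 13]))     # B = 30031 = 59 * 509
```

Note that $B$ need not be prime itself, as the second call shows; the proof needs only that some prime divisor of $B$ lies outside the list.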

Yes, this is the same Euclid from the fourth century B.C. whose Elements, written on 13
                             parchment scrolls, included the first organized coverage of the geometry we studied in high
                             school. One finds, however, that these 13 books are also concerned with number theory. In
                             particular, Books VII, VIII, and IX dwell on this topic. The preceding theorem (with proof)
                             is found in Book IX.

We turn now to the major idea of this section. This result enables us to deal with nonzero
                 division in $\mathbf{Z}$ when that division is not exact.

THEOREM 4.5      The Division Algorithm. If $a, b \in \mathbf{Z}$, with $b > 0$, then there exist unique $q, r \in \mathbf{Z}$ with
                 $a = qb + r$, $0 \le r < b$.
                 Proof: If $b \mid a$ the result follows with $r = 0$, so consider the case where $b \nmid a$ (that is, $b$ does
                 not divide $a$).
                     Let $S = \{a - tb \mid t \in \mathbf{Z},\ a - tb > 0\}$. If $a > 0$ and $t = 0$, then $a \in S$ and $S \ne \emptyset$. For
                 $a \le 0$, let $t = a - 1$. Then $a - tb = a - (a - 1)b = a(1 - b) + b$, with $(1 - b) \le 0$, because
                 $b \ge 1$. So $a - tb > 0$ and $S \ne \emptyset$. Hence, for any $a \in \mathbf{Z}$, $S$ is a nonempty subset of $\mathbf{Z}^+$. By
                 the Well-Ordering Principle, $S$ has a least element $r$, where $0 < r = a - qb$, for some
                 $q \in \mathbf{Z}$. If $r = b$, then $a = (q + 1)b$ and $b \mid a$, contradicting $b \nmid a$. If $r > b$, then $r = b + c$, for
                 some $c \in \mathbf{Z}^+$, and $a - qb = r = b + c \Rightarrow c = a - (q + 1)b \in S$, contradicting $r$ being the
                 least element of $S$. Hence, $r < b$.
                     This now establishes a quotient $q$ and remainder $r$, where $0 \le r < b$, for the theorem. But
                 are there other $q$'s and $r$'s that also work? If so, let $q_1, q_2, r_1, r_2 \in \mathbf{Z}$ with $a = q_1 b + r_1$, for
                 $0 \le r_1 < b$, and $a = q_2 b + r_2$, for $0 \le r_2 < b$. Then $q_1 b + r_1 = q_2 b + r_2 \Rightarrow b \cdot |q_1 - q_2| =
                 |r_2 - r_1| < b$, because $0 \le r_1, r_2 < b$. If $q_1 \ne q_2$, we have the contradiction $b \cdot |q_1 - q_2| < b$.
                 Hence $q_1 = q_2$, $r_1 = r_2$, and the quotient and remainder are unique.

As we mentioned in the preceding proof, when $a, b \in \mathbf{Z}$ with $b > 0$, then there exists a
                 unique quotient $q$ and a unique remainder $r$ where $a = qb + r$, with $0 \le r < b$. Furthermore,
                 under these circumstances, the integer $b$ is called the divisor while $a$ is termed the
                 dividend.

EXAMPLE 4.25
                   a) When $a = 170$ and $b = 11$ in the division algorithm, we find that $170 = 15 \cdot 11 + 5$,
                      where $0 \le 5 < 11$. So when 170 is divided by 11, the quotient is 15 and the remainder
                      is 5.
                   b) If the dividend is 98 and the divisor is 7, then we find that $98 = 14 \cdot 7$. So in this case
                      the quotient is 14 and the remainder is 0, and 7 (exactly) divides 98.
                   c) For the case of $a = -45$ and $b = 8$ we have $-45 = (-6) \cdot 8 + 3$, where $0 \le 3 < 8$.
                      Consequently, the quotient is $-6$ and the remainder is 3 when the dividend is $-45$ and
                      the divisor is 8.
                   d) Let $a, b \in \mathbf{Z}^+$.
                        1) If $a = qb$ for some $q \in \mathbf{Z}^+$, then $-a = (-q)b$. So, in this case, when $-a$ ($< 0$) is
                           divided by $b$ ($> 0$) the quotient is $-q$ ($< 0$) and the remainder is 0.
                        2) If $a = qb + r$ for some $q \in \mathbf{N}$ and $0 < r < b$, then $-a = (-q)b - r =
                           (-q)b - b + b - r = (-q - 1)b + (b - r)$. For this case, when $-a$ ($< 0$) is
                           divided by $b$ ($> 0$) the quotient is $-q - 1$ ($< 0$) and the remainder is $b - r$,
                           where $0 < b - r < b$.
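
The quotients and remainders in Example 4.25 can be reproduced with Python's built-in `divmod`, which for a positive divisor uses floor division and therefore returns a remainder $r$ with $0 \le r < b$, as in Theorem 4.5. The wrapper name `division_algorithm` below is an assumption introduced for this sketch.

```python
def division_algorithm(a, b):
    """Return (q, r) with a = q*b + r and 0 <= r < b, for integers a and b > 0."""
    if b <= 0:
        raise ValueError("the divisor b must be positive")
    q, r = divmod(a, b)  # floor division, so r is nonnegative when b > 0
    return q, r


print(division_algorithm(170, 11))   # part (a)
print(division_algorithm(98, 7))     # part (b)
print(division_algorithm(-45, 8))    # part (c)
```

Languages whose integer division truncates toward zero (C and Java, for example) would give a negative remainder for $-45$ divided by 8, so the convention matters when porting this sketch.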

Despite the proof of Theorem 4.5 and the results in Example 4.25, we really do not have
                 any systematic way to calculate the quotient $q$ and remainder $r$ when we divide an integer $a$
                 (the dividend) by the positive integer $b$ (the divisor). The proof of Theorem 4.5 guarantees
                 the existence of such integers $q$ and $r$, but the proof is not constructive. It does not appear to
                 tell us how to actually calculate $q$ and $r$, and it does not mention anything about the ability
                 to use multiplication tables or perform long division. To remedy this situation we provide
                 the procedure (written in pseudocode) in Fig. 4.10. Our next example illustrates the idea
                 presented in part of this procedure.

procedure IntegerDivision (a, b: integers)
begin
    if a = 0 then
        begin
            quotient := 0
            remainder := 0
        end
    else
        begin
            r := abs(a)    {the absolute value of a}
            q := 0
            while r >= b do
                begin
                    r := r - b
                    q := q + 1
                end
            if a > 0 then
                begin
                    quotient := q
                    remainder := r
                end
            else if r = 0 then
                begin
                    quotient := -q
                    remainder := 0
                end
            else
                begin
                    quotient := -q - 1
                    remainder := b - r
                end
        end
end

Figure 4.10
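
For readers who want to run the procedure in Fig. 4.10, the following Python sketch is a direct translation. The function name `integer_division` and the explicit check that $b > 0$ are additions made here; the figure itself takes the positivity of $b$ for granted.

```python
def integer_division(a, b):
    """Translate Fig. 4.10: return (quotient, remainder) by repeated subtraction."""
    if b <= 0:
        raise ValueError("the divisor b must be positive")
    if a == 0:
        return 0, 0
    r = abs(a)  # work with the absolute value of a
    q = 0
    while r >= b:  # subtract b until what is left is smaller than b
        r -= b
        q += 1
    if a > 0:
        return q, r
    if r == 0:  # a < 0 and b divides a exactly
        return -q, 0
    return -q - 1, b - r  # a < 0 with a nonzero remainder, as in Example 4.25(d)


print(integer_division(37, 8), integer_division(-45, 8))
```

The loop runs about $|a|/b$ times, so this is a teaching device rather than an efficient routine.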

EXAMPLE 4.26
Just as the multiplication of positive integers may be viewed as repeated addition, so too
can we view (integer) division as repeated subtraction. We see that subtraction does play a
role in the definition of the set $S$ in the proof of Theorem 4.5.
    When calculating $4 \cdot 7$, for example, we can think in terms of repeated addition and write

$$2 \cdot 7 = 7 + 7 = 14$$
$$3 \cdot 7 = (2 + 1) \cdot 7 = 2 \cdot 7 + 1 \cdot 7 = (7 + 7) + 7 = 14 + 7 = 21$$
$$4 \cdot 7 = (3 + 1) \cdot 7 = 3 \cdot 7 + 1 \cdot 7 = ((7 + 7) + 7) + 7 = 21 + 7 = 28.$$
If, on the other hand, we wish to divide 37 by 8, then we should think of the quotient $q$ as the
number of 8's contained in 37. When each one of these 8's is removed (that is, subtracted)
and no other 8 can be removed without giving us a negative result, then the integer that is
left (remaining) is the remainder $r$. So we can calculate $q$ and $r$ by thinking in terms of
repeated subtraction as follows:

$$37 - 8 = 29 \ge 8,$$
$$29 - 8 = (37 - 8) - 8 = 37 - 2 \cdot 8 = 21 \ge 8,$$
$$21 - 8 = ((37 - 8) - 8) - 8 = 37 - 3 \cdot 8 = 13 \ge 8,$$
$$13 - 8 = (((37 - 8) - 8) - 8) - 8 = 37 - 4 \cdot 8 = 5 < 8.$$

The last line shows that four 8's can be subtracted from 37 before we obtain a nonnegative
result, namely 5, that is smaller than 8. Therefore, in this example we have $q = 4$ and
$r = 5$.

Using the division algorithm, we consider some results on representing integers in bases
other than 10.

EXAMPLE 4.27
Write 6137 in the octal system (base 8). Here we seek nonnegative integers $r_0, r_1, r_2, \ldots, r_k$,
with $0 < r_k < 8$, such that $6137 = (r_k \cdots r_2 r_1 r_0)_8$.
    With $6137 = r_0 + r_1 \cdot 8 + r_2 \cdot 8^2 + \cdots + r_k \cdot 8^k = r_0 + 8(r_1 + r_2 \cdot 8 + \cdots + r_k \cdot 8^{k-1})$,
$r_0$ is the remainder obtained in the division algorithm when 6137 is divided by 8.
    Consequently, since $6137 = 1 + 8 \cdot 767$, we have $r_0 = 1$ and $767 = r_1 + r_2 \cdot 8 + \cdots +
r_k \cdot 8^{k-1} = r_1 + 8(r_2 + r_3 \cdot 8 + \cdots + r_k \cdot 8^{k-2})$. This yields $r_1 = 7$ (the remainder when
767 is divided by 8) and $95 = r_2 + r_3 \cdot 8 + \cdots + r_k \cdot 8^{k-2}$. Continuing in this manner, we
find $r_2 = 7$, $r_3 = 3$, $r_4 = 1$, and $r_i = 0$ for all $i \ge 5$, so

$$6137 = 1 \cdot 8^4 + 3 \cdot 8^3 + 7 \cdot 8^2 + 7 \cdot 8 + 1 = (13771)_8.$$

    We can arrange the successive divisions by 8 as follows:

                     Remainders
        8 | 6137
        8 |  767        1 (r_0)
        8 |   95        7 (r_1)
        8 |   11        7 (r_2)
        8 |    1        3 (r_3)
               0        1 (r_4)
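
The successive divisions of Example 4.27 translate directly into a loop. The Python sketch below is illustrative (the names `to_base` and `DIGITS` are assumptions): it collects the remainders $r_0, r_1, r_2, \ldots$ and then reads them in reverse order.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n, base):
    """Return the base-`base` representation of a nonnegative integer n as a string."""
    if n < 0 or not 2 <= base <= 16:
        raise ValueError("need n >= 0 and 2 <= base <= 16")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)  # r is the next digit, least significant first
        remainders.append(DIGITS[r])
    return "".join(reversed(remainders))


print(to_base(6137, 8))
```

Python's built-in `int("13771", 8)` performs the reverse conversion and can be used to check the result.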

EXAMPLE 4.28
In the field of computer science, the binary number system (base 2) is very important.
Here the only symbols that one may use are the bits 0 and 1. In Table 4.3 we have listed the
binary representations of the (base-10) integers from 0 to 15. Here we have included leading
zeros and find that we need four bits because of the leading 1 in the representations for the
integers from 8 to 15. With five bits we can continue up to 31 ($= 32 - 1 = 2^5 - 1$); six bits
are necessary to proceed to 63 ($= 64 - 1 = 2^6 - 1$). In general, if $x \in \mathbf{Z}$ and $0 \le x < 2^n$,
for $n \in \mathbf{Z}^+$, then we can write $x$ in base 2 by using $n$ bits. Leading zeros appear when
$0 \le x \le 2^{n-1} - 1$, and for $2^{n-1} \le x \le 2^n - 1$ the first (most significant) bit is 1.

Table 4.3

Base 10    Base 2        Base 10    Base 2

   0        0000            8        1000
   1        0001            9        1001
   2        0010           10        1010
   3        0011           11        1011
   4        0100           12        1100
   5        0101           13        1101
   6        0110           14        1110
   7        0111           15        1111

    Information is generally stored in machines in units of eight bits called bytes, so for
machines with memory cells of one byte we can store in a single cell any one of the binary
equivalents of the integers from 0 to $2^8 - 1 = 255$. For a machine with two-byte cells, any
one of the integers from 0 to $2^{16} - 1 = 65{,}535$ can be stored in binary form in each cell. A
machine with four-byte cells would take us up to $2^{32} - 1 = 4{,}294{,}967{,}295$.
                            When a human deals with long sequences of 0’s and 1’s, the job soon becomes very
                        tedious and the chance for error increases with the tedium. Consequently, it is common (es-
                        pecially in the study of machine and assembly languages) to represent such long sequences
                        of bits in another notation. One such notation is the hexadecimal (base-16) notation. Here
                        there are 16 symbols, and because we have only 10 symbols in the standard base-10 system,
                        we introduce the following six additional symbols:

                        A  (Alfa)        C  (Charlie)      E  (Echo)
                        B  (Bravo)       D  (Delta)        F  (Foxtrot)

In Table 4.4 the integers from 0 to 15 are given in terms of both the binary and the hexadec-
                        imal number systems.

Table 4.4

Base 10    Base 2    Base 16        Base 10    Base 2    Base 16

   0        0000        0              8        1000        8
   1        0001        1              9        1001        9
   2        0010        2             10        1010        A
   3        0011        3             11        1011        B
   4        0100        4             12        1100        C
   5        0101        5             13        1101        D
   6        0110        6             14        1110        E
   7        0111        7             15        1111        F

To convert from base 10 to base 16, we follow a procedure like the one outlined in Example
                        4.27. Here we are interested in the remainders upon successive divisions by 16. Therefore,
                        if we want to represent the (base-10) integer 13,874,945 in the hexadecimal system, we do
                        the following calculations:

                               Remainders
        16 | 13,874,945
        16 |    867,184         1          (r_0)
        16 |     54,199         0          (r_1)
        16 |      3,387         7          (r_2)
        16 |        211        11 (= B)    (r_3)
        16 |         13         3          (r_4)
                      0        13 (= D)    (r_5)

Consequently, $13{,}874{,}945 = (\text{D3B701})_{16}$.
                   There is, however, an easier approach for converting between base 2 and base 16. For
                example, if we want to convert the binary (one-byte) integer 01001101 to its base-16
                counterpart, we break the number into blocks of four bits:

                        0100    1101
                          4       D

We then convert each block of four bits to its base-16 representation (as shown in Table 4.4),
                and we have $(01001101)_2 = (\text{4D})_{16}$. If we start with the (two-byte) number $(\text{A13F})_{16}$ and
                want to convert in the other direction, we replace each hexadecimal symbol by its (four-bit)
                binary equivalent (also as shown in Table 4.4):

                          A       1       3       F
                        1010    0001    0011    1111

This results in $(\text{A13F})_{16} = (1010000100111111)_2$.
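
The block-of-four-bits shortcut is easy to mechanize. In the Python sketch below, the function names `bin_to_hex` and `hex_to_bin` are assumptions made for illustration, and the input bit string is assumed to have a length that is a multiple of 4 (pad with leading zeros otherwise).

```python
HEX_DIGITS = "0123456789ABCDEF"


def bin_to_hex(bits):
    """Convert a bit string to hexadecimal, one four-bit block at a time."""
    if len(bits) % 4 != 0:
        raise ValueError("pad the bit string with leading zeros to a multiple of 4")
    blocks = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(block, 2)] for block in blocks)


def hex_to_bin(hex_digits):
    """Replace each hexadecimal symbol by its four-bit binary equivalent."""
    return "".join(format(int(h, 16), "04b") for h in hex_digits)


print(bin_to_hex("01001101"), hex_to_bin("A13F"))
```

The same idea works between base 2 and base 8 with blocks of three bits, since $8 = 2^3$.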

EXAMPLE 4.29
We need negative integers in order to perform the binary operation of subtraction in terms of
addition [that is, $(a - b) = a + (-b)$]. When we are dealing with the binary representation
of integers, we can use a popular method that enables us to perform addition, subtraction,
multiplication, and (integer) division: the two's complement method. The method's
popularity rests on its implementation by only two electronic circuits: one to invert and the
second to add.
                    In Table 4.5 the integers from $-8$ to 7 are represented by the four-bit patterns shown.
                The nonnegative integers are represented as they were in Tables 4.3 and 4.4. To obtain the
                results for $-8 \le n \le -1$, first consider the binary representation of $|n|$, the absolute value
                of $n$. Then do the following:

1) Replace each 0(1) in the binary representation of $|n|$ by 1(0); this result is called the
                      one's complement of (the given representation of) $|n|$.
                   2) Add 1 (= 0001 in this case) to the result in step (1). This result is called the two's
                      complement of $n$.
                   For example, to obtain the two's complement (representation) of $-6$, we proceed as
                follows.
228         Chapter 4 Properties of the Integers: Mathematical Induction

                                 1) Start with the binary representation of 6.            0110
                                 2) Interchange the 0's and 1's; this result is
                                    the one's complement of 0110.                         1001
                                 3) Add 1 to the prior result.                            1001 + 0001 = 1010
                                 We can also obtain the four-bit patterns for the values $-8 \le n \le -1$ by using the
                             four-bit patterns for the integers from 0 to 7 and complementing (interchanging 0's and 1's) these
                             patterns as shown by four such pairs of patterns in Table 4.5. Note in Table 4.5 that the
                             four-bit patterns for the nonnegative integers start with 0, whereas 1 is the first bit for the
                             negative integers in the table.

Table 4.5  Two's Complement Notation

          Value Represented      Four-Bit Pattern
                  7                   0111  ←
                  6                   0110
                  5                   0101  ←
                  4                   0100
                  3                   0011
                  2                   0010
                  1                   0001
                  0                   0000
                 -1                   1111
                 -2                   1110
                 -3                   1101
                 -4                   1100
                 -5                   1011
                 -6                   1010  ←
                 -7                   1001
                 -8                   1000  ←
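
As a check on Table 4.5, the two-step rule (one's complement, then add 1) can be written as a short Python sketch. The function name `twos_complement` and its `bits` parameter are assumptions introduced for illustration; the extra bit that arises for -8 is discarded by reducing modulo $2^{bits}$.

```python
def twos_complement(n: int, bits: int = 4) -> str:
    """Return the `bits`-bit two's complement pattern for the integer n."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= n <= high:
        raise ValueError(f"{n} cannot be represented with {bits} bits")
    if n >= 0:
        # Nonnegative integers keep their ordinary binary representation.
        return format(n, f"0{bits}b")
    # Step 1: one's complement of the representation of |n|.
    ones = "".join("1" if b == "0" else "0" for b in format(-n, f"0{bits}b"))
    # Step 2: add 1, keeping only the rightmost `bits` bits.
    return format((int(ones, 2) + 1) % (1 << bits), f"0{bits}b")


for value in range(7, -9, -1):
    print(f"{value:>3}  {twos_complement(value)}")
```

Running the loop reproduces the sixteen rows of the table, from 7 (0111) down to -8 (1000).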

EXAMPLE 4.30
                             How do we perform the subtraction 33 - 15 in base 2, using the two's complement method
                             with patterns of eight bits (= one byte)?
                                We want to determine 33 - 15 = 33 + (-15). We find that $33 = (00100001)_2$, and
                             $15 = (00001111)_2$. Therefore we represent -15 by

11110000 + 00000001 = 11110001.

The addition of integers represented in two’s complement notation is the same as ordinary
                             binary addition, except that all results must have the same size bit patterns. This means that
                             when two integers are added by the two’s complement method, any extra bit that results on
                             the left of the answer (by a final carry) must be discarded. We illustrate this in the following
                             calculations.
                                                       4.3. The Division Algorithm: Prime Numbers           229

                                         33                       00100001
                                       - 15                     + 11110001
                                                                 100010010

Answer = $(00010010)_2 = 18$. The leftmost (ninth) bit, which results from the final carry, is
discarded. The leading bit 0 of the remaining eight-bit pattern indicates that the answer is nonnegative.

To find 15 - 33 we use $15 = (00001111)_2$ and $33 = (00100001)_2$. Then, to calculate
15 - 33 as 15 + (-33), we represent -33 by 11011110 + 00000001 = 11011111. This
gives us the results
                                         15                       00001111
                                       - 33                     + 11011111
                                                                  11101110

The leading bit 1 indicates that the answer is negative.

In order to get the positive form of the answer, we proceed as follows:

                                    11101110
   1) Take the one's complement.    00010001
   2) Add 1 to the prior result.    00010010

Since $(00010010)_2 = 18$, the answer is -18.
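
The two calculations of this example can be mirrored in a small Python sketch. It assumes that both operands are nonnegative and fit in the given number of bits (0 to 127 for eight bits); the name `subtract_twos_complement` is hypothetical.

```python
def subtract_twos_complement(x: int, y: int, bits: int = 8):
    """Compute x - y as x + (-y), using `bits`-bit two's complement patterns.

    Assumes 0 <= x, y < 2**(bits - 1). Returns the bit pattern and its value.
    """
    mask = (1 << bits) - 1
    neg_y = ((~y) & mask) + 1        # one's complement of y, then add 1
    total = (x + neg_y) & mask       # any extra bit on the left is discarded
    pattern = format(total, f"0{bits}b")
    if pattern[0] == "1":
        # Negative answer: complement and add 1 to recover the magnitude.
        value = -(((~total) & mask) + 1)
    else:
        value = total
    return pattern, value


print(subtract_twos_complement(33, 15))   # ('00010010', 18)
print(subtract_twos_complement(15, 33))   # ('11101110', -18)
```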

One problem we have avoided in the two preceding calculations involves the size of the
integers that we can represent by eight-bit patterns. No matter what size patterns we use,
the size of the integers that can be represented is limited. When we exceed this size, an
overflow error results. For example, if we are working with eight-bit patterns and try to add
117 and 88, we obtain
                                        117                        01110101
                                   +     88                      + 01011000
                                                                   11001101

The leading bit 1 indicates that the answer is negative.

This result shows how we can detect an overflow error when adding two numbers. Here
an overflow error is indicated: The sum of the eight-bit patterns for two positive integers
has resulted in the eight-bit pattern for a negative integer. Similarly, when the addition of
(the eight-bit patterns of) two negative integers results in the eight-bit pattern of a positive
integer, an overflow error is detected.
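
The detection rule just stated (two operands with the same leading bit producing a sum with the opposite leading bit) can be sketched as follows. The function name `add_with_overflow_check` is an assumption, and the sketch presumes both patterns have the same length.

```python
def add_with_overflow_check(x_bits: str, y_bits: str):
    """Add two equal-length two's complement patterns and flag an overflow."""
    width = len(x_bits)
    # Ordinary binary addition; a final carry on the left is discarded.
    total = (int(x_bits, 2) + int(y_bits, 2)) % (1 << width)
    result = format(total, f"0{width}b")
    # Overflow: the operands agree in sign, but the result has the other sign.
    overflow = x_bits[0] == y_bits[0] and result[0] != x_bits[0]
    return result, overflow


print(add_with_overflow_check("01110101", "01011000"))  # ('11001101', True): 117 + 88
print(add_with_overflow_check("00100001", "11110001"))  # ('00010010', False): 33 + (-15)
```

Adding a positive and a negative pattern can never overflow, so only the two same-sign cases need to be tested.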

To see why the procedure in Example 4.30 works in general, let $x, y \in \mathbf{Z}^+$ with $x > y$.
   Let $2^{n-1} \le x < 2^n$. Then the binary representation for x is made up of n bits (with the
leading bit 1). The binary representation for $2^n$ consists of n + 1 bits: a leading bit 1 followed
by n 0's. The binary representation for $2^n - 1$ consists of n 1's.
   When we subtract y from $2^n - 1$, we have

\[ (2^n - 1) - y = \underbrace{11\ldots1}_{n\ 1\text{'s}} - y, \quad \text{the one's complement of } y. \]

Then $(2^n - 1) - y + 1$ gives us the two's complement of $-y$, and

\[ x - y = x + [(2^n - 1) - y + 1] - 2^n, \]

where the final term, $-2^n$, results in the removal of the extra bit that arises on the left of the
answer.

We close this section with one final result on composite integers.

EXAMPLE 4.31
                                      If $n \in \mathbf{Z}^+$ and n is composite, then there exists a prime p such that $p \mid n$ and $p \le \sqrt{n}$.
                                      Proof: Since n is composite, we can write $n = n_1 n_2$, where $1 < n_1 < n$ and $1 < n_2 < n$.
                                      We claim that one of the integers $n_1$, $n_2$ must be less than or equal to $\sqrt{n}$. If not, then
                                      $n_1 > \sqrt{n}$ and $n_2 > \sqrt{n}$ give us the contradiction $n = n_1 n_2 > (\sqrt{n})(\sqrt{n}) = n$. Without loss
                                      of generality, we shall assume that $n_1 \le \sqrt{n}$. If $n_1$ is prime, the result follows. If $n_1$ is not
                                      prime, then by Lemma 4.1 there exists a prime $p < n_1$ where $p \mid n_1$. So $p \mid n$ and $p \le \sqrt{n}$.
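
One practical consequence of this result is that a search for a prime divisor of n can stop at $\sqrt{n}$. The following Python sketch illustrates this; the function names are assumptions made for the illustration.

```python
from math import isqrt


def prime_divisor_up_to_sqrt(n: int):
    """Return the smallest prime p with p | n and p * p <= n, or None if there is none."""
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            # The smallest divisor greater than 1 is necessarily prime.
            return p
    return None


def is_prime(n: int) -> bool:
    """By the result above, n >= 2 is prime exactly when no such divisor exists."""
    return n >= 2 and prime_divisor_up_to_sqrt(n) is None


print(prime_divisor_up_to_sqrt(91))   # 7, since 91 = 7 * 13
print(is_prime(97))                   # True: no prime p <= 9 divides 97
```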

EXERCISES 4.3

1. Verify the remaining parts of Theorem 4.3.

2. Let $a, b, c, d \in \mathbf{Z}^+$. Prove that (a) $[(a \mid b) \wedge (c \mid d)] \Rightarrow ac \mid bd$; (b) $a \mid b \Rightarrow ac \mid bc$; and (c) $ac \mid bc \Rightarrow a \mid b$.

3. If p, q are primes, prove that $p \mid q$ if and only if p = q.

4. If $a, b, c \in \mathbf{Z}^+$ and $a \mid bc$, does it follow that $a \mid b$ or $a \mid c$?

5. For all integers a, b, and c, prove that if $a \nmid bc$, then $a \nmid b$ and $a \nmid c$.

6. Let $n \in \mathbf{Z}^+$ where $n \ge 2$. Prove that if $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n \in \mathbf{Z}^+$ and $a_i \mid b_i$ for all $1 \le i \le n$, then $(a_1 a_2 \cdots a_n) \mid (b_1 b_2 \cdots b_n)$.

7. a) Find three positive integers a, b, c such that $31 \mid (5a + 7b + 11c)$.
   b) If $a, b, c \in \mathbf{Z}$ and $31 \mid (5a + 7b + 11c)$, prove that $31 \mid (21a + 17b + 9c)$.

8. A grocery store runs a weekly contest to promote sales. Each customer who purchases more than $20 worth of groceries receives a game card with 12 numbers on it; if any of these numbers sum to exactly 500, then that customer receives a $500 shopping spree (at the grocery store). After purchasing $22.83 worth of groceries at this store, Eleanor receives her game card on which are printed the following 12 numbers: 144, 336, 30, 66, 138, 162, 318, 54, 84, 288, 126, and 456. Has Eleanor won a $500 shopping spree?

9. Let $a, b \in \mathbf{Z}^+$. If $b \mid a$ and $b \mid (a + 2)$, prove that b = 1 or b = 2.

10. If $n \in \mathbf{Z}^+$, and n is odd, prove that $8 \mid (n^2 - 1)$.

11. If $a, b \in \mathbf{Z}^+$, and both are odd, prove that $2 \mid (a^2 + b^2)$ but $4 \nmid (a^2 + b^2)$.

12. Determine the quotient q and remainder r for each of the following, where a is the dividend and b is the divisor.
    a) a = 23, b = 7          b) a = -115, b = 12
    c) a = 0, b = 42          d) a = 434, b = 31

13. If $n \in \mathbf{N}$, prove that $3 \mid (7^n - 4^n)$.

14. Write each of the following (base-10) integers in base 2, base 4, and base 8.
    a) 137          b) 6243          c) 12,345

15. Write each of the following (base-10) integers in base 2 and base 16.
    a) 22          b) 527          c) 1234          d) 6923

16. Convert each of the following hexadecimal numbers to base 2 and base 10.
    a) A7          b) 4C2          c) 1C2B          d) A2DFE

17. Convert each of the following binary numbers to base 10 and base 16.
    a) 11001110          b) 00110001
    c) 11110000          d) 01010111

18. For what base do we find that 251 + 445 = 1026?

19. Find all $n \in \mathbf{Z}^+$ where n divides 5n + 18.

20. Write each of the following integers in two's complement representation. Here the results are eight-bit patterns.
    a) 15          b) -15          c) 100
    d) -65         e) 127          f) -128

21. If a machine stores integers by the two's complement method, what are the largest and smallest integers that it can store if it uses bit patterns of (a) 4 bits? (b) 8 bits? (c) 16 bits? (d) 32 bits? (e) $2^n$ bits, $n \in \mathbf{Z}^+$?

22. In each of the following problems, we are using four-bit patterns for the two's complement representations of the integers from -8 to 7. Solve each problem (if possible), and then convert the results to base 10 to check your answers. Watch for any overflow errors.
    a)   0101          b)   1101          c)   0111          d)   1101
       + 0001             + 1110             + 1000             + 1010

23. If $a, x, y \in \mathbf{Z}$, and $a \ne 0$, prove that $ax = ay \Rightarrow x = y$.

24. Write a computer program (or develop an algorithm) to convert a positive integer in base 10 to base b, where $2 \le b \le 9$.

25. The Division Algorithm can be generalized as follows: For $a, b \in \mathbf{Z}$, $b \ne 0$, there exist unique $q, r \in \mathbf{Z}$ with $a = qb + r$, $0 \le r < |b|$. Using Theorem 4.5, verify this generalized form of the algorithm for b < 0.

26. Write a computer program (or develop an algorithm) to convert a positive integer in base 10 to base 16.

27. For $n \in \mathbf{Z}^+$, write a computer program (or develop an algorithm) that lists all positive divisors of n.

28. Define the set $X \subseteq \mathbf{Z}^+$ recursively as follows:
    1) $3 \in X$; and
    2) If $a, b \in X$, then $a + b \in X$.
    Prove that $X = \{3k \mid k \in \mathbf{Z}^+\}$, the set of all positive integers divisible by 3.

29. Let $n \in \mathbf{Z}^+$ with $n = r_k \cdot 10^k + \cdots + r_2 \cdot 10^2 + r_1 \cdot 10 + r_0$ (the base-10 representation of n). Prove that
    a) $2 \mid n$ if and only if $2 \mid r_0$
    b) $4 \mid n$ if and only if $4 \mid (r_1 \cdot 10 + r_0)$
    c) $8 \mid n$ if and only if $8 \mid (r_2 \cdot 10^2 + r_1 \cdot 10 + r_0)$
    State a general theorem suggested by these results.

4.4  The Greatest Common Divisor: The Euclidean Algorithm
                               Continuing with the division operation developed in Section 4.3, we turn our attention to
                               the divisors of a pair of integers.

Definition 4.2       For $a, b \in \mathbf{Z}$, a positive integer c is said to be a common divisor of a and b if $c \mid a$ and $c \mid b$.

EXAMPLE 4.32         The common divisors of 42 and 70 are 1, 2, 7, and 14, and 14 is the greatest of the common
                               divisors.

Definition 4.3        Let $a, b \in \mathbf{Z}$, where either $a \ne 0$ or $b \ne 0$. Then $c \in \mathbf{Z}^+$ is called a greatest common divisor
                               of a, b if

                                    a) $c \mid a$ and $c \mid b$ (that is, c is a common divisor of a, b), and
                                    b) for any common divisor d of a and b, we have $d \mid c$.

The result in Example 4.32 satisfies these conditions. That is, 14 divides both 42 and 70,
                               and any common divisor of 42 and 70 — namely, 1, 2, 7, and 14— divides 14. However, this
                               example deals with two small integers. What would we do with two integers each having
                               20 digits? We consider the following questions.

                                    1) Given $a, b \in \mathbf{Z}$, where at least one of a, b is not 0, does a greatest common divisor
                                          of a and b always exist? If so, how does one find such an integer?
                                    2) How many greatest common divisors can a pair of integers have?
                                    In dealing with these questions, we concentrate on $a, b \in \mathbf{Z}^+$.

THEOREM 4.6                    For all $a, b \in \mathbf{Z}^+$, there exists a unique $c \in \mathbf{Z}^+$ that is the greatest common divisor of a, b.
                               Proof: Given $a, b \in \mathbf{Z}^+$, let $S = \{as + bt \mid s, t \in \mathbf{Z},\ as + bt > 0\}$. Since $S \ne \emptyset$, by the Well-
                               Ordering Principle S has a least element c. We claim that c is a greatest common divisor of
                               a, b.

                                 Since $c \in S$, $c = ax + by$, for some $x, y \in \mathbf{Z}$. Consequently, if $d \in \mathbf{Z}$ and $d \mid a$ and $d \mid b$,
                             then by Theorem 4.3(f) $d \mid (ax + by)$, so $d \mid c$.
                                 If $c \nmid a$, we can use the division algorithm to write $a = qc + r$, with $q, r \in \mathbf{Z}^+$ and $0 < r < c$.
                             Then $r = a - qc = a - q(ax + by) = (1 - qx)a + (-qy)b$, so $r \in S$, contradicting
                             the choice of c as the least element of S. Consequently, $c \mid a$, and by a similar argument, $c \mid b$.
                                 Hence all $a, b \in \mathbf{Z}^+$ have a greatest common divisor. If $c_1$, $c_2$ both satisfy the two
                             conditions of Definition 4.3, then with $c_1$ as a greatest common divisor, and $c_2$ as a common
                             divisor, it follows that $c_2 \mid c_1$. Reversing roles, we find that $c_1 \mid c_2$, and so we conclude from
                             Theorem 4.3(b) that $c_1 = c_2$ because $c_1, c_2 \in \mathbf{Z}^+$.

                                 We now know that for all $a, b \in \mathbf{Z}^+$, the greatest common divisor of a, b exists, and it
                             is unique. This number will be denoted by gcd(a, b). Here gcd(a, b) = gcd(b, a); and for
                             each $a \in \mathbf{Z}$, if $a \ne 0$, then gcd(a, 0) = |a|. Also when $a, b \in \mathbf{Z}^+$, we have gcd(-a, b) =
                             gcd(a, -b) = gcd(-a, -b) = gcd(a, b). Finally, gcd(0, 0) is not defined and is of no
                             interest to us.
                                 From Theorem 4.6 we see that not only does gcd(a, b) exist but that gcd(a, b) is also
                             the smallest positive integer we can write as a linear combination of a and b. However,
                             we must realize that if $a, b, c \in \mathbf{Z}^+$ and $c = ax + by$ for some $x, y \in \mathbf{Z}$, then we do not
                             necessarily know that c is gcd(a, b), unless we somehow also know that c is the smallest
                             positive integer that can be written as such a linear combination of a and b.
                                 Finally, integers a and b are called relatively prime when gcd(a, b) = 1; that is, when
                             there exist $x, y \in \mathbf{Z}$ with ax + by = 1.

EXAMPLE 4.33
                             Since gcd(42, 70) = 14, we can find $x, y \in \mathbf{Z}$ with 42x + 70y = 14, or 3x + 5y = 1. By
                             inspection, x = 2, y = -1 is a solution; 3(2) + 5(-1) = 1. But for $k \in \mathbf{Z}$, 1 = 3(2 - 5k) +
                             5(-1 + 3k), so 14 = 42(2 - 5k) + 70(-1 + 3k), and the solutions for x, y are not unique.
                                In general, if gcd(a, b) = d, then gcd((a/d), (b/d)) = 1. (Verify this!) If $(a/d)x_0 +
                             (b/d)y_0 = 1$, then $1 = (a/d)(x_0 - (b/d)k) + (b/d)(y_0 + (a/d)k)$, for each $k \in \mathbf{Z}$. So $d =
                             a(x_0 - (b/d)k) + b(y_0 + (a/d)k)$, yielding infinitely many solutions to ax + by = d.
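
To make the family of solutions concrete, the following Python sketch generates several members from one known solution and checks each of them. The function name `solution_family` and its parameters are assumptions for this illustration; `math.gcd` is the standard-library greatest common divisor.

```python
from math import gcd


def solution_family(a: int, b: int, x0: int, y0: int, ks):
    """From one solution of a*x + b*y = gcd(a, b), produce one solution per k in ks."""
    d = gcd(a, b)
    if a * x0 + b * y0 != d:
        raise ValueError("(x0, y0) is not a solution of a*x + b*y = gcd(a, b)")
    # x = x0 - (b/d)k and y = y0 + (a/d)k, exactly as in the general formula above.
    return [(x0 - (b // d) * k, y0 + (a // d) * k) for k in ks]


for x, y in solution_family(42, 70, 2, -1, range(-2, 3)):
    print(x, y, 42 * x + 70 * y)   # the last column is always 14
```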

                                 The preceding example and the prior observations work well enough when a, b are
                             fairly small. But how does one find gcd(a, b) for some arbitrary $a, b \in \mathbf{Z}^+$? If $a \mid b$, then
                             gcd(a, b) = a; and if $b \mid a$, then gcd(a, b) = b. Otherwise, we turn to the following result,
                             which we owe to Euclid.

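THEOREM 4.7                  Euclidean Algorithm. Let $a, b \in \mathbf{Z}^+$. Set $r_0 = a$ and $r_1 = b$ and apply the division algorithm
                             n times as follows:

\[
\begin{aligned}
r_0 &= q_1 r_1 + r_2, && 0 < r_2 < r_1 \\
r_1 &= q_2 r_2 + r_3, && 0 < r_3 < r_2 \\
r_2 &= q_3 r_3 + r_4, && 0 < r_4 < r_3 \\
    &\ \,\vdots \\
r_i &= q_{i+1} r_{i+1} + r_{i+2}, && 0 < r_{i+2} < r_{i+1} \\
    &\ \,\vdots \\
r_{n-2} &= q_{n-1} r_{n-1} + r_n, && 0 < r_n < r_{n-1} \\
r_{n-1} &= q_n r_n.
\end{aligned}
\]

Then $r_n$, the last nonzero remainder, equals gcd(a, b).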

Proof: To verify that $r_n = \gcd(a, b)$, we establish the two conditions of Definition 4.3.
                  Start with the first division process listed (where $r_0 = a$ and $r_1 = b$). If $c \mid r_0$ and $c \mid r_1$,
               then as $r_0 = q_1 r_1 + r_2$, it follows that $c \mid r_2$. Next $[(c \mid r_1) \wedge (c \mid r_2)] \Rightarrow c \mid r_3$, because
               $r_1 = q_2 r_2 + r_3$. Continuing down through the division processes, we get to where $c \mid r_{n-2}$ and
               $c \mid r_{n-1}$. From the next-to-last equation, we conclude that $c \mid r_n$, and this verifies condition
               (b) of Definition 4.3.
                  To establish condition (a) we go in reverse order. From the last equation, $r_n \mid r_{n-1}$, and
               so $r_n \mid r_{n-2}$, because $r_{n-2} = q_{n-1} r_{n-1} + r_n$. Continuing up through the equations, we get to
               where $r_n \mid r_4$ and $r_n \mid r_3$, so $r_n \mid r_2$. Then $[(r_n \mid r_3) \wedge (r_n \mid r_2)] \Rightarrow r_n \mid r_1$ (that is, $r_n \mid b$), and finally
               $[(r_n \mid r_2) \wedge (r_n \mid r_1)] \Rightarrow r_n \mid r_0$ (that is, $r_n \mid a$). Hence $r_n = \gcd(a, b)$.

We have now used the word algorithm in describing the statements set forth in Theorems
               4.5 and 4.7. This term will recur frequently throughout other chapters of this text, so it may
               be a good idea to consider just what it connotes.
                   First and foremost, an algorithm is a list of precise instructions designed to solve a
               particular type of problem   — not just one special case. In general, we expect all of our
               algorithms to receive input and provide the needed result(s) as output. Also, an algorithm
               should provide the same result whenever we repeat the value(s) for the input. This happens
               when the list of instructions is such that each intermediate result that comes about from the
               execution of each instruction is unique, depending on only the (initial) input and on any
               results that may have been derived at any preceding instructions. In order to accomplish
               this any possible vagueness must be eliminated from the algorithm; the instructions must
               be described in a simple yet unambiguous manner, a manner that can be executed by a
               machine. Finally, our algorithms cannot go on indefinitely. They must terminate after the
               execution of a finite number of instructions.
                   In Theorem 4.7 we are confronted with the determination of the greatest common divisor
               of any two positive integers. Hence this algorithm receives the two positive integers a, b
               as its input and generates their greatest common divisor as the output.
                   The use of the word algorithm in Theorem 4.5 is based on tradition. As stated, it does not
               provide the precise instructions we need to determine the output we want. (We mentioned
               this fact prior to Example 4.26.) To eliminate this shortcoming of Theorem 4.5, however,
               we set forth the instructions in the pseudocode procedure of Fig. 4.10.

We now apply the Euclidean algorithm in the following five examples.

EXAMPLE 4.34
               Find the greatest common divisor of 250 and 111, and express the result as a linear
               combination of these integers.

\[
\begin{aligned}
250 &= 2(111) + 28, && 0 < 28 < 111 \\
111 &= 3(28) + 27, && 0 < 27 < 28 \\
 28 &= 1(27) + 1, && 0 < 1 < 27 \\
 27 &= 27(1) + 0.
\end{aligned}
\]

                  So 1 is the last nonzero remainder. Therefore gcd(250, 111) = 1, and 250 and 111 are
               relatively prime. Working backward from the third equation, we have 1 = 28 - 1(27) =
               28 - 1[111 - 3(28)] = (-1)(111) + 4(28) = (-1)(111) + 4[250 - 2(111)] = 4(250) -
               9(111) = 250(4) + 111(-9), a linear combination of 250 and 111.
                  This expression of 1 as a linear combination of 250 and 111 is not unique, for 1 =
               250[4 - 111k] + 111[-9 + 250k], for any $k \in \mathbf{Z}$.

We also have

gcd(-250, 111) = gcd(250, -111) = gcd(-250, -111) = gcd(250, 111) = 1.
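
The "working backward" step of this example can be automated. The sketch below is one common way to do it (often called the extended Euclidean algorithm); it carries the coefficients forward alongside the remainders rather than back-substituting, so it is a variation on the hand calculation above, not a transcription of it. The function name `extended_gcd` is an assumption.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g, for positive a, b."""
    old_r, r = a, b      # successive remainders
    old_x, x = 1, 0      # coefficients of a
    old_y, y = 0, 1      # coefficients of b
    while r != 0:
        q = old_r // r
        # Invariant: old_r = a*old_x + b*old_y and r = a*x + b*y.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


print(extended_gcd(250, 111))   # (1, 4, -9): 250(4) + 111(-9) = 1
print(extended_gcd(42, 70))     # (14, 2, -1): 42(2) + 70(-1) = 14
```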

Our next example is somewhat more general, as it concerns the greatest common divisor
                             for an infinite number of pairs of integers.

EXAMPLE 4.35
                             For any $n \in \mathbf{Z}^+$, prove that the integers 8n + 3 and 5n + 2 are relatively prime.
                                 When n = 1 we find that gcd(8n + 3, 5n + 2) = gcd(11, 7) = 1.
                                 For $n \ge 2$ we have 8n + 3 > 5n + 2, and as in the previous example, we may write

\[
\begin{aligned}
8n + 3 &= 1(5n + 2) + (3n + 1), && 0 < 3n + 1 < 5n + 2 \\
5n + 2 &= 1(3n + 1) + (2n + 1), && 0 < 2n + 1 < 3n + 1 \\
3n + 1 &= 1(2n + 1) + n, && 0 < n < 2n + 1 \\
2n + 1 &= 2(n) + 1, && 0 < 1 < n \\
     n &= n(1) + 0.
\end{aligned}
\]

Consequently, the last nonzero remainder is 1, so gcd(8n + 3, 5n + 2) = 1 for all $n \ge 1$.
                             But we could also have arrived at this conclusion if we had noticed that

(8n + 3)(-5) + (5n + 2)(8) = -15 + 16 = 1.

And since 1 is expressed as a linear combination of 8n + 3 and 5n + 2, and no smaller
                             positive integer can have this property, it follows that the greatest common divisor of
                             8n + 3 and 5n + 2 is 1, for any positive integer n.

EXAMPLE 4.36
                             At this point we shall use the Euclidean algorithm to develop a procedure (in pseudocode)
                             that will find gcd(a, b) for all $a, b \in \mathbf{Z}^+$. The procedure in Fig. 4.11 employs the binary
                             operation mod, where for $x, y \in \mathbf{Z}^+$, x mod y = the remainder after x is divided by y. For
                             example, 7 mod 3 is 1, and 18 mod 5 is 3. (We shall deal with "the arithmetic of remainders"
                             in more detail in Chapter 14.)

                                                procedure gcd(a, b: positive integers)
                                                begin
                                                   r := a mod b
                                                   d := b
                                                   while r > 0 do
                                                       begin
                                                             c := d
                                                             d := r
                                                             r := c mod d
                                                       end
                                                end {gcd(a, b) is d, the last nonzero remainder}

Figure 4.11

Meanwhile, if we call this procedure for a = 168 and b = 456, the procedure first as-
                             signs r the value 168 mod 456 = 168 and d the value 456. Since r > 0 the code in the
                             while loop is executed (for the first time) and we obtain the following: c = 456, d = 168,

r = 456 mod 168 = 120. We then find that the code in the while loop is executed three
                  more times with the following results:
                     (2nd pass):    c = 168, d = 120, r = 168 mod 120 = 48
                     (3rd pass):    c = 120, d =  48, r = 120 mod  48 = 24
                     (4th pass):    c =  48, d =  24, r =  48 mod  24 =  0.

Since r is now 0, the procedure tells us that gcd(a, b) = gcd(168, 456) = 24, the final
                  value of d (the last nonzero remainder).
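
For readers who want to run the procedure of Fig. 4.11, here is a direct Python transcription that also records each pass through the while loop. The name `gcd_trace` and the list of (c, d, r) triples are additions made for illustration; the variables r, d, and c play the same roles as in the pseudocode.

```python
def gcd_trace(a: int, b: int):
    """Transcription of the procedure in Fig. 4.11 for positive integers a, b.

    Returns gcd(a, b) together with the values (c, d, r) after each loop pass.
    """
    r = a % b
    d = b
    passes = []
    while r > 0:
        c = d
        d = r
        r = c % d
        passes.append((c, d, r))
    return d, passes   # d is the last nonzero remainder


result, passes = gcd_trace(168, 456)
for number, (c, d, r) in enumerate(passes, start=1):
    print(f"pass {number}: c = {c}, d = {d}, r = {r}")
print("gcd =", result)   # gcd = 24
```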

EXAMPLE 4.37
                  Griffin has two unmarked containers. One container holds 17 ounces and the other holds
                  55 ounces. Explain how Griffin can use his two containers to measure exactly one ounce.
                      From the Euclidean algorithm we find that

\[
\begin{aligned}
55 &= 3(17) + 4, && 0 < 4 < 17 \\
17 &= 4(4) + 1, && 0 < 1 < 4.
\end{aligned}
\]

Therefore 1 = 17 - 4(4) = 17 - 4[55 - 3(17)] = 13(17) - 4(55). Consequently, Griffin
                  must fill his smaller (17-ounce) container 13 times and empty the contents (for the first 12
                  times) into the larger container. (Griffin empties the larger container whenever it is full.)
                  Before he fills the smaller container for the thirteenth time, Griffin has 12(17) - 3(55) =
                  204 - 165 = 39 ounces of water in the larger (55-ounce) container. After he fills the smaller
                  container for the thirteenth time, he will empty 16 (= 55 - 39) ounces from this container,
                  filling the larger container. Exactly one ounce will be left in the smaller container.
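
Griffin's pouring routine can be simulated to confirm the count of 13 fills. This is a sketch under stated assumptions: the function name `measure_one_ounce` is hypothetical, the capacities must be relatively prime, and the smaller container must hold at least 2 ounces (otherwise the loop below would never stop).

```python
from math import gcd


def measure_one_ounce(small_cap: int = 17, large_cap: int = 55):
    """Simulate the fill-and-pour routine; return (fills, small, large) at the end."""
    if small_cap < 2 or gcd(small_cap, large_cap) != 1:
        raise ValueError("need small_cap >= 2 and relatively prime capacities")
    small = large = fills = 0
    while True:
        small = small_cap                 # fill the smaller container
        fills += 1
        while small > 0:
            poured = min(small, large_cap - large)
            large += poured
            small -= poured
            if large == large_cap:
                if small == 1:            # exactly one ounce remains
                    return fills, small, large
                large = 0                 # empty the larger container when it is full


print(measure_one_ounce())   # (13, 1, 55)
```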

EXAMPLE 4.38
                  Assisting students in programming classes, Brian finds that on the average he can help a
                  student debug a Java program in six minutes, but it takes 10 minutes to debug a program
                  written in C++. If he works continuously for 104 minutes and doesn't waste any time, how
                  many programs can he debug in each language?
                      Here we seek integers $x, y \ge 0$, where 6x + 10y = 104, or 3x + 5y = 52. As
                  gcd(3, 5) = 1, we can write 1 = 3(2) + 5(-1), so 52 = 3(104) + 5(-52) = 3(104 - 5k)
                  + 5(-52 + 3k), $k \in \mathbf{Z}$. In order to obtain $0 \le x = 104 - 5k$ and $0 \le y = -52 + 3k$, we
                  must have $(52/3) \le k \le (104/5)$. So k = 18, 19, 20 and there are three possible solutions:

                    a) (k = 18):      x = 14,       y = 2
                    b) (k = 19):      x = 9,        y = 5
                    c) (k = 20):      x = 4,        y = 8

The equation in Example 4.38 is an example of a Diophantine equation: a linear equa-
                  tion requiring integer solutions. This type of equation was first investigated by the Greek
                  algebraist Diophantus, who lived in the third century A.D.
                     Having solved one such equation, we seek to discover when a Diophantine equation has
                  a solution. The proof is left to the reader.

THEOREM 4.8       If $a, b, c \in \mathbf{Z}^+$, the Diophantine equation $ax + by = c$ has an integer solution $x = x_0$, $y = y_0$
                  if and only if $\gcd(a, b)$ divides $c$.
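
One way to see Theorem 4.8 at work is to compute a solution whenever gcd(a, b) divides c. The sketch below is an illustration only: it assumes the extended Euclidean algorithm as the method, and the names `extended_gcd` and `solve_linear_diophantine` are hypothetical helpers, not notation from the text.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def solve_linear_diophantine(a, b, c):
    """Return one integer solution (x0, y0) of a*x + b*y == c, or None if none exists."""
    g, s, t = extended_gcd(a, b)
    if c % g != 0:
        return None            # gcd(a, b) does not divide c: no integer solution
    scale = c // g
    return s * scale, t * scale


print(solve_linear_diophantine(3, 5, 52))    # (104, -52), as in Example 4.38
print(solve_linear_diophantine(6, 10, 105))  # None, since gcd(6, 10) = 2 does not divide 105
```

The solution returned need not be nonnegative; all other solutions follow by adding multiples of b/gcd(a, b) to x and subtracting the same multiples of a/gcd(a, b) from y.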

We close this section with a concept that is related to the greatest common divisor.
236           Chapter 4 Properties of the Integers: Mathematical Induction

Definition 4.4    For $a, b, c \in \mathbf{Z}^+$, $c$ is called a common multiple of $a, b$ if $c$ is a multiple of both $a$ and
                  $b$. Furthermore, $c$ is the least common multiple of $a, b$ if it is the smallest of all positive
                  integers that are common multiples of $a, b$. We denote $c$ by $\operatorname{lcm}(a, b)$.

If $a, b \in \mathbf{Z}^+$, then the product $ab$ is a common multiple of both $a$ and $b$. Consequently,
the set of all (positive) common multiples of $a, b$ is nonempty. So it follows from the
Well-Ordering Principle that $\operatorname{lcm}(a, b)$ does exist.

EXAMPLE 4.39
a) Since $12 = 3 \cdot 4$ and no other smaller positive integer is a multiple of both 3 and 4, we
   have $\operatorname{lcm}(3, 4) = 12 = \operatorname{lcm}(4, 3)$. However, $\operatorname{lcm}(6, 15) \ne 90$, for although 90 is a
   multiple of both 6 and 15, there is a smaller multiple, namely, 30. And since no other
   common multiple of 6 and 15 is smaller than 30, it follows that $\operatorname{lcm}(6, 15) = 30$.
b) For all $n \in \mathbf{Z}^+$, we find that $\operatorname{lcm}(1, n) = \operatorname{lcm}(n, 1) = n$.
c) When $a, n \in \mathbf{Z}^+$, we have $\operatorname{lcm}(a, na) = na$. [This statement is a generalization of part
   (b). The earlier statement follows from this one when $a = 1$.]
d) If $a, m, n \in \mathbf{Z}^+$ with $m \le n$, then $\operatorname{lcm}(a^m, a^n) = a^n$. [And $\gcd(a^m, a^n) = a^m$.]

THEOREM 4.9       Let $a, b, c \in \mathbf{Z}^+$, with $c = \operatorname{lcm}(a, b)$. If $d$ is a common multiple of $a$ and $b$, then $c \mid d$.
Proof: If not, then because of the division algorithm we can write $d = qc + r$, where
$0 < r < c$. Since $c = \operatorname{lcm}(a, b)$, it follows that $c = ma$ for some $m \in \mathbf{Z}^+$. Also, $d = na$ for
some $n \in \mathbf{Z}^+$, because $d$ is a multiple of $a$. Consequently, $na = qma + r \Rightarrow (n - qm)a = r > 0$,
and $r$ is a multiple of $a$. In a similar way $r$ is seen to be a multiple of $b$, so $r$ is a
common multiple of $a, b$. But with $0 < r < c$, we contradict the claim that $c$ is the least
common multiple of $a, b$. Hence $c \mid d$.

Our last result for this section ties together the concepts of the greatest common divisor
and the least common multiple. Furthermore, it provides us with a way to calculate $\operatorname{lcm}(a, b)$
for all $a, b \in \mathbf{Z}^+$. The proof of this result is left to the reader.

THEOREM 4.10      For all $a, b \in \mathbf{Z}^+$, $ab = \operatorname{lcm}(a, b) \cdot \gcd(a, b)$.

EXAMPLE 4.40      By virtue of Theorem 4.10 we have the following:

a) For all $a, b \in \mathbf{Z}^+$, if $a, b$ are relatively prime, then $\operatorname{lcm}(a, b) = ab$.
b) The computations in Example 4.36 establish the fact that $\gcd(168, 456) = 24$. As a
   result we find that

   $$\operatorname{lcm}(168, 456) = \frac{(168)(456)}{24} = 3{,}192.$$
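
Theorem 4.10 turns directly into a short computation. The sketch below is illustrative only; it relies on Python's standard `math.gcd`, and the helper name `lcm_via_gcd` is a hypothetical choice made here.

```python
from math import gcd


def lcm_via_gcd(a, b):
    """Least common multiple of positive integers a, b, using ab = lcm(a, b) * gcd(a, b)."""
    return a * b // gcd(a, b)


print(gcd(168, 456))          # 24
print(lcm_via_gcd(168, 456))  # 3192
print(lcm_via_gcd(6, 15))     # 30, as in Example 4.39(a)
```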

EXERCISES 4.4

1. For each of the following pairs $a, b \in \mathbf{Z}^+$, determine $\gcd(a, b)$ and express it as a linear combination of $a, b$.
    a) 231, 1820          b) 1369, 2597          c) 2689, 4001
2. For $a, b \in \mathbf{Z}^+$ and $s, t \in \mathbf{Z}$, what can we say about $\gcd(a, b)$ if
    a) $as + bt = 2$?          b) $as + bt = 3$?
    c) $as + bt = 4$?          d) $as + bt = 6$?
3. For $a, b \in \mathbf{Z}^+$ and $d = \gcd(a, b)$, prove that $\gcd\left(\frac{a}{d}, \frac{b}{d}\right) = 1$.
4. For $a, b, n \in \mathbf{Z}^+$, prove that $\gcd(na, nb) = n \gcd(a, b)$.
5. Let $a, b, c \in \mathbf{Z}^+$ with $c = \gcd(a, b)$. Prove that $c^2$ divides $ab$.
6. Let $n \in \mathbf{Z}^+$.
    a) Prove that $\gcd(n, n + 2) = 1$ or 2.
    b) What possible values can $\gcd(n, n + 3)$ have? What about $\gcd(n, n + 4)$?
    c) If $k \in \mathbf{Z}^+$, what can we say about $\gcd(n, n + k)$?
7. For $a, b, c, d \in \mathbf{Z}^+$, prove that if $d = a + bc$, then $\gcd(b, d) = \gcd(a, b)$.
8. Let $a, b, c \in \mathbf{Z}^+$ with $\gcd(a, b) = 1$. If $a \mid c$ and $b \mid c$, prove that $ab \mid c$. Does the result hold if $\gcd(a, b) \ne 1$?
9. Let $a, b \in \mathbf{Z}$, where at least one of $a, b$ is nonzero.
    a) Using quantifiers, restate the definition for $c = \gcd(a, b)$, where $c \in \mathbf{Z}^+$.
    b) Use the result in part (a) in order to decide when $c \ne \gcd(a, b)$ for some $c \in \mathbf{Z}^+$.
10. If $a, b$ are relatively prime and $a > b$, prove that $\gcd(a - b, a + b) = 1$ or 2.
11. Let $a, b, c \in \mathbf{Z}^+$ with $\gcd(a, b) = 1$. If $a \mid bc$, prove that $a \mid c$.
12. Let $a, b \in \mathbf{Z}^+$ where $a > b$. Prove that $\gcd(a, b) = \gcd(a - b, b)$.
13. Prove that for any $n \in \mathbf{Z}^+$, $\gcd(5n + 3, 7n + 4) = 1$.
14. An executive buys $2490 worth of presents for the children of her employees. For each girl she gets an art kit costing $33; each boy receives a set of tools costing $29. How many presents of each type did she buy?
15. After a weekend at the Mohegan Sun Casino, Gary finds that he has won $1020, in $20 and $50 chips. If he has more $50 chips than $20 chips, how many chips of each denomination could he possibly have?
16. Let $a, b \in \mathbf{Z}^+$. Prove that there exist $c, d \in \mathbf{Z}^+$ such that $cd = a$ and $\gcd(c, d) = b$ if and only if $b^2 \mid a$.
17. Determine those values of $c \in \mathbf{Z}^+$, $10 < c < 20$, for which the Diophantine equation $84x + 990y = c$ has no solution. Determine the solutions for the remaining values of $c$.
18. Verify Theorems 4.8 and 4.10.
19. If $a, b \in \mathbf{Z}^+$ with $a = 630$, $\gcd(a, b) = 105$, and $\operatorname{lcm}(a, b) = 242{,}550$, what is $b$?
20. For each pair $a, b$ in Exercise 1, find $\operatorname{lcm}(a, b)$.
21. For each $n \in \mathbf{Z}^+$, what are $\gcd(n, n + 1)$ and $\operatorname{lcm}(n, n + 1)$?
22. Prove that $\operatorname{lcm}(na, nb) = n \operatorname{lcm}(a, b)$ for all $n, a, b \in \mathbf{Z}^+$.

4.5
The Fundamental Theorem of Arithmetic
In this section we extend Lemma 4.1 and show that for each $n \in \mathbf{Z}^+$, $n > 1$, either $n$ is
prime or $n$ can be written as a product of primes, where the representation is unique up to
order. This result, known as the Fundamental Theorem of Arithmetic, can be found in an
equivalent form in Book IX of Euclid's Elements.
   The following two lemmas will help us accomplish our goal.

LEMMA 4.2         If $a, b \in \mathbf{Z}^+$ and $p$ is prime, then $p \mid ab \Rightarrow p \mid a$ or $p \mid b$.
Proof: If $p \mid a$, then we are finished. If not, then because $p$ is prime, it follows that $\gcd(p, a) = 1$,
and so there exist integers $x, y$ with $px + ay = 1$. Then $b = p(bx) + (ab)y$, where $p \mid p$
and $p \mid ab$. So it follows from parts (d) and (e) of Theorem 4.3 that $p \mid b$.

LEMMA 4.3         Let $a_i \in \mathbf{Z}^+$ for all $1 \le i \le n$. If $p$ is prime and $p \mid a_1 a_2 \cdots a_n$, then $p \mid a_i$ for some $1 \le i \le n$.
Proof: We leave the proof of this result to the reader.

Using Lemma 4.2 we now have another opportunity to establish a result by the method
                                 of proof by contradiction.

EXAMPLE 4.41
We want to show that $\sqrt{2}$ is irrational.
If not, we can write $\sqrt{2} = a/b$, where $a, b \in \mathbf{Z}^+$ and $\gcd(a, b) = 1$. Then $\sqrt{2} = a/b \Rightarrow
2 = a^2/b^2 \Rightarrow 2b^2 = a^2 \Rightarrow 2 \mid a^2 \Rightarrow 2 \mid a$. (Why?) Also, $2 \mid a \Rightarrow a = 2c$ for some $c \in \mathbf{Z}^+$, so
$2b^2 = a^2 = (2c)^2 = 4c^2$ and $b^2 = 2c^2$. But then $2 \mid b^2 \Rightarrow 2 \mid b$. Since 2 divides both $a$ and
$b$, it follows that $\gcd(a, b) \ge 2$, but this contradicts the earlier claim that $\gcd(a, b) = 1$.
[Note: The preceding proof for the irrationality of $\sqrt{2}$ was known to Aristotle (384-322 B.C.)
and is similar to that given in Book X of Euclid's Elements.]

Before we turn to the main result for this section, let us point out that the integer 2 in
the preceding example is not that special. The reader will be asked to show in the Section
Exercises that in fact $\sqrt{p}$ is irrational for every prime $p$. Now that we have mentioned this
fact, it is time to present the Fundamental Theorem of Arithmetic.

THEOREM 4.11      Every integer $n > 1$ can be written as a product of primes uniquely, up to the order of the
                  primes. (Here a single prime is considered a product of one factor.)
Proof: The proof consists of two parts: The first part covers the existence of a prime
factorization, and the second part deals with its uniqueness.
    If the first part is not true, let $m > 1$ be the smallest integer not expressible as a product
of primes. Since $m$ is not a prime, we are able to write $m = m_1 m_2$, where $1 < m_1 < m$,
$1 < m_2 < m$. But then $m_1, m_2$ can be written as products of primes, because they are less
than $m$. Consequently, with $m = m_1 m_2$ we can obtain a prime factorization of $m$.
    In order to establish the uniqueness of a prime factorization, we shall use the alternative
form of the Principle of Mathematical Induction (Theorem 4.2). For the integer 2, we have
a unique prime factorization, and assuming uniqueness of representation for $3, 4, 5, \ldots,
n - 1$, we suppose that $n = p_1^{s_1} p_2^{s_2} \cdots p_k^{s_k} = q_1^{t_1} q_2^{t_2} \cdots q_r^{t_r}$, where each $p_i$, $1 \le i \le k$, and
each $q_j$, $1 \le j \le r$, is a prime. Also $p_1 < p_2 < \cdots < p_k$, and $q_1 < q_2 < \cdots < q_r$, and
$s_i > 0$ for all $1 \le i \le k$, $t_j > 0$ for all $1 \le j \le r$.
    Since $p_1 \mid n$, we have $p_1 \mid q_1^{t_1} q_2^{t_2} \cdots q_r^{t_r}$. By Lemma 4.3, $p_1 \mid q_j$ for some $1 \le j \le r$. Because
$p_1$ and $q_j$ are primes, we have $p_1 = q_j$. In fact $j = 1$, for otherwise $q_1 \mid n \Rightarrow q_1 = p_e$ for
some $1 < e \le k$ and $p_1 < p_e = q_1 < q_j = p_1$. With $p_1 = q_1$, we find that
$n_1 = n/p_1 = p_1^{s_1 - 1} p_2^{s_2} \cdots p_k^{s_k} = q_1^{t_1 - 1} q_2^{t_2} \cdots q_r^{t_r}$. Since $n_1 < n$, by the induction hypothesis it follows
that $k = r$, $p_i = q_i$ for $1 \le i \le k$, $s_1 - 1 = t_1 - 1$ (so $s_1 = t_1$), and $s_i = t_i$ for $2 \le i \le k$.
Hence the prime factorization of $n$ is unique.

This result is now used in the following five examples.

EXAMPLE 4.42
For the integer 980,220 we can determine the prime factorization as follows:

$$980{,}220 = 2^1(490{,}110) = 2^2(245{,}055) = 2^2 3^1(81{,}685) = 2^2 3^1 5^1(16{,}337)
           = 2^2 3^1 5^1 17^1(961) = 2^2 \cdot 3 \cdot 5 \cdot 17 \cdot 31^2$$
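
The step-by-step extraction of prime factors in Example 4.42 can be mimicked with trial division. This is a minimal sketch under that assumption; it is one simple method among several, and the name `prime_factorization` is a hypothetical label.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n > 1, found by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:              # strip out every copy of the factor p
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                          # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factorization(980220))  # {2: 2, 3: 1, 5: 1, 17: 1, 31: 2}
```

A composite trial divisor p never divides n at the point it is tested, because its prime factors have already been removed.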

EXAMPLE 4.43
Suppose that $n \in \mathbf{Z}^+$ and that

(*)     $10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot n = 21 \cdot 20 \cdot 19 \cdot 18 \cdot 17 \cdot 16 \cdot 15 \cdot 14.$

Since 17 is a prime factor of the integer on the right-hand side of Eq. (*) it must also
be a factor for the left-hand side (by the uniqueness part of the Fundamental Theorem of
Arithmetic). But 17 does not divide any of the factors $10, 9, 8, \ldots, 3$ or 2, so it follows that
$17 \mid n$. (A similar argument shows us that $19 \mid n$.)
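
The conclusion of Example 4.43 can be confirmed numerically by solving Eq. (*) for n. The sketch below is only a check, not part of the argument; the variable names are chosen here for illustration, and `math.prod` requires Python 3.8 or later.

```python
from math import prod

lhs_without_n = prod(range(2, 11))   # 10 * 9 * 8 * ... * 2
rhs = prod(range(14, 22))            # 21 * 20 * 19 * ... * 14

n, remainder = divmod(rhs, lhs_without_n)
print(n, remainder)      # 2261 0
print(n % 17, n % 19)    # 0 0, so both 17 and 19 divide n
```

Here n = 2261 = 7 · 17 · 19, in agreement with the divisibility claims in the example.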

EXAMPLE 4.44
For $n \in \mathbf{Z}^+$, we want to count the number of positive divisors of $n$. For example, the number
2 has two positive divisors: 1 and itself. Likewise, 1 and 3 are the only positive divisors of
3. In the case of 4, we find the three positive divisors 1, 2, and 4.
   To determine the result for each $n \in \mathbf{Z}^+$, $n > 1$, we use Theorem 4.11 and write
$n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, where for each $1 \le i \le k$, $p_i$ is a prime and $e_i > 0$. If $m \mid n$, then
$m = p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k}$, where $0 \le f_i \le e_i$ for all $1 \le i \le k$. So by the rule of product, the
number of positive divisors of $n$ is

$$(e_1 + 1)(e_2 + 1) \cdots (e_k + 1).$$

For example, since $29{,}338{,}848{,}000 = 2^8 3^5 5^3 7^3 11$, we find that 29,338,848,000 has
$(8 + 1)(5 + 1)(3 + 1)(3 + 1)(1 + 1) = (9)(6)(4)(4)(2) = 1728$ positive divisors.
   Should we want to know how many of these 1728 divisors are multiples of $360 = 2^3 3^2 5$,
then we must realize that we want to count the integers of the form $2^{t_1} 3^{t_2} 5^{t_3} 7^{t_4} 11^{t_5}$ where

$$3 \le t_1 \le 8, \quad 2 \le t_2 \le 5, \quad 1 \le t_3 \le 3, \quad 0 \le t_4 \le 3, \quad \text{and} \quad 0 \le t_5 \le 1.$$

Consequently, the number of positive divisors of 29,338,848,000 that are divisible by
360 is

$$[(8 - 3) + 1][(5 - 2) + 1][(3 - 1) + 1][(3 - 0) + 1][(1 - 0) + 1] = (6)(4)(3)(4)(2) = 576.$$

   To determine how many of the 1728 positive divisors of 29,338,848,000 are perfect
squares, we need to consider all divisors of the form $2^{s_1} 3^{s_2} 5^{s_3} 7^{s_4} 11^{s_5}$, where each of $s_1, s_2, s_3,
s_4, s_5$ is an even nonnegative integer. Consequently, here we have

   5 choices for $s_1$, namely, 0, 2, 4, 6, 8;
   3 choices for $s_2$, namely, 0, 2, 4;
   2 choices for each of $s_3, s_4$, namely, 0, 2; and
   1 choice for $s_5$, namely, 0.

It then follows that the number of positive divisors of 29,338,848,000 that are perfect
squares is $(5)(3)(2)(2)(1) = 60$.
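
The three counts in Example 4.44 depend only on the exponents in the prime factorization, so they are easy to recompute. The following sketch assumes the factorization is supplied as a dictionary mapping each prime to its exponent; the function names are hypothetical and were introduced here for illustration.

```python
from math import prod

# Exponents in 29,338,848,000 = 2^8 * 3^5 * 5^3 * 7^3 * 11
n_exps = {2: 8, 3: 5, 5: 3, 7: 3, 11: 1}
# Exponents in 360 = 2^3 * 3^2 * 5
d_exps = {2: 3, 3: 2, 5: 1}


def count_divisors(exps):
    """Number of positive divisors: the product of (e_i + 1)."""
    return prod(e + 1 for e in exps.values())


def count_divisors_divisible_by(exps, divisor_exps):
    """Number of divisors of n that are multiples of d, assuming d divides n."""
    return prod(e - divisor_exps.get(p, 0) + 1 for p, e in exps.items())


def count_square_divisors(exps):
    """Number of divisors that are perfect squares: each exponent must be even."""
    return prod(e // 2 + 1 for e in exps.values())


print(count_divisors(n_exps))                       # 1728
print(count_divisors_divisible_by(n_exps, d_exps))  # 576
print(count_square_divisors(n_exps))                # 60
```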

For our next example we shall need the multiplicative counterpart of the Sigma-notation
(for addition) that we first observed in Section 1.3. Here we use the capital Greek letter $\Pi$
for the Pi-notation.
   We can use the Pi-notation to express the product $x_1 x_2 x_3 x_4 x_5 x_6$, for example, as $\prod_{i=1}^{6} x_i$.
In general, one can express the product of the $n - m + 1$ terms $x_m, x_{m+1}, x_{m+2}, \ldots, x_n$,
where $m, n \in \mathbf{Z}$ and $m \le n$, as $\prod_{i=m}^{n} x_i$. As with the Sigma-notation the letter $i$ is called
the index of the product, and here this index accounts for all $n - m + 1$ integers starting
with the lower limit $m$ and continuing on up to (and including) the upper limit $n$.
   This notation is demonstrated in the following:

1) $\prod_{i=3}^{7} x_i = x_3 x_4 x_5 x_6 x_7 = \prod_{j=3}^{7} x_j$, since there is nothing special about the letter $i$;
2) $\prod_{i=3}^{6} i = 3 \cdot 4 \cdot 5 \cdot 6 = 6!/2!$;
3) $\prod_{i=m}^{n} i = m(m + 1)(m + 2) \cdots (n - 1)(n) = n!/(m - 1)!$, for all $m, n \in \mathbf{Z}^+$ with
   $m \le n$; and
4) $\prod_{i=7}^{11} x_i = x_7 x_8 x_9 x_{10} x_{11} = \prod_{j=0}^{4} x_{7+j} = \prod_{j=0}^{4} x_{11-j}$.
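
Items (2) through (4) can be spot-checked with Python's `math.prod`. This is a small sketch for illustration; the helper `product_from_m_to_n` and the sample values stored in `x` are assumptions made here, not data from the text.

```python
from math import factorial, prod


def product_from_m_to_n(m, n):
    """The product of the integers m, m+1, ..., n, for 1 <= m <= n."""
    return prod(range(m, n + 1))


# Item (2): 3 * 4 * 5 * 6 = 6!/2!
print(product_from_m_to_n(3, 6), factorial(6) // factorial(2))   # 360 360

# Item (3): the product of m, ..., n equals n!/(m - 1)!
print(product_from_m_to_n(5, 9) == factorial(9) // factorial(4))  # True

# Item (4): shifting or reversing the index leaves the product unchanged.
x = {i: i + 1 for i in range(7, 12)}           # arbitrary sample values x_7, ..., x_11
direct = prod(x[i] for i in range(7, 12))
shifted = prod(x[7 + j] for j in range(0, 5))
reversed_index = prod(x[11 - j] for j in range(0, 5))
print(direct == shifted == reversed_index)     # True
```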

EXAMPLE 4.45
If $m, n \in \mathbf{Z}^+$, let $m = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$ and $n = p_1^{f_1} p_2^{f_2} \cdots p_t^{f_t}$, with each $p_i$ prime and $0 \le e_i$
and $0 \le f_i$ for all $1 \le i \le t$. Then if $a_i = \min\{e_i, f_i\}$, the minimum (or smaller) of $e_i$ and
$f_i$, and $b_i = \max\{e_i, f_i\}$, the maximum (or larger) of $e_i$ and $f_i$, for all $1 \le i \le t$, we have

$$\gcd(m, n) = p_1^{a_1} p_2^{a_2} \cdots p_t^{a_t} = \prod_{i=1}^{t} p_i^{a_i} \quad \text{and} \quad \operatorname{lcm}(m, n) = p_1^{b_1} p_2^{b_2} \cdots p_t^{b_t} = \prod_{i=1}^{t} p_i^{b_i}.$$

For example, let $m = 491{,}891{,}400 = 2^3 3^3 5^2 7^2 11^1 13^2$ and let $n = 1{,}138{,}845{,}708 =
2^2 3^2 7^1 11^2 13^3 17^1$. Then with $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, $p_4 = 7$, $p_5 = 11$, $p_6 = 13$, and
$p_7 = 17$, we find $a_1 = 2$, $a_2 = 2$, $a_3 = 0$ (the exponent of 5 in the prime factorization of $n$
must be 0, because 5 does not appear in the prime factorization), $a_4 = 1$, $a_5 = 1$, $a_6 = 2$,
and $a_7 = 0$. So

$$\gcd(m, n) = 2^2 3^2 5^0 7^1 11^1 13^2 17^0 = 468{,}468.$$

We also have

$$\operatorname{lcm}(m, n) = 2^3 3^3 5^2 7^2 11^2 13^3 17^1 = 1{,}195{,}787{,}993{,}400.$$
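
The min/max rule of Example 4.45 can be carried out on exponent tables directly. The sketch below assumes each integer is given as a dictionary from primes to exponents, with a missing prime treated as exponent 0; the helper `from_exponents` is a hypothetical name used only here.

```python
from math import gcd, prod

m_exps = {2: 3, 3: 3, 5: 2, 7: 2, 11: 1, 13: 2}    # m = 491,891,400
n_exps = {2: 2, 3: 2, 7: 1, 11: 2, 13: 3, 17: 1}   # n = 1,138,845,708


def from_exponents(exps):
    """Rebuild the integer from its {prime: exponent} table."""
    return prod(p ** e for p, e in exps.items())


primes = sorted(set(m_exps) | set(n_exps))
gcd_exps = {p: min(m_exps.get(p, 0), n_exps.get(p, 0)) for p in primes}
lcm_exps = {p: max(m_exps.get(p, 0), n_exps.get(p, 0)) for p in primes}

m, n = from_exponents(m_exps), from_exponents(n_exps)
print(from_exponents(gcd_exps))   # 468468
print(from_exponents(lcm_exps))   # 1195787993400
print(from_exponents(gcd_exps) == gcd(m, n))   # True, agreeing with the Euclidean algorithm
```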

Our final result for this section ties together the Fundamental Theorem of Arithmetic
with the fact that any two consecutive integers are relatively prime (as observed in Exercise
21 for Section 4.4).

EXAMPLE 4.46
Here we seek an answer to the following question. Can we find three consecutive positive
integers whose product is a perfect square? That is, do there exist $m, n \in \mathbf{Z}^+$ with
$m(m + 1)(m + 2) = n^2$?
    Suppose that such positive integers $m, n$ do exist. We recall that $\gcd(m, m + 1) = 1 =
\gcd(m + 1, m + 2)$, so for any prime $p$, if $p \mid (m + 1)$, then $p \nmid m$ and $p \nmid (m + 2)$. Furthermore,
if $p \mid (m + 1)$, it follows that $p \mid n^2$. And since $n^2$ is a perfect square, by the Fundamental
Theorem of Arithmetic, we find that the exponents on $p$ in the prime factorizations of both
$m + 1$ and $n^2$ must be the same even integer. This is true for each prime divisor of $m + 1$,
so $m + 1$ is a perfect square. With $n^2$ and $m + 1$ both being perfect squares, we conclude
that the product $m(m + 2)$ is also a perfect square. However, the product $m(m + 2)$ is such
that $m^2 < m^2 + 2m = m(m + 2) < m^2 + 2m + 1 = (m + 1)^2$. Consequently, we find that
$m(m + 2)$ is wedged between two consecutive perfect squares, and is not equal to either
of them. So $m(m + 2)$ cannot be a perfect square, and there are no three consecutive positive
integers whose product is a perfect square.
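
A finite search cannot replace the proof in Example 4.46, but it offers a quick sanity check. The sketch below tests every m up to an arbitrary bound, `LIMIT`, chosen here purely for illustration; it uses `math.isqrt` (Python 3.8 or later) so that the square test stays exact for large integers.

```python
from math import isqrt


def is_perfect_square(k):
    """Exact test for whether the nonnegative integer k is a perfect square."""
    r = isqrt(k)
    return r * r == k


LIMIT = 10_000   # arbitrary search bound, not taken from the text

# Values of m for which m(m + 1)(m + 2) is a perfect square (none are expected).
exceptions = [m for m in range(1, LIMIT + 1)
              if is_perfect_square(m * (m + 1) * (m + 2))]
print(exceptions)   # []

# The key inequality: m(m + 2) lies strictly between m^2 and (m + 1)^2.
print(all(m * m < m * (m + 2) < (m + 1) ** 2 for m in range(1, LIMIT + 1)))   # True
```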

EXERCISES 4.5

1. Write each of the following integers as a product of primes $p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$, where $0 < n_i$ for all $1 \le i \le k$ and $p_1 < p_2 < \cdots < p_k$.
    a) 148,500          b) 7,114,800          c) 7,882,875
2. Determine the greatest common divisor and the least common multiple for each pair of integers in the preceding exercise.
3. Let $t \in \mathbf{Z}^+$ and $p_1, p_2, p_3, \ldots, p_t$ be distinct primes. If $m \in \mathbf{Z}^+$ has prime factorization $p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$, what is the prime factorization of (a) $m^2$? (b) $m^3$?
4. Verify Lemma 4.3.
5. Prove that $\sqrt{p}$ is irrational for any prime $p$.
6. The change machine at Cheryll's laundromat contains $n$ quarters, $2n$ nickels, and $4n$ dimes, where $n \in \mathbf{Z}^+$. Find all values of $n$ so that these coins total $k$ dollars, where $k \in \mathbf{Z}^+$.

7. Find the number of positive divisors for each integer in                    a) {4, 8, 16, 32}?              b) {4, 8, 16, 32, 64}?
Exercise 1.                                                                     c) (4, 8, 9, 16, 27, 32, 64, 81, 243}?
8. a) How many positive divisors are there for                                 d) {4, 8, 9, 16, 25, 27, 32, 64, 81, 125, 243, 625, 729,
                          n= 29375879119 13°37?                                 3125}?
    b) For the divisors in part (a), how many are                               OP PP PP. PPG.                                                  grr      3   r’,   rh,

i)    divisible by 2°3457 117377?                                  where p. g, and r are distinct primes?
           ii)     divisible by 1,166,400,000?                             20. Write a computer program (or develop an algorithm) to find
          iii)     perfect squares?                                        the prime factorization of an integer n > 1.
          iv)      perfect squares that are divisible by 273457117?
                                                                           21. In triangle ABC the length of side BC is 293. If the length
           Vv)     perfect cubes?
                                                                           of side AB is a perfect square, the length of side AC is a power
          vi)      perfect cubes that are multiples of
                                                                           of 2, and the length of side AC is twice the length of side AB,
                   2193°5?7° 117132377?
                                                                           determine the perimeter of the triangle.
       vii)        perfect squares and perfect cubes?
                                                                           22. Express each of the following in simplest form.
9, Letm,n € Z* with mn = 24345°7!11°13'. Hflem(m, n) =
                                                                                       10
273°577!11713!, what is gcd(m, n)?
10. Extend the results in Example 4.45 and find the greatest
                                                                                a) | ](-1)
                                                                                      7=]

common divisor and least common multiple for the three inte-                          2n+1

gers in Exercise 1.                                                             b) |] (—1)', wheren € Z*
                                                                                       r=]

11. How           many   positive   integers   n   divide    100137n+                  8
                                                                                             G+
                                                                                               .          ‘
                                                                                                      1)G +2)
248396544?
                                                                                °) I               G — DW)
12. Let a € Z~. Find the smallest value of a for which 2a is a                         2n
perfect square and 3a is a perfect cube.                                        d) I]        Spray                 Where         eZ

13. a) Let a € Z*. Prove or disprove: (i) If 10|a*, then 10]a;
    and (ii) If 4|a, then 4|a.                                             23. a) Let n = 88,200. In how many ways can one factor n as
                                                                                ab where 1 <a                 <n,1<b<n,and                    ged(a, b) = 1. (Note:
    b) Generalize the true result(s) in part (a).
                                                                                Here        order     is not      relevant.       So,   for example,     a = 8, b =
14, Let a, b,c € {0, 1, 2,..., 9} with at least one of a, b,c                   11,025, anda = 11,025, b = 8 result in the same unordered
nonzero. Prove that the six-digit integer abcabc is divisible by                factorization.)
at least three distinct primes.                                                 b) Answer part (a) for 2 = 970,200.
15. Determine the smallest perfect square that is divisible by 7!               c) Generalize the results in parts (a) and (b).
16. For all n € Z*, prove that n is a perfect square if and only           24. Use the Pi-notation to write each of the following.
if # has an odd number of positive divisors.
                                                                                a) (17 + 1)(2? 4+ 2)(3* + 3)(4 + 4)(5? +5)
17. Find the smallest positive integer n for which the product
1260 X n is a perfect cube.                                                     b) (l+x)Q 4x70                       4+.4)0 +24)                4+.x°)
18. Two hundred coins numbered 1 to 200 are put in a row                        ec) d+ x04                0°)        +2°)0 +47)04 x90                    4x!)
across the top of a cafeteria table. Two hundred students are              25. Prove that ifn € Z* and n > 2, then
assigned numbers (from 1 to 200) and are asked to turn over                                                   -             1           n+l
certain coins. The student assigned number !| is supposed to turn
                                                                                                          1-2              i2            2n
over all the coins. The student assigned number 2 is supposed to
                                                                           26. When does a positive integer n have exactly
turn over every other coin, starting with the second coin. In gen-
eral, the student assigned the number n, for each           1 < n < 200,        a) two positive divisors?                       — b) three positive divisors?
is supposed to turn over every nth coin, starting with the nth                  c) four positive divisors?                        —_d) five positive divisors?
coin.
                                                                           27. Let ne Z*. We                      say that n is a perfect integer if 2n
    a) How many times will the 200th coin be turned over?                  equals the sum of all the positive divisors of n. For example,
    b) Will any other coin(s) be turned over as many times as              since 2(6) =            12 =   1+42+43-+4             6, it follows that 6 is a perfect
    the 200th coin?                                                        integer.
    c) Will any coin be turned over more times than the 200th                   a) Verify that 28 and 496 are perfect integers.
    coin?                                                                       b) Ifm € Z* and 2” — {is prime, prove that 2”~'(2” — 1)
19. How many different products can one obtain by multiplying                   is a perfect integer. [You may find the result from part (a)
any two (distinct) integers in the set                                          of Exercise 2 for Section 4.1 useful here.]

4.6
      Summary and Historical Review
                          According to the Prussian mathematician Leopold Kronecker (1823-1891), “God made the
                          integers, all the rest is the work of man... . All results of the profoundest mathematical
                          investigation must ultimately be expressible in the simple form of properties of the integers.”
                          In the spirit of this quotation, we find in this chapter how the handiwork of the Almighty
                          has been further developed by men and women over the last 24 centuries.
                              Starting in the fourth century B.C. we find in Euclid’s Elements not only the geometry of
                          our high school experience but also the fundamental ideas of number theory. Propositions
                          1 and 2 of Euclid’s Book VII include an example of an algorithm to determine the greatest
                          common divisor of two positive integers by using an efficient technique to solve, in a finite
                          number of steps, a specific type of problem.
                             The term algorithm, like its predecessor algorism, was unknown to Euclid. In fact, this
                          term did not enter the vocabulary of most people until the late 1950s when the computer
                          revolution began to make its impact on society. The word comes from the name of the
                          famous    Islamic mathematician,          astronomer, and textbook writer Abu     Ja’far Mohammed
                          ibn Musa al-Khowarizmi (c. 780-850). The last part of his name, al-Khowarizmi, which is
                          translated as “a man from the town of Khowarizm,” gave rise to the term algorism. The word
                          algebra comes      from al-jabr, which        is contained in the title of al-Khowarizmi’s textbook
                          Kitab al-jabr w’al muquabala. Translated into Latin during the thirteenth century, this book
                          had a profound impact on the mathematics developed during the European Renaissance.


Euclid (c. 400 B.C.)                            Al-Khowarizmi (c. 780-850)

As mentioned in Section 4.4, our use of the word algorithm connotes a precise step-by-
                          step method for solving a problem in a finite number of steps. The first person credited with
                          developing the concept of a computer algorithm was Augusta Ada Byron (1815-1852),
                          the Countess of Lovelace. The only child of the famous poet Lord Byron and Annabella
                          Millbanke, Augusta Ada was raised by a mother who encouraged her intellectual talents.
                          Trained in mathematics by the likes of Augustus DeMorgan (1806-1871), she continued
                          her studies by assisting the gifted English mathematician Charles Babbage (1792-1871) in
                          the development of his design for an early computing machine, the “Analytical Engine.”

The most complete accounts of this machine are found in her writings, wherein one finds
a great deal of literary talent along with the essence of the modern computer algorithm.
Further details on the work of Charles Babbage and Augusta Ada Byron Lovelace can be
found in Chapter 2 of the work by S. Augarten [1].

Augusta Ada Byron, Countess of Lovelace (1815-1852)

In the century following Euclid, we find some number theory in the work of Eratosthenes.
However, it was not until five centuries later that the first major new accomplishments in the
field were made by Diophantus of Alexandria. In his work Arithmetica, his integer solutions
of linear (and higher-order) equations stood as a mathematical beacon in number theory
until the French mathematician Pierre de Fermat (1601-1665) came on the scene.
    The problem we stated in Theorem 4.8 was investigated by Diophantus and further
analyzed during the seventh century by Hindu mathematicians, but it was not actually
solved completely until the 1860s, by Henry John Stephen Smith (1826-1883).
    For more on some of these mathematicians and others who have worked in the theory
of numbers, consult L. Dickson [4]. Chapter 5 in I. Niven, H. S. Zuckerman, and H. L.
Montgomery [10] deals with the solutions of Diophantine equations and their applications.
   In the work Formulario Matematico, published in 1889, Giuseppe Peano (1858-1932)
formulated the set of nonnegative integers on the basis of three undefined terms: zero,
number, and successor. His formulation is as follows:

  a) Zero is a number.
  b) For each number n, its successor is a number.
  c) No number has zero as its successor.
  d) If two numbers m, n have the same successor, then m = n.
  e) If T is a set of numbers where $0 \in T$, and where the successor of n is in T whenever
     n is in T, then T is the set of all numbers.

In these postulates the notion of order (successor) and the technique called mathematical
induction are seen to be intimately related to the idea of number (that is, nonnegative
integer). Peano attributed the formulation to Richard Dedekind (1831-1916), who was the
first to develop these ideas; nonetheless, these postulates are generally known as “Peano’s
postulates.”
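
To make the link between successor and induction concrete, here is a minimal Python sketch of the postulates. The class `Nat` and the helpers `succ`, `add`, and `to_int` are names invented for this illustration; they do not come from Peano or from the text. Addition is defined by recursion on the second argument, which mirrors the inductive postulate (e).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nat:
    """A number is either zero (pred is None) or the successor of pred."""
    pred: Optional["Nat"] = None


ZERO = Nat()                      # postulate (a): zero is a number


def succ(n: Nat) -> Nat:
    """Postulate (b): every number has a successor that is a number."""
    return Nat(n)


def add(m: Nat, n: Nat) -> Nat:
    """m + 0 = m and m + succ(k) = succ(m + k): recursion on n."""
    return m if n.pred is None else succ(add(m, n.pred))


def to_int(n: Nat) -> int:
    """Count how many times succ was applied (for display only)."""
    count = 0
    while n.pred is not None:
        n = n.pred
        count += 1
    return count


two = succ(succ(ZERO))
three = succ(two)
print(to_int(add(two, three)))
```

Because the dataclass compares by structure, `succ(a) == succ(b)` holds exactly when `a == b`, which corresponds to postulate (d), and `succ(n)` never equals `ZERO`, which corresponds to postulate (c).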

The first European to apply the Principle of Mathematical Induction in proofs was the
                       Venetian scientist Francisco Maurocylus (1491-1575). His book, Arithmeticorum Libri
                       Duo (published in 1575), contains a proof, by mathematical induction, that the sum of
                       the first n positive odd integers is $n^2$. In the next century, Pierre de Fermat made further
                       improvements on the technique in his work involving “the method of infinite descent.”
                       Blaise Pascal (c. 1653), in proving such combinatorial results as
                       $C(n, k)/C(n, k + 1) = (k + 1)/(n - k)$, $0 \le k \le n - 1$, used induction and referred to the technique as the work
                       of Maurocylus. The actual term mathematical induction was not used, however, until the
                       nineteenth century when it appeared in the work of Augustus DeMorgan (1806-1871). In
                       1838 he described the process with great care and gave it the name mathematical induction.
                       (An interesting survey on this topic is found in the article by W. H. Bussey [2].)
                           The text by B. K. Youse [13] illustrates many varied applications of the Principle of
                       Mathematical Induction in algebra, geometry, and trigonometry. For more on the relevance
                       of this method of proof to the problems of programming and the development of algorithms,
                       the text by M. Wand [12] (especially Chapter 2) provides ample background and examples.
                           More on the theory of numbers can be found in the texts by G. H. Hardy and E. M.
                       Wright [5], W. J. LeVeque [7, 8], and I. Niven, H. S. Zuckerman, and H. L. Montgomery
                       [10]. At a level comparable to that of this chapter, Chapter 3 of V. H. Larney [6] provides an
                       enjoyable introduction to this material. The text by K. H. Rosen [11] integrates applications
                       in cryptography and computer science in its development of the subject. The journal article
                       by M. J. Collison [3] examines the history of the Fundamental Theorem of Arithmetic. The
                       articles in [9] recount some interesting developments in number theory.

REFERENCES

 1. Augarten, Stan. BIT by BIT, An Illustrated History of Computers. New York: Ticknor & Fields, 1984.
 2. Bussey, W. H. “Origins of Mathematical Induction.” American Mathematical Monthly 24 (1917): pp. 199-207.
 3. Collison, Mary Joan. “The Unique Factorization Theorem: From Euclid to Gauss.” Mathematics Magazine 53 (1980): pp. 96-100.
 4. Dickson, L. History of the Theory of Numbers. Washington, D.C.: Carnegie Institution of Washington, 1919. Reprinted by Chelsea, in New York, in 1950.
 5. Hardy, Godfrey Harold, and Wright, Edward Maitland. An Introduction to the Theory of Numbers, 5th ed. Oxford: Oxford University Press, 1979.
 6. Larney, Violet Hachmeister. Abstract Algebra: A First Course. Boston: Prindle, Weber & Schmidt, 1975.
 7. LeVeque, William J. Elementary Theory of Numbers. Reading, Mass.: Addison-Wesley, 1962.
 8. LeVeque, William J. Topics in Number Theory, Vols. I and II. Reading, Mass.: Addison-Wesley, 1956.
 9. LeVeque, William J., ed. Studies in Number Theory. MAA Studies in Mathematics, Vol. 6. Englewood Cliffs, N.J.: Prentice-Hall, 1969. Published by the Mathematical Association of America.
10. Niven, Ivan, Zuckerman, Herbert S., and Montgomery, Hugh L. An Introduction to the Theory of Numbers, 5th ed. New York: Wiley, 1991.
11. Rosen, Kenneth H. Elementary Number Theory, 4th ed. Reading, Mass.: Addison-Wesley, 2000.
12. Wand, Mitchell. Induction, Recursion, and Programming. New York: Elsevier North Holland, 1980.
13. Youse, Bevan K. Mathematical Induction. Englewood Cliffs, N.J.: Prentice-Hall, 1964.

SUPPLEMENTARY EXERCISES

1. Let a, d be fixed integers. Determine a summation formula for $a + (a + d) + (a + 2d) + \cdots + (a + (n - 1)d)$, for $n \in \mathbf{Z}^+$. Verify your result by mathematical induction.

2. In the following pseudocode program segment the variables n and sum are integer variables. Following the execution of this program segment, which value of n is printed?

      n := 3
      sum := 0
      while sum < 10,000 do
        begin
          n := n + 7
          sum := sum + n
        end
      print n

3. Consider the following five equations.

      1) $1 = 1$
      2) $1 - 4 = -(1 + 2)$
      3) $1 - 4 + 9 = 1 + 2 + 3$
      4) $1 - 4 + 9 - 16 = -(1 + 2 + 3 + 4)$
      5) $1 - 4 + 9 - 16 + 25 = 1 + 2 + 3 + 4 + 5$

   Conjecture the general formula suggested by these five equations, and prove your conjecture.

4. For $n \in \mathbf{Z}^+$, prove each of the following by mathematical induction:
      a) $5 \mid (n^5 - n)$
      b) $6 \mid (n^3 + 5n)$

5. For all $n \in \mathbf{Z}^+$, let S(n) be the open statement: $n^2 + n + 41$ is prime.
      a) Verify that S(n) is true for all $1 \le n \le 9$.
      b) Does the truth of S(k) imply that of S(k + 1) for all $k \in \mathbf{Z}^+$?

6. For $n \in \mathbf{Z}^+$ define the sum $s_n$ by the formula

      $$s_n = \frac{1}{2!} + \frac{2}{3!} + \frac{3}{4!} + \cdots + \frac{n-1}{n!} + \frac{n}{(n+1)!}.$$

      a) Verify that $s_1 = \frac{1}{2}$, $s_2 = \frac{5}{6}$, and $s_3 = \frac{23}{24}$.
      b) Compute $s_4$, $s_5$, and $s_6$.
      c) On the basis of your results in parts (a) and (b), conjecture a formula for the sum of the terms in $s_n$.
      d) Verify your conjecture in part (c) for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical Induction.

7. For all $n \in \mathbf{Z}$, $n \ge 0$, prove that
      a) $2^{2n+1} + 1$ is divisible by 3.
      b) $n^3 + (n + 1)^3 + (n + 2)^3$ is divisible by 9.

8. Let $n \in \mathbf{Z}^+$ where n is odd and n is not divisible by 5. Prove that there is a power of n whose units digit is 1.

9. Find the digits x, y, z where $(xyz)_9 = (zyx)_6$.

10. If $n \in \mathbf{Z}^+$, how many possible values are there for $\gcd(n, n + 3000)$?

11. If $n \in \mathbf{Z}^+$ and $n \ge 2$, prove that $2^n < \binom{2n}{n} < 4^n$.

12. If $n \in \mathbf{Z}^+$, prove that 57 divides $7^{n+2} + 8^{2n+1}$.

13. For all $n \in \mathbf{Z}^+$, show that if $n \ge 64$, then n can be written as a sum of 5’s and/or 17’s.

14. Determine all $a, b \in \mathbf{Z}$ such that $\frac{a}{7} + \frac{b}{12} = \frac{1}{84}$.

15. Given $r \in \mathbf{Z}^+$, write $r = r_0 + r_1 \cdot 10 + r_2 \cdot 10^2 + \cdots + r_n \cdot 10^n$, where $0 \le r_i \le 9$ for $0 \le i \le n - 1$, and $0 < r_n \le 9$.
      a) Prove that $9 \mid r$ if and only if $9 \mid (r_n + r_{n-1} + \cdots + r_2 + r_1 + r_0)$.
      b) Prove that $3 \mid r$ if and only if $3 \mid (r_n + r_{n-1} + \cdots + r_2 + r_1 + r_0)$.
      c) If t = 137486x225, where x is a single digit, determine the value(s) of x such that $3 \mid t$. Which values of x make t divisible by 9?

16. Frances spends $6.20 on candy for prizes in a contest. If a 10-ounce box of this candy costs $.50 and a 3-ounce box costs $.20, how many boxes of each size did she purchase?

17. a) How many positive integers can we express as a product of nine primes (repetitions allowed and order not relevant) where the primes may be chosen from {2, 3, 5, 7, 11}?
      b) How many of the positive integers in part (a) have at least one occurrence of each of the five primes?

18. Find the product of all (positive) divisors of (a) 1000; (b) 5000; (c) 7000; (d) 9000; (e) $p^m q^n$, where p, q are distinct primes and $m, n \in \mathbf{Z}^+$; and (f) $p^m q^n r^k$, where p, q, r are distinct primes and $m, n, k \in \mathbf{Z}^+$.

19. a) Ten students enter a locker room that contains 10 lockers. The first student opens all the lockers. The second student changes the status (from closed to open, or vice versa) of every other locker, starting with the second locker. The third student then changes the status of every third locker, starting at the third locker. In general, for $1 < k \le 10$, the kth student changes the status of every kth locker, starting with the kth locker. After the tenth student has gone through the lockers, which lockers are left open?
      b) Answer part (a) if 10 is replaced by $n \in \mathbf{Z}^+$, $n \ge 2$.

20. Let $A = \{a_1, a_2, a_3, a_4, a_5\} \subseteq \mathbf{Z}^+$. Prove that A contains a nonempty subset S where the sum of the elements in S is a multiple of 5. (Here it is possible to have a sum consisting of only one summand.)

21. Consider the set {1, 2, 3}. Here we may write $\{1, 2, 3\} = \{1, 2\} \cup \{3\}$, where $1 + 2 = 3$. For the set {1, 2, 3, 4} we find that $\{1, 2, 3, 4\} = \{1, 4\} \cup \{2, 3\}$, where $1 + 4 = 2 + 3$.
However, things change when we examine the set {1, 2, 3, 4, 5}. In this case, if $C \subseteq \{1, 2, 3, 4, 5\}$ and we let $s_C$ denote the sum of the elements in C, then we find that there is no way to write $\{1, 2, 3, 4, 5\} = A \cup B$, with $A \cap B = \emptyset$ and $s_A = s_B$.
      a) For which $n \in \mathbf{Z}^+$, $n \ge 3$, can we write $\{1, 2, 3, \ldots, n\} = A \cup B$, with $A \cap B = \emptyset$ and $s_A = s_B$? (As above, $s_A$ and $s_B$ denote the sums of the elements in A and B, respectively.)
      b) Let $n \in \mathbf{Z}^+$ with $n \ge 3$. If we can write $\{1, 2, 3, \ldots, n\} = A \cup B$ with $A \cap B = \emptyset$ and $s_A = s_B$, describe how such sets A and B can be determined.

22. Determine those integers n for which [two fractions in n, not recoverable from this copy] are also integers.

23. Let $a, b \in \mathbf{Z}^+$.
      a) Prove that if $a^2 \mid b^2$ then $a \mid b$.
      b) Is it true that if $a^2 \mid b^3$ then $a \mid b$?

24. Let n be a fixed positive integer that satisfies the property: For all $a, b \in \mathbf{Z}^+$, if $n \mid ab$ then $n \mid a$ or $n \mid b$. Prove that n = 1 or n is prime.

25. Suppose that $a, b, k \in \mathbf{Z}^+$ and that k is not a power of 2.
      a) Prove that if $a^k + b^k \ne 2$, then $a^k + b^k$ is composite.
      b) If $n \in \mathbf{Z}^+$ and n is not a power of 2, prove that if $2^n + 1$ is prime, then n is prime.

For the next three exercises, recall that $H_n$, $F_n$, and $L_n$ denote the nth harmonic, Fibonacci, and Lucas numbers, respectively.

26. Prove that for all $n \in \mathbf{N}$, $H_{2^n} \le 1 + n$.

27. Prove that $F_n \le (5/3)^n$ for all $n \in \mathbf{N}$.

28. For $n \in \mathbf{N}$, prove that

      $$L_0 + L_1 + L_2 + \cdots + L_n = \sum_{i=0}^{n} L_i = L_{n+2} - 1.$$

29. a) For the five-digit integers (from 10000 to 99999) how many are palindromes and what is their sum?
      b) Write a computer program to check the answer for the sum in part (a).

30. Let a, b be odd with a > b. Prove that $\gcd(a, b) = \gcd\left(\frac{a-b}{2}, b\right)$.

31. Let $n \in \mathbf{Z}^+$ with u the units digit of n. Prove that $7 \mid n$ if and only if $7 \mid \left(\frac{n-u}{10} - 2u\right)$.

32. Let $m, n \in \mathbf{Z}^+$ with $19m + 90 + 8n = 1998$. Determine m, n so that (a) n is minimal; (b) m is minimal.

33. Catrina selects three integers from {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and then forms the six possible three-digit integers (leading zero allowed) they determine. For instance, for the selection 1, 3, and 7, she would form the integers 137, 173, 317, 371, 713, and 731. Prove that no matter which three integers she initially selects, it is not possible for all six of the resulting three-digit integers to be prime.

34. Consider the three-row and four-column table shown in Fig. 4.12. Show that it is possible to place eight of the nine integers 2, 3, 4, 7, 10, 11, 12, 13, 15 in the remaining eight cells of the table so that the average of the integers in each row is the same integer and the average of the integers in each column is the same integer. Specify which of the nine integers given cannot be used and show how the other eight integers are placed in the table.

      Figure 4.12 [3 × 4 table with some cells prefilled; only the entries 14 and 1 survive here, and their positions are not recoverable]

35. Allen writes the consecutive integers 1, 2, 3, ..., n on a blackboard. Then Barbara erases one of these integers. If the average of the remaining integers is $35\frac{7}{17}$, what is n and what integer was erased?

36. Leslie selects a random integer between 1 and 100 (inclusive). Find the probability her selection is divisible by (a) 2 or 3; (b) 2, 3, or 5.

37. Let $m = p_1^{e_1} p_2^{e_2} p_3^{e_3} p_4^{e_4}$ and $n = p_1^{f_1} p_2^{f_2} p_3^{f_3} p_5^{f_5}$, where $p_1, p_2, p_3, p_4, p_5$ are distinct primes, and $e_1, e_2, e_3, e_4, f_1, f_2, f_3, f_5 \in \mathbf{Z}^+$. How many common divisors are there for m, n?

5 Relations and Functions

      In this chapter we extend the set theory of Chapter 3 to include the concepts of relation
         and function. Algebra, trigonometry, and calculus all involve functions. Here, however,
      we shall study functions from a set-theoretic approach that includes finite functions, and
      we shall introduce some new counting ideas in the study. Furthermore, we shall examine
      the concept of function complexity and its role in the study of the analysis of algorithms.
          We take a path along which we shall find the answers to the following (closely related)
      six problems:

1) The Defense Department has seven different contracts that deal with a high-security
            project. Four companies can manufacture the distinct parts called for in each contract,
            and in order to maximize the security of the overall project, it is best to have all four
            companies working on some part. In how many ways can the contracts be awarded
            so that every company is involved?
         2) How many seven-symbol quaternary (0, 1, 2, 3) sequences have at least one occur-
            rence of each of the symbols 0, 1, 2, and 3?
         3) An $m \times n$ zero-one matrix is a matrix A with m rows and n columns, such that in
            row i, for all $1 \le i \le m$, and column j, for all $1 \le j \le n$, the entry $a_{ij}$ that appears is
            either 0 or 1. How many $7 \times 4$ zero-one matrices have exactly one 1 in each row and
            at least one 1 in each column? (The zero-one matrix is a data structure that arises in
            computer science. We shall learn more about it in later chapters.)
         4) Seven (unrelated) people enter the lobby of a building which has four additional
            floors, and they all get on an elevator. What is the probability that the elevator must
            stop at every floor in order to let passengers off?
         5) For positive integers m, n with $m < n$, prove that

            $$\sum_{k=0}^{n} (-1)^k \binom{n}{n-k} (n-k)^m = 0.$$

         6) For every positive integer n, verify that

            $$n! = \sum_{k=0}^{n} (-1)^k \binom{n}{n-k} (n-k)^n.$$

Do you recognize the connection among the first four problems? The first three are the
      same problem in different settings. However, it is not obvious that the last two problems
      are related or that there is a connection between them and the first four. These identities,
      however, will be established using the same counting technique that we develop to solve
      the first four problems.
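
As a numeric sanity check on problems 2, 5, and 6, the Python sketch below evaluates the alternating sum for small values and counts the sequences of problem 2 by brute force. It is not the counting argument that the chapter goes on to develop. The function names `alternating_sum` and `count_sequences_using_all_symbols` are assumptions made for this illustration.

```python
from itertools import product
from math import comb, factorial


def alternating_sum(m: int, n: int) -> int:
    """Evaluate sum_{k=0}^{n} (-1)^k * C(n, n-k) * (n-k)^m."""
    return sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))


def count_sequences_using_all_symbols(length: int, symbols: int) -> int:
    """Brute force: sequences over {0, ..., symbols-1} in which every symbol occurs."""
    return sum(
        1
        for seq in product(range(symbols), repeat=length)
        if len(set(seq)) == symbols
    )


# Problem 5: the sum vanishes whenever m < n (checked here for small cases only).
print(all(alternating_sum(m, n) == 0 for n in range(2, 8) for m in range(1, n)))

# Problem 6: with m = n the sum equals n! (again, small cases only).
print(all(alternating_sum(n, n) == factorial(n) for n in range(1, 8)))

# Problem 2: seven-symbol quaternary sequences that use all four symbols.
print(count_sequences_using_all_symbols(7, 4), alternating_sum(7, 4))
```

In this check the brute-force count for problem 2 agrees with the alternating sum taken at m = 7, n = 4. That agreement is only an observation from the computation; it hints at the link between the identities and the first four problems but does not prove it.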


5.1 Cartesian Products and Relations
                              We start with an idea that was introduced earlier in Definition 3.11. However, we repeat the
                              definition now in order to make the presentation here independent of this prior encounter.

Definition 5.1          For sets A, B the Cartesian product, or cross product, of A and B is denoted by $A \times B$ and
                              equals $\{(a, b) \mid a \in A, b \in B\}$.

                              We say that the elements of $A \times B$ are ordered pairs. For $(a, b), (c, d) \in A \times B$, we
                              have $(a, b) = (c, d)$ if and only if $a = c$ and $b = d$.
                                  If A, B are finite, it follows from the rule of product that $|A \times B| = |A| \cdot |B|$. Although
                              we generally will not have $A \times B = B \times A$, we will have $|A \times B| = |B \times A|$.
                                  Here $A \subseteq \mathcal{U}_1$ and $B \subseteq \mathcal{U}_2$, and we may find that the universes are different, that is,
                              $\mathcal{U}_1 \ne \mathcal{U}_2$. Also, even if $A, B \subseteq \mathcal{U}$, it is not necessary that $A \times B \subseteq \mathcal{U}$, so unlike the cases
                              for union and intersection, here $\mathcal{P}(\mathcal{U})$ is not necessarily closed under this binary operation.
                                  We can extend the definition of the Cartesian product, or cross product, to more than two
                              sets. Let $n \in \mathbf{Z}^+$, $n \ge 3$. For sets $A_1, A_2, \ldots, A_n$, the (n-fold) product of $A_1, A_2, \ldots, A_n$
                              is denoted by $A_1 \times A_2 \times \cdots \times A_n$ and equals $\{(a_1, a_2, \ldots, a_n) \mid a_i \in A_i, 1 \le i \le n\}$.† The
                              elements of $A_1 \times A_2 \times \cdots \times A_n$ are called ordered n-tuples, although we generally use the
                              term triple in place of 3-tuple. As with ordered pairs, if $(a_1, a_2, \ldots, a_n), (b_1, b_2, \ldots, b_n) \in A_1 \times A_2 \times \cdots \times A_n$,
                              then $(a_1, a_2, \ldots, a_n) = (b_1, b_2, \ldots, b_n)$ if and only if $a_i = b_i$ for
                              all $1 \le i \le n$.
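
The n-fold product, and the lack of associativity that the footnote to this paragraph warns about, can be sketched in Python with the standard library function `itertools.product`. The three sample sets below are hypothetical and chosen only to keep the output small.

```python
from itertools import product

# Hypothetical sample sets, not taken from the text.
A1, A2, A3 = {1, 2}, {3}, {4, 5}

# A1 x A2 x A3: a set of flat ordered triples.
flat = set(product(A1, A2, A3))

# (A1 x A2) x A3: ordered pairs whose first component is itself a pair.
left = set(product(product(A1, A2), A3))

# A1 x (A2 x A3): ordered pairs whose second component is itself a pair.
right = set(product(A1, product(A2, A3)))

print(sorted(flat))
print(sorted(left))
print(sorted(right))
print(len(flat) == len(left) == len(right) == len(A1) * len(A2) * len(A3))
```

All three sets have the same size, but no element of one belongs to another: `(1, 3, 4)`, `((1, 3), 4)`, and `(1, (3, 4))` are different objects. This is the distinction the text chooses to set aside by always using the nonparenthesized form.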

      EXAMPLE 5.1
                              Let $A = \{2, 3, 4\}$, $B = \{4, 5\}$. Then
                                a) $A \times B = \{(2, 4), (2, 5), (3, 4), (3, 5), (4, 4), (4, 5)\}$.
                                b) $B \times A = \{(4, 2), (4, 3), (4, 4), (5, 2), (5, 3), (5, 4)\}$.
                                c) $B^2 = B \times B = \{(4, 4), (4, 5), (5, 4), (5, 5)\}$.
                                d) $B^3 = B \times B \times B = \{(a, b, c) \mid a, b, c \in B\}$; for instance, $(4, 5, 5) \in B^3$.
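
The four parts of this example can be reproduced with a short Python sketch. The variable names `AxB`, `BxA`, `B2`, and `B3` are chosen here for illustration.

```python
from itertools import product

A = {2, 3, 4}
B = {4, 5}

AxB = set(product(A, B))          # part (a)
BxA = set(product(B, A))          # part (b)
B2 = set(product(B, repeat=2))    # part (c): B x B
B3 = set(product(B, repeat=3))    # part (d): B x B x B

print(sorted(AxB))
print(sorted(BxA))
print(sorted(B2))
print((4, 5, 5) in B3, len(B3))

# The two products differ as sets but have the same size |A| * |B|.
print(AxB != BxA, len(AxB) == len(BxA) == len(A) * len(B))
```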

      EXAMPLE 5.2
                              The set $\mathbf{R} \times \mathbf{R} = \{(x, y) \mid x, y \in \mathbf{R}\}$ is recognized as the real plane of coordinate geometry
                              and two-dimensional calculus. The subset $\mathbf{R}^+ \times \mathbf{R}^+$ is the interior of the first quadrant
                              of this plane. Likewise $\mathbf{R}^3$ represents Euclidean three-space, where the three-dimensional
                              interior of any sphere (of positive radius), two-dimensional planes, and one-dimensional
                              lines are subsets of importance.

      EXAMPLE 5.3
                              Once again let $A = \{2, 3, 4\}$ and $B = \{4, 5\}$, as in Example 5.1, and let $C = \{x, y\}$. The
                              construction of the Cartesian product $A \times B$ can be represented pictorially with the aid of
                              a tree diagram, as in part (a) of Fig. 5.1. This diagram proceeds from left to right. From

"When dealing with the Cartesian product of three or more sets, we must be careful about the lack
                              of associativity. In the case of three sets, for example, there is a difference between any two of the sets
                              $A_1 \times A_2 \times A_3$, $(A_1 \times A_2) \times A_3$, and $A_1 \times (A_2 \times A_3)$ because their respective elements are ordered triples
                              $(a_1, a_2, a_3)$, and the distinct ordered pairs $((a_1, a_2), a_3)$ and $(a_1, (a_2, a_3))$. Although such differences are
                              important in certain instances, we shall not concentrate on them here and shall always use the nonparenthesized form
                              $A_1 \times A_2 \times A_3$. This will also be our convention when dealing with the Cartesian product of four or more sets.

the left-most endpoint, three branches originate, one for each of the elements of A. Then
              from each point, labeled 2, 3, 4, two branches emanate, one for each of the elements 4,
              5 of B. The six ordered pairs at the right endpoints constitute the elements (ordered pairs)
              of $A \times B$. Part (b) of the figure provides a tree diagram to demonstrate the construction of
              $B \times A$. Finally, the tree diagram in Fig. 5.1 (c) shows us how to envision the construction
              of $A \times B \times C$, and demonstrates that $|A \times B \times C| = 12 = 3 \times 2 \times 2 = |A||B||C|$.

[Tree diagrams: (a) $A \times B$, (b) $B \times A$, (c) $A \times B \times C$]

Figure 5.1

In addition to their tie-in with Cartesian products, tree diagrams also arise in other
              situations.

EXAMPLE 5.4
              At the Wimbledon Tennis Championships, women play at most three sets in a match. The
              winner is the first to win two sets. If we let N and E denote the two players, the tree diagram in
              Fig. 5.2 indicates the six ways in which this match can be won. For example, the starred line
              segment (edge) indicates that player E won the first set. The double-starred edge indicates
              that player N has won the match by winning the first and third sets.

[Tree diagram with three levels: first set, second set, third set (when needed)]
                                                     Figure 5.2
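
The six ways counted in Example 5.4 can also be listed by walking the tree programmatically. The following Python sketch is one way to do this; the function `match_outcomes` and its parameters are hypothetical names introduced here. Each outcome is recorded as a string of set winners, so "NEN" means N won the first and third sets.

```python
def match_outcomes(players=("N", "E"), sets_to_win=2):
    """List every sequence of set winners that ends as soon as one player
    has won sets_to_win sets (a depth-first walk of the tree diagram)."""
    outcomes = []

    def play(history):
        # A branch ends when some player has reached the required number of sets.
        for p in players:
            if history.count(p) == sets_to_win:
                outcomes.append(history)
                return
        # Otherwise the tree branches once for each possible winner of the next set.
        for p in players:
            play(history + p)

    play("")
    return outcomes


results = match_outcomes()
print(results)
print(len(results))
```

The walk stops along a branch as soon as a winner is determined, which is why some paths have two edges and others three.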

Tree diagrams are examples of a general structure called a tree. Trees and graphs are
                              important structures that arise in computer science and optimization theory. These will be
                              investigated in later chapters.

For the cross product of two sets, we find the subsets of this structure of great interest.

Definition 5.2          For sets A, B, any subset of $A \times B$ is called a (binary) relation from A to B. Any subset
                              of $A \times A$ is called a (binary) relation on A.

Since we will primarily deal with binary relations, for us the word “relation” will mean
                              binary relation, unless something otherwise is specified.

      EXAMPLE 5.5
                              With A, B as in Example 5.1, the following are some of the relations from A to B.
                                a) $\emptyset$                                     b) $\{(2, 4)\}$
                                c) $\{(2, 4), (2, 5)\}$                            d) $\{(2, 4), (3, 4), (4, 4)\}$
                                e) $\{(2, 4), (3, 4), (4, 5)\}$                    f) $A \times B$
                                  Since $|A \times B| = 6$, it follows from Definition 5.2 that there are $2^6$ possible relations
                              from A to B (for there are $2^6$ possible subsets of $A \times B$).

For finite sets $A$, $B$ with $|A| = m$ and $|B| = n$, there are $2^{mn}$ relations from $A$ to $B$, including the empty relation as well as the relation $A \times B$ itself.
There are also $2^{nm}$ $(= 2^{mn})$ relations from $B$ to $A$, one of which is also $\emptyset$ and another of which is $B \times A$. The reason we get the same number of relations from $B$ to $A$ as we have from $A$ to $B$ is that any relation $\mathcal{R}_1$ from $B$ to $A$ can be obtained from a unique relation $\mathcal{R}_2$ from $A$ to $B$ by simply reversing the components of each ordered pair in $\mathcal{R}_2$ (and vice versa).
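For small sets the count $2^{mn}$ and the pair-reversing correspondence can be checked by brute force. The sketch below assumes the sample sets $A = \{2, 3, 4\}$ and $B = \{4, 5\}$ (chosen to be consistent with the pairs listed in Example 5.5); the helper names `all_relations` and `converse` are hypothetical.

```python
from itertools import chain, combinations, product


def all_relations(A, B):
    """Return every relation from A to B, i.e. every subset of A x B."""
    pairs = list(product(A, B))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1)
    )
    return [frozenset(s) for s in subsets]


def converse(R):
    """Reverse the components of each ordered pair in R."""
    return frozenset((b, a) for (a, b) in R)


A = {2, 3, 4}  # assumed sample sets
B = {4, 5}
from_A_to_B = all_relations(A, B)
from_B_to_A = all_relations(B, A)

print(len(from_A_to_B) == 2 ** (len(A) * len(B)))              # True
print({converse(R) for R in from_A_to_B} == set(from_B_to_A))  # True
```

The second check says that reversing pairs carries the relations from $A$ to $B$ exactly onto the relations from $B$ to $A$, which is why the two counts agree.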

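EXAMPLE 5.6
For $B = \{1, 2\}$, let $A = \mathcal{P}(B) = \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}$. The following is an example of a relation on $A$: $\mathcal{R} = \{(\emptyset, \emptyset), (\emptyset, \{1\}), (\emptyset, \{2\}), (\emptyset, \{1, 2\}), (\{1\}, \{1\}), (\{1\}, \{1, 2\}), (\{2\}, \{2\}), (\{2\}, \{1, 2\}), (\{1, 2\}, \{1, 2\})\}$. We can say that the relation $\mathcal{R}$ is the subset relation where $(C, D) \in \mathcal{R}$ if and only if $C, D \subseteq B$ and $C \subseteq D$.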
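As a quick check of this example, the subset relation on the power set of $\{1, 2\}$ can be generated directly. This is a minimal sketch; `power_set` is an illustrative helper, and `frozenset` is used only because Python sets must contain hashable elements.

```python
from itertools import combinations


def power_set(S):
    """Return all subsets of S as frozensets."""
    items = sorted(S)
    return [
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    ]


B = {1, 2}
A = power_set(B)
# For frozensets, C <= D tests whether C is a subset of D.
R = {(C, D) for C in A for D in A if C <= D}

print(len(A))  # 4
print(len(R))  # 9
```

The nine pairs produced are the nine ordered pairs listed in the example.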

EXAMPLE 5.7
With $A = \mathbf{Z}^+$, we may define a relation $\mathcal{R}$ on set $A$ as $\{(x, y) \mid x \le y\}$. This is the familiar "is less than or equal to" relation for the set of positive integers. It can be represented graphically as the set of points, with positive integer components, located on or above the line $y = x$ in the Euclidean plane, as partially shown in Fig. 5.3. Here we cannot list the entire relation as we did in Example 5.6, but we note, for example, that $(7, 7), (7, 11) \in \mathcal{R}$, but $(8, 2) \notin \mathcal{R}$. The fact that $(7, 11) \in \mathcal{R}$ can also be denoted by $7 \,\mathcal{R}\, 11$; $(8, 2) \notin \mathcal{R}$ becomes $8 \,\not\mathcal{R}\, 2$. Here $7 \,\mathcal{R}\, 11$ and $8 \,\not\mathcal{R}\, 2$ are examples of the infix notation for a relation.

Figure 5.3

Our last example helps us to review the idea of a recursively defined set.

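EXAMPLE 5.8
Let $\mathcal{R}$ be the subset of $\mathbf{N} \times \mathbf{N}$ where $\mathcal{R} = \{(m, n) \mid n = 7m\}$. Consequently, among the ordered pairs in $\mathcal{R}$ one finds $(0, 0)$, $(1, 7)$, $(11, 77)$, and $(15, 105)$. This relation $\mathcal{R}$ on $\mathbf{N}$ can also be given recursively by

1) $(0, 0) \in \mathcal{R}$; and
2) If $(s, t) \in \mathcal{R}$, then $(s + 1, t + 7) \in \mathcal{R}$.

We use the recursive definition to show that the ordered pair $(3, 21)$ (from $\mathbf{N} \times \mathbf{N}$) is in $\mathcal{R}$. Our derivation is as follows: From part (1) of the recursive definition we start with $(0, 0) \in \mathcal{R}$. Then part (2) of the definition gives us
i) $(0, 0) \in \mathcal{R} \Rightarrow (0 + 1, 0 + 7) = (1, 7) \in \mathcal{R}$;
ii) $(1, 7) \in \mathcal{R} \Rightarrow (1 + 1, 7 + 7) = (2, 14) \in \mathcal{R}$; and
iii) $(2, 14) \in \mathcal{R} \Rightarrow (2 + 1, 14 + 7) = (3, 21) \in \mathcal{R}$.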
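The derivation above is just rule (2) applied three times to the base pair of rule (1). A small sketch of that process follows; the function name `derive` is hypothetical.

```python
def derive(target_s):
    """Start from the base pair (0, 0) of rule (1) and apply rule (2),
    (s, t) -> (s + 1, t + 7), until the first component is target_s."""
    s, t = 0, 0
    steps = [(s, t)]
    while s < target_s:
        s, t = s + 1, t + 7
        steps.append((s, t))
    return steps


steps = derive(3)
print(steps)  # [(0, 0), (1, 7), (2, 14), (3, 21)]
```

Every pair produced this way satisfies $n = 7m$, which matches the closed-form description of the relation.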

We close this section with these final observations.

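1) For any set $A$, $A \times \emptyset = \emptyset$. (If $A \times \emptyset \ne \emptyset$, let $(a, b) \in A \times \emptyset$. Then $a \in A$ and $b \in \emptyset$. Impossible!) Likewise, $\emptyset \times A = \emptyset$.
2) The Cartesian product and the binary operations of union and intersection are interrelated in the following theorem.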

THEOREM 5.1   For any sets $A, B, C \subseteq \mathcal{U}$:

a) $A \times (B \cap C) = (A \times B) \cap (A \times C)$
b) $A \times (B \cup C) = (A \times B) \cup (A \times C)$
c) $(A \cap B) \times C = (A \times C) \cap (B \times C)$
d) $(A \cup B) \times C = (A \times C) \cup (B \times C)$

Proof: We prove part (a) and leave the other parts for the reader. We use the same concept of set equality (as in Definition 3.2 of Section 3.1) even though the elements here are ordered pairs. For all $a, b \in \mathcal{U}$, $(a, b) \in A \times (B \cap C) \iff a \in A$ and $b \in B \cap C \iff a \in A$ and $b \in B, C \iff a \in A, b \in B$ and $a \in A, b \in C \iff (a, b) \in A \times B$ and $(a, b) \in A \times C \iff (a, b) \in (A \times B) \cap (A \times C)$.
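A numerical spot check is no substitute for the proof, but it can help make the four identities concrete. The sketch below tests them on one arbitrary choice of small sets; the sets and the helper `cart` are assumptions made for illustration.

```python
from itertools import product


def cart(X, Y):
    """Cartesian product X x Y as a set of ordered pairs."""
    return set(product(X, Y))


# Arbitrary sample sets, chosen only for illustration.
A, B, C = {1, 2, 3}, {2, 3, 4}, {3, 4, 5}

checks = {
    "a": cart(A, B & C) == cart(A, B) & cart(A, C),
    "b": cart(A, B | C) == cart(A, B) | cart(A, C),
    "c": cart(A & B, C) == cart(A, C) & cart(B, C),
    "d": cart(A | B, C) == cart(A, C) | cart(B, C),
}
print(checks)  # every value is True for these sets
```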

EXERCISES 5.1

1. If $A = \{1, 2, 3, 4\}$, $B = \{2, 5\}$, and $C = \{3, 4, 7\}$, determine $A \times B$; $B \times A$; $A \cup (B \times C)$; $(A \cup B) \times C$; $(A \times C) \cup (B \times C)$.

2. If $A = \{1, 2, 3\}$, and $B = \{2, 4, 5\}$, give examples of (a) three nonempty relations from $A$ to $B$; (b) three nonempty relations on $A$.

3. For $A$, $B$ as in Exercise 2, determine the following: (a) $|A \times B|$; (b) the number of relations from $A$ to $B$; (c) the number of relations on $A$; (d) the number of relations from $A$ to $B$ that contain $(1, 2)$ and $(1, 5)$; (e) the number of relations from $A$ to $B$ that contain exactly five ordered pairs; and (f) the number of relations on $A$ that contain at least seven elements.

4. For which sets $A$, $B$ is it true that $A \times B = B \times A$?

5. Let $A$, $B$, $C$, $D$ be nonempty sets.
   a) Prove that $A \times B \subseteq C \times D$ if and only if $A \subseteq C$ and $B \subseteq D$.
   b) What happens to the result in part (a) if any of the sets $A$, $B$, $C$, $D$ is empty?

6. The men's final at Wimbledon is won by the first player to win three sets of the five-set match. Let C and M denote the players. Draw a tree diagram to show all the ways in which the match can be decided.

7. a) If $A = \{1, 2, 3, 4, 5\}$ and $B = \{w, x, y, z\}$, how many elements are there in $\mathcal{P}(A \times B)$?
   b) Generalize the result in part (a).

8. Logic chips are taken from a container, tested individually, and labeled defective or good. The testing process is continued until either two defective chips are found or five chips are tested in total. Using a tree diagram, exhibit a sample space for this process.

9. Complete the proof of Theorem 5.1.

10. A rumor is spread as follows. The originator calls two people. Each of these people phones three friends, each of whom in turn calls five associates. If no one receives more than one call, and no one calls the originator, how many people now know the rumor? How many phone calls were made?

11. For $A, B, C \subseteq \mathcal{U}$, prove that $A \times (B - C) = (A \times B) - (A \times C)$.

12. Let $A$, $B$ be sets with $|B| = 3$. If there are 4096 relations from $A$ to $B$, what is $|A|$?

13. Let $\mathcal{R} \subseteq \mathbf{N} \times \mathbf{N}$ where $(m, n) \in \mathcal{R}$ if (and only if) $n = 5m + 2$. (a) Give a recursive definition for $\mathcal{R}$. (b) Use the recursive definition from part (a) to show that $(4, 22) \in \mathcal{R}$.

14. a) Give a recursive definition for the relation $\mathcal{R} \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+$ where $(m, n) \in \mathcal{R}$ if (and only if) $m \ge n$.
    b) From the definition in part (a) verify that $(5, 2)$ and $(4, 4)$ are in $\mathcal{R}$.

5.2
          Functions: Plain and One-to-One
                                     In this section we concentrate on a special kind of relation called a function. One finds
                                     functions in many different settings throughout mathematics and computer science. As for
                                     general relations, they will reappear in Chapter 7, where we shall examine them much more
                                     thoroughly.

Definition 5.3   For nonempty sets $A$, $B$, a function, or mapping, $f$ from $A$ to $B$, denoted $f: A \to B$, is a relation from $A$ to $B$ in which every element of $A$ appears exactly once as the first component of an ordered pair in the relation.

We often write $f(a) = b$ when $(a, b)$ is an ordered pair in the function $f$. For $(a, b) \in f$, $b$ is called the image of $a$ under $f$, whereas $a$ is a preimage of $b$. In addition, the definition suggests that $f$ is a method for associating with each $a \in A$ the unique element $f(a) = b \in B$. Consequently, $(a, b), (a, c) \in f$ implies $b = c$.

EXAMPLE 5.9
For $A = \{1, 2, 3\}$ and $B = \{w, x, y, z\}$, $f = \{(1, w), (2, x), (3, x)\}$ is a function, and consequently a relation, from $A$ to $B$. $\mathcal{R}_1 = \{(1, w), (2, x)\}$ and $\mathcal{R}_2 = \{(1, w), (2, w), (2, x), (3, z)\}$ are relations, but not functions, from $A$ to $B$. (Why?)
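One way to explore the question in this example is to test Definition 5.3 mechanically. The sketch below is illustrative only; the name `is_function` and the representation of a relation as a Python set of pairs are assumptions.

```python
def is_function(R, A, B):
    """Check Definition 5.3: R is a relation from A to B in which every
    element of A appears exactly once as a first component."""
    if not all(a in A and b in B for (a, b) in R):
        return False  # R is not even a relation from A to B
    first_components = [a for (a, _) in R]
    return all(first_components.count(a) == 1 for a in A)


A = {1, 2, 3}
B = {"w", "x", "y", "z"}
f = {(1, "w"), (2, "x"), (3, "x")}
R1 = {(1, "w"), (2, "x")}
R2 = {(1, "w"), (2, "w"), (2, "x"), (3, "z")}

print(is_function(f, A, B))   # True
print(is_function(R1, A, B))  # False
print(is_function(R2, A, B))  # False
```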

Definition 5.4   For the function $f: A \to B$, $A$ is called the domain of $f$ and $B$ the codomain of $f$. The subset of $B$ consisting of those elements that appear as second components in the ordered pairs of $f$ is called the range of $f$ and is also denoted by $f(A)$ because it is the set of images (of the elements of $A$) under $f$.

In Example 5.9, the domain of $f = \{1, 2, 3\}$, the codomain of $f = \{w, x, y, z\}$, and the range of $f = f(A) = \{w, x\}$.

A pictorial representation of these ideas appears in Fig. 5.4. This diagram suggests that a
                 may be regarded as an input that is transformed by f into the corresponding output, f(a).
                 In this context, a C++ compiler can be thought of as a function that transforms a source
                 program (the input) into its corresponding object program (the output).

Figure 5.4 (diagram of $f$ from $A$ to $B$)

EXAMPLE 5.10
Many interesting functions arise in computer science.
a) A common function encountered is the greatest integer function, or floor function. This function $f: \mathbf{R} \to \mathbf{Z}$ is given by

$f(x) = \lfloor x \rfloor$ = the greatest integer less than or equal to $x$.

Consequently, $f(x) = x$, if $x \in \mathbf{Z}$; and, when $x \in \mathbf{R} - \mathbf{Z}$, $f(x)$ is the integer to the immediate left of $x$ on the real number line.
For this function we find that

1) $\lfloor 3.8 \rfloor = 3$, $\lfloor 3 \rfloor = 3$, $\lfloor -3.8 \rfloor = -4$, $\lfloor -3 \rfloor = -3$;
2) $\lfloor 7.1 + 8.2 \rfloor = \lfloor 15.3 \rfloor = 15 = 7 + 8 = \lfloor 7.1 \rfloor + \lfloor 8.2 \rfloor$; and
3) $\lfloor 7.7 + 8.4 \rfloor = \lfloor 16.1 \rfloor = 16 \ne 15 = 7 + 8 = \lfloor 7.7 \rfloor + \lfloor 8.4 \rfloor$.

b) A second function, one related to the floor function in part (a), is the ceiling function. This function $g: \mathbf{R} \to \mathbf{Z}$ is defined by

$g(x) = \lceil x \rceil$ = the least integer greater than or equal to $x$.

So $g(x) = x$ when $x \in \mathbf{Z}$, but when $x \in \mathbf{R} - \mathbf{Z}$, then $g(x)$ is the integer to the immediate right of $x$ on the real number line. In dealing with the ceiling function one finds that
1) $\lceil 3 \rceil = 3$, $\lceil 3.01 \rceil = \lceil 3.7 \rceil = 4 = \lceil 4 \rceil$, $\lceil -3 \rceil = -3$, $\lceil -3.01 \rceil = \lceil -3.7 \rceil = -3$;
2) $\lceil 3.6 + 4.5 \rceil = \lceil 8.1 \rceil = 9 = 4 + 5 = \lceil 3.6 \rceil + \lceil 4.5 \rceil$; and
3) $\lceil 3.3 + 4.2 \rceil = \lceil 7.5 \rceil = 8 \ne 9 = 4 + 5 = \lceil 3.3 \rceil + \lceil 4.2 \rceil$.
c) The function trunc (for truncation) is another integer-valued function defined on $\mathbf{R}$. This function deletes the fractional part of a real number. For example, trunc(3.78) = 3, trunc(5) = 5, trunc(−7.22) = −7. Note that trunc(3.78) = $\lfloor 3.78 \rfloor$ = 3 while trunc(−3.78) = $\lceil -3.78 \rceil$ = −3.
d) In storing a matrix in a one-dimensional array, many computer languages use the row major implementation. Here, if $A = (a_{ij})_{m \times n}$ is an $m \times n$ matrix, the first row of $A$ is stored in locations $1, 2, 3, \ldots, n$ of the array if we start with $a_{11}$ in location 1. The entry $a_{21}$ is then found in position $n + 1$, while entry $a_{34}$ occupies position $2n + 4$ in the array. In order to determine the location of an entry $a_{ij}$ from $A$, where $1 \le i \le m$, $1 \le j \le n$, one defines the access function $f$ from the entries of $A$ to the positions $1, 2, 3, \ldots, mn$ of the array. A formula for the access function here is $f(a_{ij}) = (i - 1)n + j$.

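Array entries: $a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, a_{22}, \ldots, a_{2n}, a_{31}, \ldots, a_{ij}, \ldots, a_{mn}$
Array positions: $1, 2, \ldots, n, n+1, n+2, \ldots, 2n, 2n+1, \ldots, (i-1)n+j, \ldots, (m-1)n+n \; (= mn)$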
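The four functions of this example can be tried out directly. The sketch below uses Python's standard `math.floor`, `math.ceil`, and `math.trunc`; the function `access_row_major` is a hypothetical name for the access function $f(a_{ij}) = (i - 1)n + j$, and the matrix dimensions are arbitrary.

```python
import math

# Floor, ceiling, and truncation on the sample values from parts (a)-(c).
print(math.floor(3.8), math.floor(-3.8))    # 3 -4
print(math.ceil(3.01), math.ceil(-3.7))     # 4 -3
print(math.trunc(3.78), math.trunc(-3.78))  # 3 -3


def access_row_major(i, j, n):
    """1-based position of entry a_ij when an m x n matrix is stored
    row by row, with a_11 in location 1."""
    return (i - 1) * n + j


m, n = 4, 5  # arbitrary dimensions for the check
positions = [
    access_row_major(i, j, n)
    for i in range(1, m + 1)
    for j in range(1, n + 1)
]
print(positions == list(range(1, m * n + 1)))  # True: positions 1..mn, in order
print(access_row_major(3, 4, n) == 2 * n + 4)  # True: a_34 sits at 2n + 4
```

Many languages index arrays from 0 rather than 1; in that setting the corresponding offset would be $(i - 1)n + (j - 1)$.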

EXAMPLE 5.11
We may use the floor and ceiling functions in parts (a) and (b), respectively, of Example 5.10 to restate some of the ideas we examined in Chapter 4.

a) When studying the division algorithm, we learned that for all $a, b \in \mathbf{Z}$, where $b > 0$, it was possible to find unique $q, r \in \mathbf{Z}$ with $a = qb + r$ and $0 \le r < b$. Now we may add that $q = \lfloor a/b \rfloor$ and $r = a - \lfloor a/b \rfloor b$.
b) In Example 4.44 we found that the positive integer

$$29{,}338{,}848{,}000 = 2^8 3^5 5^3 7^3 11$$

has

$$60 = (5)(3)(2)(2)(1) = \left\lceil \frac{8+1}{2} \right\rceil \left\lceil \frac{5+1}{2} \right\rceil \left\lceil \frac{3+1}{2} \right\rceil \left\lceil \frac{3+1}{2} \right\rceil \left\lceil \frac{1+1}{2} \right\rceil$$

positive divisors that are perfect squares. In general, if $n \in \mathbf{Z}^+$ with $n > 1$, we know that we can write

$$n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$$

where $k \in \mathbf{Z}^+$, $p_i$ is prime for all $1 \le i \le k$, $p_i \ne p_j$ for all $1 \le i < j \le k$, and $e_i \in \mathbf{Z}^+$ for all $1 \le i \le k$. This is due to the Fundamental Theorem of Arithmetic. Then if $r \in \mathbf{Z}^+$, we find that the number of positive divisors of $n$ that are perfect $r$th powers is $\prod_{i=1}^{k} \left\lceil \frac{e_i + 1}{r} \right\rceil$. When $r = 1$ we get $\prod_{i=1}^{k} \lceil e_i + 1 \rceil = \prod_{i=1}^{k} (e_i + 1)$, which is the number of positive divisors of $n$.
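Both parts of this example translate into a few lines of integer arithmetic. The following is a sketch under stated assumptions: the function names are hypothetical, and the prime factorization is passed in as a list of exponents rather than computed.

```python
def quotient_remainder(a, b):
    """Division algorithm for b > 0: q = floor(a / b), r = a - q * b."""
    q = a // b  # Python's // rounds toward negative infinity (floor division)
    r = a - q * b
    return q, r


def count_rth_power_divisors(exponents, r):
    """Multiply ceil((e_i + 1) / r) over the exponents e_i of n."""
    count = 1
    for e in exponents:
        count *= -(-(e + 1) // r)  # integer ceiling of (e + 1) / r
    return count


print(quotient_remainder(-17, 5))  # (-4, 3), and 0 <= 3 < 5

# Exponents of 29,338,848,000 = 2^8 * 3^5 * 5^3 * 7^3 * 11
exponents = [8, 5, 3, 3, 1]
print(count_rth_power_divisors(exponents, 2))  # 60 perfect-square divisors
print(count_rth_power_divisors(exponents, 1))  # 1728 positive divisors
```

The case $a = -17$ shows why the floor matters: truncating $-17/5$ toward zero would give $-3$ and a negative remainder.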

EXAMPLE 5.12
In Sections 4.1 and 4.2 we were introduced to the concept of a sequence in conjunction with our study of recursive definitions. We should now realize that a sequence of real numbers $r_1, r_2, r_3, \ldots$ can be thought of as a function $f: \mathbf{Z}^+ \to \mathbf{R}$ where $f(n) = r_n$, for all $n \in \mathbf{Z}^+$. Likewise, an integer sequence $a_0, a_1, a_2, \ldots$ can be defined by means of a function $g: \mathbf{N} \to \mathbf{Z}$ where $g(n) = a_n$, for all $n \in \mathbf{N}$.

In Example 5.9 there are $2^{12} = 4096$ relations from $A$ to $B$. We have examined one function among these relations, and now we wish to count the total number of functions from $A$ to $B$.

For the general case, let $A$, $B$ be nonempty sets with $|A| = m$, $|B| = n$. Consequently, if $A = \{a_1, a_2, a_3, \ldots, a_m\}$ and $B = \{b_1, b_2, b_3, \ldots, b_n\}$, then a typical function $f: A \to B$ can be described by $\{(a_1, x_1), (a_2, x_2), (a_3, x_3), \ldots, (a_m, x_m)\}$. We can select any of the $n$ elements of $B$ for $x_1$ and then do the same for $x_2$. (We can select any element of $B$ for $x_2$ so that the same element of $B$ may be selected for both $x_1$ and $x_2$.) We continue this selection process until one of the $n$ elements of $B$ is finally selected for $x_m$. In this way, using the rule of product, there are $n^m = |B|^{|A|}$ functions from $A$ to $B$.

Therefore, for $A$, $B$ in Example 5.9, there are $4^3 = |B|^{|A|} = 64$ functions from $A$ to $B$, and $3^4 = |A|^{|B|} = 81$ functions from $B$ to $A$. In general, we do not expect $|A|^{|B|}$ to equal $|B|^{|A|}$. Unlike the situation for relations, we cannot always obtain a function from $B$ to $A$ by simply interchanging the components in the ordered pairs of a function from $A$ to $B$ (or vice versa).
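The rule-of-product argument corresponds to choosing one image per domain element, which is what `itertools.product` with `repeat=len(A)` enumerates. A minimal sketch, with `all_functions` as a hypothetical helper that represents each function as a dictionary:

```python
from itertools import product


def all_functions(A, B):
    """Enumerate every function from A to B as a dict {a: f(a)}."""
    domain = sorted(A)
    return [
        dict(zip(domain, images))
        for images in product(sorted(B), repeat=len(domain))
    ]


A = {1, 2, 3}
B = {"w", "x", "y", "z"}
print(len(all_functions(A, B)))  # 64 = 4^3
print(len(all_functions(B, A)))  # 81 = 3^4
```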

Now that we have the concept of a function as a special type of relation, we turn our
                     attention to a special type of function.

Definition 5.5   A function $f: A \to B$ is called one-to-one, or injective, if each element of $B$ appears at most once as the image of an element of $A$.

If $f: A \to B$ is one-to-one, with $A$, $B$ finite, we must have $|A| \le |B|$. For arbitrary sets $A$, $B$, $f: A \to B$ is one-to-one if and only if for all $a_1, a_2 \in A$, $f(a_1) = f(a_2) \Rightarrow a_1 = a_2$.

EXAMPLE 5.13
Consider the function $f: \mathbf{R} \to \mathbf{R}$ where $f(x) = 3x + 7$ for all $x \in \mathbf{R}$. Then for all $x_1, x_2 \in \mathbf{R}$, we find that

$$f(x_1) = f(x_2) \Rightarrow 3x_1 + 7 = 3x_2 + 7 \Rightarrow 3x_1 = 3x_2 \Rightarrow x_1 = x_2,$$

so the given function $f$ is one-to-one.
On the other hand, suppose that $g: \mathbf{R} \to \mathbf{R}$ is the function defined by $g(x) = x^4 - x$ for each real number $x$. Then

$$g(0) = (0)^4 - 0 = 0 \quad \text{and} \quad g(1) = (1)^4 - (1) = 1 - 1 = 0.$$

Consequently, $g$ is not one-to-one, since $g(0) = g(1)$ but $0 \ne 1$; that is, $g$ is not one-to-one because there exist real numbers $x_1, x_2$ where $g(x_1) = g(x_2) \not\Rightarrow x_1 = x_2$.

EXAMPLE 5.14
Let $A = \{1, 2, 3\}$ and $B = \{1, 2, 3, 4, 5\}$. The function

$f = \{(1, 1), (2, 3), (3, 4)\}$

is a one-to-one function from $A$ to $B$;

$g = \{(1, 1), (2, 3), (3, 3)\}$

is a function from $A$ to $B$, but it fails to be one-to-one because $g(2) = g(3)$ but $2 \ne 3$.
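For a function on a finite domain, Definition 5.5 amounts to saying that no image value is repeated. The sketch below checks this for the two functions of this example; the name `is_one_to_one` and the dictionary representation are illustrative choices.

```python
def is_one_to_one(f):
    """f is a dict {a: f(a)}. It is one-to-one exactly when no element of
    the codomain occurs more than once among the images."""
    images = list(f.values())
    return len(images) == len(set(images))


f = {1: 1, 2: 3, 3: 4}
g = {1: 1, 2: 3, 3: 3}
print(is_one_to_one(f))  # True
print(is_one_to_one(g))  # False, because g(2) = g(3)
```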

For $A$, $B$ in Example 5.14 there are $2^{15}$ relations from $A$ to $B$ and $5^3$ of these are functions from $A$ to $B$. The next question we want to answer is how many functions $f: A \to B$ are one-to-one. Again we argue for general finite sets.

With $A = \{a_1, a_2, a_3, \ldots, a_m\}$, $B = \{b_1, b_2, b_3, \ldots, b_n\}$, and $m \le n$, a one-to-one function $f: A \to B$ has the form $\{(a_1, x_1), (a_2, x_2), (a_3, x_3), \ldots, (a_m, x_m)\}$, where there are $n$ choices for $x_1$ (that is, any element of $B$), $n - 1$ choices for $x_2$ (that is, any element of $B$ except the one chosen for $x_1$), $n - 2$ choices for $x_3$, and so on, finishing with $n - (m - 1) = n - m + 1$ choices for $x_m$. By the rule of product, the number of one-to-one functions from $A$ to $B$ is

$$n(n - 1)(n - 2) \cdots (n - m + 1) = \frac{n!}{(n - m)!} = P(n, m) = P(|B|, |A|).$$

Consequently, for $A$, $B$ in Example 5.14, there are $5 \cdot 4 \cdot 3 = 60$ one-to-one functions $f: A \to B$.
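The count $P(n, m)$ can be confirmed by listing the one-to-one functions explicitly: each corresponds to an ordered selection of $m$ distinct elements of $B$. A minimal sketch using the standard library (`math.perm` requires Python 3.8 or later):

```python
from itertools import permutations
from math import perm

A = [1, 2, 3]
B = [1, 2, 3, 4, 5]

# Each ordered choice of len(A) distinct elements of B gives one injection.
injections = [dict(zip(A, images)) for images in permutations(B, len(A))]

print(len(injections))       # 60
print(perm(len(B), len(A)))  # 60 = 5 * 4 * 3
```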

Definition 5.6   If $f: A \to B$ and $A_1 \subseteq A$, then

$$f(A_1) = \{b \in B \mid b = f(a), \text{ for some } a \in A_1\},$$

and $f(A_1)$ is called the image of $A_1$ under $f$.

EXAMPLE 5.15
For $A = \{1, 2, 3, 4, 5\}$ and $B = \{w, x, y, z\}$, let $f: A \to B$ be given by $f = \{(1, w), (2, x), (3, x), (4, y), (5, y)\}$. Then for $A_1 = \{1\}$, $A_2 = \{1, 2\}$, $A_3 = \{1, 2, 3\}$, $A_4 = \{2, 3\}$, and $A_5 = \{2, 3, 4, 5\}$, we find the following corresponding images under $f$:

$f(A_1) = \{f(a) \mid a \in A_1\} = \{f(a) \mid a \in \{1\}\} = \{f(a) \mid a = 1\} = \{f(1)\} = \{w\}$;
$f(A_2) = \{f(a) \mid a \in A_2\} = \{f(a) \mid a \in \{1, 2\}\} = \{f(a) \mid a = 1 \text{ or } 2\} = \{f(1), f(2)\} = \{w, x\}$;
$f(A_3) = \{f(1), f(2), f(3)\} = \{w, x\}$, and $f(A_3) = f(A_2)$ because $f(2) = x = f(3)$;
$f(A_4) = \{x\}$; and $f(A_5) = \{x, y\}$.
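The images computed in this example follow directly from Definition 5.6. A short sketch, in which `image` is a hypothetical helper and $f$ is stored as a dictionary:

```python
def image(f, subset):
    """Image of `subset` under f, where f is a dict {a: f(a)}."""
    return {f[a] for a in subset}


f = {1: "w", 2: "x", 3: "x", 4: "y", 5: "y"}

print(sorted(image(f, {1})))           # ['w']
print(sorted(image(f, {1, 2})))        # ['w', 'x']
print(sorted(image(f, {1, 2, 3})))     # ['w', 'x']
print(sorted(image(f, {2, 3})))        # ['x']
print(sorted(image(f, {2, 3, 4, 5})))  # ['x', 'y']
```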

EXAMPLE 5.16
a) Let $g: \mathbf{R} \to \mathbf{R}$ be given by $g(x) = x^2$. Then $g(\mathbf{R})$ = the range of $g$ = $[0, +\infty)$. The image of $\mathbf{Z}$ under $g$ is $g(\mathbf{Z}) = \{0, 1, 4, 9, 16, \ldots\}$, and for $A_1 = [-2, 1]$ we get $g(A_1) = [0, 4]$.
b) Let $h: \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$ where $h(x, y) = 2x + 3y$. The domain of $h$ is $\mathbf{Z} \times \mathbf{Z}$, not $\mathbf{Z}$, and the codomain is $\mathbf{Z}$. We find, for example, that $h(0, 0) = 2(0) + 3(0) = 0$ and $h(-3, 7) = 2(-3) + 3(7) = 15$. In addition, $h(2, -1) = 2(2) + 3(-1) = 1$, and for each $n \in \mathbf{Z}$, $h(2n, -n) = 2(2n) + 3(-n) = 4n - 3n = n$. Consequently, $h(\mathbf{Z} \times \mathbf{Z})$ = the range of $h$ = $\mathbf{Z}$. For $A_1 = \{(0, n) \mid n \in \mathbf{Z}^+\} = \{0\} \times \mathbf{Z}^+ \subseteq \mathbf{Z} \times \mathbf{Z}$, the image of $A_1$ under $h$ is $h(A_1) = \{3, 6, 9, \ldots\} = \{3n \mid n \in \mathbf{Z}^+\}$.

Our next result deals with the interplay between the images of subsets (of the domain)
                     under a function f and the set operations of union and intersection.

THEOREM 5.2   Let $f: A \to B$, with $A_1, A_2 \subseteq A$. Then

a) $f(A_1 \cup A_2) = f(A_1) \cup f(A_2)$;
b) $f(A_1 \cap A_2) \subseteq f(A_1) \cap f(A_2)$;
c) $f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$ when $f$ is one-to-one.

Proof: We prove part (b) and leave the remaining parts for the reader.
For each $b \in B$, $b \in f(A_1 \cap A_2) \Rightarrow b = f(a)$, for some $a \in A_1 \cap A_2 \Rightarrow [b = f(a)$ for some $a \in A_1]$ and $[b = f(a)$ for some $a \in A_2] \Rightarrow b \in f(A_1)$ and $b \in f(A_2) \Rightarrow b \in f(A_1) \cap f(A_2)$, so $f(A_1 \cap A_2) \subseteq f(A_1) \cap f(A_2)$.
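Parts (a) and (b) can be spot-checked exhaustively on a small function, and the same function shows that the inclusion in part (b) may be proper. The sketch below reuses the function of Example 5.15 as test data; the helpers `image` and `subsets` are hypothetical, and the check illustrates the theorem rather than proving it.

```python
from itertools import combinations


def image(f, subset):
    """Image of `subset` under f, where f is a dict {a: f(a)}."""
    return {f[a] for a in subset}


def subsets(S):
    """All subsets of S, as Python sets."""
    items = sorted(S)
    return [set(c) for k in range(len(items) + 1) for c in combinations(items, k)]


f = {1: "w", 2: "x", 3: "x", 4: "y", 5: "y"}  # not one-to-one
A = set(f)

for A1 in subsets(A):
    for A2 in subsets(A):
        assert image(f, A1 | A2) == image(f, A1) | image(f, A2)  # part (a)
        assert image(f, A1 & A2) <= image(f, A1) & image(f, A2)  # part (b)

# The inclusion in (b) can be strict when f is not one-to-one.
A1, A2 = {2}, {3}
print(image(f, A1 & A2))            # set()
print(image(f, A1) & image(f, A2))  # {'x'}
```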

Definition 5.7   If $f: A \to B$ and $A_1 \subseteq A$, then $f|_{A_1}: A_1 \to B$ is called the restriction of $f$ to $A_1$ if $f|_{A_1}(a) = f(a)$ for all $a \in A_1$.

Definition 5.8   Let $A_1 \subseteq A$ and $f: A_1 \to B$. If $g: A \to B$ and $g(a) = f(a)$ for all $a \in A_1$, then we call $g$ an extension of $f$ to $A$.

EXAMPLE 5.17
For $A = \{1, 2, 3, 4, 5\}$, let $f: A \to \mathbf{R}$ be defined by $f = \{(1, 10), (2, 13), (3, 16), (4, 19), (5, 22)\}$. Let $g: \mathbf{Q} \to \mathbf{R}$ where $g(q) = 3q + 7$ for all $q \in \mathbf{Q}$. Finally, let $h: \mathbf{R} \to \mathbf{R}$ with $h(r) = 3r + 7$ for all $r \in \mathbf{R}$. Then

i) $g$ is an extension of $f$ (from $A$) to $\mathbf{Q}$;
ii) $f$ is the restriction of $g$ (from $\mathbf{Q}$) to $A$;
iii) $h$ is an extension of $f$ (from $A$) to $\mathbf{R}$;
iv) $f$ is the restriction of $h$ (from $\mathbf{R}$) to $A$;
v) $h$ is an extension of $g$ (from $\mathbf{Q}$) to $\mathbf{R}$; and
vi) $g$ is the restriction of $h$ (from $\mathbf{R}$) to $\mathbf{Q}$.

EXAMPLE 5.18
Let $A = \{w, x, y, z\}$, $B = \{1, 2, 3, 4, 5\}$, and $A_1 = \{w, y, z\}$. Let $f: A \to B$, $g: A_1 \to B$ be represented by the diagrams in Fig. 5.5. Then $g = f|_{A_1}$ and $f$ is an extension of $g$ from $A_1$ to $A$. We note that for the given function $g: A_1 \to B$, there are five ways to extend $g$ from $A_1$ to $A$.

Figure 5.5 (arrow diagrams for $f: A \to B$ and $g: A_1 \to B$)
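The count of five extensions in Example 5.18 reflects the fact that only the image of $x$ is left to choose, and $B$ has five elements. The sketch below illustrates this with hypothetical values for $g$, since the arrows of Fig. 5.5 are not reproduced here; `restrict` and `extensions` are illustrative helper names.

```python
from itertools import product


def restrict(f, A1):
    """Restriction of f (a dict) to the subset A1 of its domain."""
    return {a: f[a] for a in A1}


def extensions(g, A, B):
    """All functions A -> B that agree with g on the domain of g."""
    new_elements = sorted(set(A) - set(g))
    result = []
    for images in product(sorted(B), repeat=len(new_elements)):
        f = dict(g)
        f.update(zip(new_elements, images))
        result.append(f)
    return result


A = {"w", "x", "y", "z"}
B = {1, 2, 3, 4, 5}
g = {"w": 1, "y": 3, "z": 5}  # hypothetical values, not read from Fig. 5.5

exts = extensions(g, A, B)
print(len(exts))                                      # 5
print(all(restrict(f, g.keys()) == g for f in exts))  # True
```

More generally, the same reasoning gives $|B|^{|A| - |A_1|}$ extensions when $A$ and $B$ are finite.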

——_
MM                                                                                      i) ANB                                  ii) BNC
1. Determine whether or not each of the following relations is                        iii) AUC                                 iv) BUC
a function. If a relation is a function, find its range.                         b) How are the answers for (i)-(iv) affected if A, B, C C
      a) {(x, y)|x, y eZ, y =x? +7}, arelation from Z to Z                       Z*xZt?
      b) {(x, y)|x, y ER, y? = x), arelation from R toR                     7. Determine each of the following:
      c) {(x, y)|x, y €R, y = 3x + 1}, a relation from R toR                     a)    [2.3 — 1.6]            b)   [2.3] — [1.6]          ce) [3.4]|6.2]
      d) {(x, y)|x, y €Q, xe + y? = ]}, a relation from Q to Q                   d)    |3.4| [6.2]             e) [27]                    f) 2[2]
      e) Ris arelation from A to B where |A| = 5, |B| = 6, and               8. Determine whether each of the following statements is true
      IR| = 6.                                                             or false. If the statement is false, provide a counterexample.
2. Does the formula f(x) = 1/(x? — 2) define a function                         a)    |a| = [a] for alla € Z.
f:R— R?A function f: Z—> R?                                                      b)    la] = [a] forallaeR.
  3. Let A = {1, 2, 3, 4} and B = {x, y, z}. (a) List five func-                 c)    [a] = [a] —1foralaeR-Z.
tions from A to B. (b) How many functions f: A — B are there?                    d) —[a]          = [—a] foralla eR.
(c) How many functions f: A > B are one-to-one? (d) How
                                                                            9. Find all real numbers x such that
many functions g: B — A are there? (e) How many functions
g: B > A are one-to-one? (f) How many functions f: A> B                          a) 71x] = [7x]                            b)    [7x] =7
satisfy f(1) = x? (g) How many functions f: A > B satisfy                        c) |x +7) =x+7                            d) [x +7]
                                                                                                                                  = |x]4+7
fC) = f(2) = x? (h) How many functions f: A > B satisfy                    10. Determine all x € R such that |x| + |x + 5! = [2x].
fC) = x and f(2) = y?
                                                                           11. a) Find all real numbers x where [3x] = 3[x].
  4. If there are 2187 functions f: A —       B and |B] = 3, what                b) Letn € Z* where n > 1. Determine all x € R such that
is |A|?
                                                                                 [nx] =n[x].
5. Let A, B,C CR’ where A = {(x, y)|y =2x + 1}, B=                        12. Forn, k € Z*, prove that [n/k] = [(n — 1)/k] +1.
{(x, y)|y = 3x}, and C = {(x, y}|x — y = 7}. Determine each
                                                                           13. a) Let a €R* where a > 1. Prove that (i) [fa] fa] = 1;
of the following:
                                                                                 and (ii) [la] /a] = 1.
      a) ANB                         b) BNC
                                                                                 b) If ae R* and 0 <a <1, which result(s) in part (a) is
      ce) AUC                        d) BUC                                      (are) true?
6. Let A, B, C CZ’        where A = {(x, y)/y = 2x +1}, B=                14.   Let   a,       G2,Q3,..    . be the integer sequence defined recur-
{(x, y)ly = 3x}, and C = {(x, y)|x — y = 7}.                               sively by

1) $a_1 = 1$; and

2) For all $n \in \mathbf{Z}^+$ where $n \ge 2$, $a_n = 2a_{\lfloor n/2 \rfloor}$.

    a) Determine $a_n$ for all $2 \le n \le 8$.
    b) Prove that $a_n \le n$ for all $n \in \mathbf{Z}^+$.

15. For each of the following functions, determine whether it is one-to-one and determine its range.
    a) $f\colon \mathbf{Z} \to \mathbf{Z}$, $f(x) = 2x + 1$
    b) $f\colon \mathbf{Q} \to \mathbf{Q}$, $f(x) = 2x + 1$
    c) $f\colon \mathbf{Z} \to \mathbf{Z}$, $f(x) = x^3 - x$
    d) $f\colon \mathbf{R} \to \mathbf{R}$, $f(x) = e^x$
    e) $f\colon [-\pi/2, \pi/2] \to \mathbf{R}$, $f(x) = \sin x$
    f) $f\colon [0, \pi] \to \mathbf{R}$, $f(x) = \sin x$

16. Let $f\colon \mathbf{R} \to \mathbf{R}$ where $f(x) = x^2$. Determine $f(A)$ for the following subsets $A$ taken from the domain $\mathbf{R}$.
    a) $A = \{2, 3\}$                b) $A = \{-3, -2, 2, 3\}$
    c) $A = (-3, 3)$                 d) $A = (-3, 2]$
    e) $A = [-7, 2]$                 f) $A = (-4, -3] \cup [5, 6]$

17. Let $A = \{1, 2, 3, 4, 5\}$, $B = \{w, x, y, z\}$, $A_1 = \{2, 3, 5\} \subset A$, and $g\colon A_1 \to B$. In how many ways can $g$ be extended to a function $f\colon A \to B$?

18. Give an example of a function $f\colon A \to B$ and $A_1, A_2 \subseteq A$ for which $f(A_1 \cap A_2) \ne f(A_1) \cap f(A_2)$. [Thus the inclusion in Theorem 5.2(b) may be proper.]

19. Prove parts (a) and (c) of Theorem 5.2.

20. If $A = \{1, 2, 3, 4, 5\}$ and there are 6720 injective functions $f\colon A \to B$, what is $|B|$?

21. Let $f\colon A \to B$, where $A = X \cup Y$ with $X \cap Y = \emptyset$. If $f|_X$ and $f|_Y$ are one-to-one, does it follow that $f$ is one-to-one?

22. For $n \in \mathbf{Z}^+$ define $X_n = \{1, 2, 3, \ldots, n\}$. Given $m, n \in \mathbf{Z}^+$, $f\colon X_m \to X_n$ is called monotone increasing if for all $i, j \in X_m$, $1 \le i < j \le m \Rightarrow f(i) \le f(j)$. (a) How many monotone increasing functions are there with domain $X_7$ and codomain $X_5$? (b) Answer part (a) for the domain $X_6$ and codomain $X_9$. (c) Generalize the results in parts (a) and (b). (d) Determine the number of monotone increasing functions $f\colon X_{10} \to X_8$ where $f(4) = 4$. (e) How many monotone increasing functions $f\colon X_7 \to X_{12}$ satisfy $f(5) = 9$? (f) Generalize the results in parts (d) and (e).

23. Determine the access function $f(a_{ij})$, as described in Example 5.10(d), for a matrix $A = (a_{ij})_{m \times n}$, where (a) $m = 12$, $n = 12$; (b) $m = 7$, $n = 10$; (c) $m = 10$, $n = 7$.

24. For the access function developed in Example 5.10(d), the matrix $A = (a_{ij})_{m \times n}$ was stored in a one-dimensional array using the row major implementation. It is also possible to store this matrix using the column major implementation, where each entry $a_{i1}$, $1 \le i \le m$, in the first column of $A$ is stored in locations $1, 2, 3, \ldots, m$, respectively, of the array, when $a_{11}$ is stored in location 1. Then the entries $a_{i2}$, $1 \le i \le m$, of the second column of $A$ are stored in locations $m + 1, m + 2, m + 3, \ldots, 2m$, respectively, of the array, and so on. Find a formula for the access function $g(a_{ij})$ under these conditions.

25. a) Let $A$ be an $m \times n$ matrix that is to be stored (in a contiguous manner) in a one-dimensional array of $r$ entries. Find a formula for the access function if $a_{11}$ is to be stored in location $k$ ($\ge 1$) of the array [as opposed to location 1 as in Example 5.10(d)] and we use (i) the row major implementation; (ii) the column major implementation.
    b) State any conditions involving $m$, $n$, $r$, and $k$ that must be satisfied in order for the results in part (a) to be valid.

26. The following exercise provides a combinatorial proof for a summation formula we have seen in four earlier results: (1) Exercise 22 in Section 1.4; (2) Example 4.4; (3) Exercise 3 in Section 4.1; and (4) Exercise 19 in Section 4.2.
    Let $A = \{a, b, c\}$, $B = \{1, 2, 3, \ldots, n, n + 1\}$, and $S = \{f\colon A \to B \mid f(a) < f(c) \text{ and } f(b) < f(c)\}$.
    a) If $S_1 = \{f\colon A \to B \mid f \in S \text{ and } f(c) = 2\}$, what is $|S_1|$?
    b) If $S_2 = \{f\colon A \to B \mid f \in S \text{ and } f(c) = 3\}$, what is $|S_2|$?
    c) For $1 \le i \le n$, let $S_i = \{f\colon A \to B \mid f \in S \text{ and } f(c) = i + 1\}$. What is $|S_i|$?
    d) Let $T_1 = \{f\colon A \to B \mid f \in S \text{ and } f(a) = f(b)\}$. Explain why $|T_1| = \binom{n+1}{2}$.
    e) Let $T_2 = \{f\colon A \to B \mid f \in S \text{ and } f(a) < f(b)\}$ and $T_3 = \{f\colon A \to B \mid f \in S \text{ and } f(a) > f(b)\}$. Explain why $|T_2| = |T_3| = \binom{n+1}{3}$.
    f) What can we conclude about the sets $S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_n$ and $T_1 \cup T_2 \cup T_3$?
    g) Use the results from parts (c), (d), (e), and (f) to verify that
       $$\sum_{i=1}^{n} i^2 = \frac{n(n + 1)(2n + 1)}{6}.$$

27. One version of Ackermann's function $A(m, n)$ is defined recursively for $m, n \in \mathbf{N}$ by
       $A(0, n) = n + 1$, $n \ge 0$;
       $A(m, 0) = A(m - 1, 1)$, $m > 0$; and
       $A(m, n) = A(m - 1, A(m, n - 1))$, $m, n > 0$.
    [Such functions were defined in the 1920s by the German mathematician and logician Wilhelm Ackermann (1896-1962), who was a student of David Hilbert (1862-1943). These functions play an important role in computer science, both in the theory of recursive functions and in the analysis of algorithms that involve the union of sets.]
    a) Calculate $A(1, 3)$ and $A(2, 3)$.
    b) Prove that $A(1, n) = n + 2$ for all $n \in \mathbf{N}$.

    c) For all $n \in \mathbf{N}$ show that $A(2, n) = 3 + 2n$.
    d) Verify that $A(3, n) = 2^{n+3} - 3$ for all $n \in \mathbf{N}$.

28. Given sets $A$, $B$, we define a partial function $f$ with domain $A$ and codomain $B$ as a function from $A'$ to $B$, where $\emptyset \ne A' \subset A$. [Here $f(x)$ is not defined for $x \in A - A'$.] For example, $f\colon \mathbf{R}^* \to \mathbf{R}$, where $f(x) = 1/x$, is a partial function on $\mathbf{R}$ since $f(0)$ is not defined. On the finite side, $\{(1, x), (2, x), (3, y)\}$ is a partial function for domain $A = \{1, 2, 3, 4, 5\}$ and codomain $B = \{w, x, y, z\}$. Furthermore, a computer program may be thought of as a partial function. The program's input is the input for the partial function and the program's output is the output of the function. Should the program fail to terminate, or terminate abnormally (perhaps because of an attempt to divide by 0), then the partial function is considered to be undefined for that input. (a) For $A = \{1, 2, 3, 4, 5\}$, $B = \{w, x, y, z\}$, how many partial functions have domain $A$ and codomain $B$? (b) Let $A$, $B$ be sets where $|A| = m > 0$, $|B| = n > 0$. How many partial functions have domain $A$ and codomain $B$?

5.3 Onto Functions: Stirling Numbers of the Second Kind
                                  The results we develop in this section will provide the answers to the first five problems
                                  stated at the beginning of this chapter. We find that the onto function is the key to all of the
                                  answers.

Definition 5.9
A function $f\colon A \to B$ is called onto, or surjective, if $f(A) = B$, that is, if for all $b \in B$ there is at least one $a \in A$ with $f(a) = b$.

EXAMPLE 5.19
The function $f\colon \mathbf{R} \to \mathbf{R}$ defined by $f(x) = x^3$ is an onto function. For here we find that if $r$ is any real number in the codomain of $f$, then the real number $\sqrt[3]{r}$ is in the domain of $f$ and $f(\sqrt[3]{r}) = (\sqrt[3]{r})^3 = r$. Hence the codomain of $f$ = $\mathbf{R}$ = the range of $f$, and the function $f$ is onto.

The function $g\colon \mathbf{R} \to \mathbf{R}$, where $g(x) = x^2$ for each real number $x$, is not an onto function. In this case no negative real number appears in the range of $g$. For example, for $-9$ to be in the range of $g$, we would have to be able to find a real number $r$ with $g(r) = r^2 = -9$. Unfortunately, $r^2 = -9 \Rightarrow r = 3i$ or $r = -3i$, where $3i, -3i \in \mathbf{C}$, but $3i, -3i \notin \mathbf{R}$. So here the range of $g$ = $g(\mathbf{R}) = [0, +\infty) \subset \mathbf{R}$, and the function $g$ is not onto. Note, however, that the function $h\colon \mathbf{R} \to [0, +\infty)$ defined by $h(x) = x^2$ is an onto function.

EXAMPLE 5.20
Consider the function $f\colon \mathbf{Z} \to \mathbf{Z}$ where $f(x) = 3x + 1$ for each $x \in \mathbf{Z}$. Here the range of $f$ = $\{\ldots, -8, -5, -2, 1, 4, 7, \ldots\} \subset \mathbf{Z}$, so $f$ is not an onto function. If we examine the situation here a little more closely, we find that the integer 8, for example, is not in the range of $f$ even though the equation

$$3x + 1 = 8$$

can be easily solved, giving us $x = 7/3$. But that is the problem, for the rational number $7/3$ is not an integer, so there is no $x$ in the domain $\mathbf{Z}$ with $f(x) = 8$.
On the other hand, each of the functions

   1) $g\colon \mathbf{Q} \to \mathbf{Q}$, where $g(x) = 3x + 1$ for $x \in \mathbf{Q}$; and
   2) $h\colon \mathbf{R} \to \mathbf{R}$, where $h(x) = 3x + 1$ for $x \in \mathbf{R}$

is an onto function. Furthermore, $3x_1 + 1 = 3x_2 + 1 \Rightarrow 3x_1 = 3x_2 \Rightarrow x_1 = x_2$, regardless of whether $x_1$ and $x_2$ are integers, rational numbers, or real numbers. Consequently, all three of the functions $f$, $g$, and $h$ are one-to-one.

EXAMPLE 5.21
If $A = \{1, 2, 3, 4\}$ and $B = \{x, y, z\}$, then

$$f_1 = \{(1, z), (2, y), (3, x), (4, y)\} \quad\text{and}\quad f_2 = \{(1, x), (2, x), (3, y), (4, z)\}$$

are both functions from $A$ onto $B$. However, the function $g = \{(1, x), (2, x), (3, y), (4, y)\}$ is not onto, because $g(A) = \{x, y\} \subset B$.
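
The three functions of Example 5.21 can also be checked mechanically. The following is a minimal Python sketch, assuming each finite function is stored as a dictionary from domain elements to their images; the helper name `is_onto` is our own choice for illustration and is not notation from the text.

```python
def is_onto(f, codomain):
    """Return True when every element of the codomain is an image under f."""
    return set(f.values()) == set(codomain)

B = {"x", "y", "z"}
f1 = {1: "z", 2: "y", 3: "x", 4: "y"}
f2 = {1: "x", 2: "x", 3: "y", 4: "z"}
g = {1: "x", 2: "x", 3: "y", 4: "y"}

for name, func in [("f1", f1), ("f2", f2), ("g", g)]:
    # The range f(A) is the set of dictionary values.
    print(name, sorted(set(func.values())), is_onto(func, B))
```

Comparing the set of values with the codomain is exactly the condition f(A) = B of Definition 5.9.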

If $A$, $B$ are finite sets, then for an onto function $f\colon A \to B$ to possibly exist we must have $|A| \ge |B|$. Considering the development in the first two sections of this chapter, the reader undoubtedly feels it is time once again to use the rule of product and count the number of onto functions $f\colon A \to B$ where $|A| = m \ge n = |B|$. Unfortunately, the rule of product proves inadequate here. We shall obtain the needed result for some specific examples and then conjecture a general formula. In Chapter 8 we shall establish the conjecture using the Principle of Inclusion and Exclusion.

EXAMPLE 5.22
If $A = \{x, y, z\}$ and $B = \{1, 2\}$, then all functions $f\colon A \to B$ are onto except $f_1 = \{(x, 1), (y, 1), (z, 1)\}$ and $f_2 = \{(x, 2), (y, 2), (z, 2)\}$, the constant functions. So there are $|B|^{|A|} - 2 = 2^3 - 2 = 6$ onto functions from $A$ to $B$.

In general, if $|A| = m \ge 2$ and $|B| = 2$, then there are $2^m - 2$ onto functions from $A$ to $B$. (Does this formula tell us anything when $m = 1$?)

EXAMPLE 5.23
For $A = \{w, x, y, z\}$ and $B = \{1, 2, 3\}$, there are $3^4$ functions from $A$ to $B$. Considering subsets of $B$ of size 2, there are $2^4$ functions from $A$ to $\{1, 2\}$, $2^4$ functions from $A$ to $\{2, 3\}$, and $2^4$ functions from $A$ to $\{1, 3\}$. So we have $3(2^4) = \binom{3}{2}2^4$ functions from $A$ to $B$ that are definitely not onto. However, before we acknowledge $3^4 - \binom{3}{2}2^4$ as the final answer, we must realize that not all of these $\binom{3}{2}2^4$ functions are distinct. For when we consider all the functions from $A$ to $\{1, 2\}$, we are removing, among these, the function $\{(w, 2), (x, 2), (y, 2), (z, 2)\}$. Then, considering the functions from $A$ to $\{2, 3\}$, we remove the same function: $\{(w, 2), (x, 2), (y, 2), (z, 2)\}$. Consequently, in the result $3^4 - \binom{3}{2}2^4$, we have twice removed each of the constant functions $f\colon A \to B$, where $f(A)$ is one of the sets $\{1\}$, $\{2\}$, or $\{3\}$. Adjusting our present result for this, we find that there are $3^4 - \binom{3}{2}2^4 + 3 = \binom{3}{3}3^4 - \binom{3}{2}2^4 + \binom{3}{1}1^4 = 36$ onto functions from $A$ to $B$.

Keeping $B = \{1, 2, 3\}$, for any set $A$ with $|A| = m \ge 3$, there are $\binom{3}{3}3^m - \binom{3}{2}2^m + \binom{3}{1}1^m$ functions from $A$ onto $B$. (What result does this formula yield when $m = 1$? when $m = 2$?)
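
The double counting described in Example 5.23 can be made visible by listing the functions explicitly. The sketch below is only an illustration under our own encoding: each function from A = {w, x, y, z} to B is represented as a 4-tuple (f(w), f(x), f(y), f(z)), and the name `functions_into` is hypothetical.

```python
from itertools import product

A = ["w", "x", "y", "z"]
B = [1, 2, 3]

def functions_into(values):
    """All functions from A into the given values, each as a tuple of images."""
    return set(product(values, repeat=len(A)))

all_functions = functions_into(B)            # 3^4 functions
into_12 = functions_into([1, 2])             # 2^4 functions each
into_23 = functions_into([2, 3])
into_13 = functions_into([1, 3])

naive_total = len(into_12) + len(into_23) + len(into_13)
not_onto = into_12 | into_23 | into_13       # set union removes the repeats
overcount = naive_total - len(not_onto)      # the constant functions counted twice
onto = all_functions - not_onto

print(len(all_functions), naive_total, len(not_onto), overcount, len(onto))
```

The gap between the naive total and the size of the union is the correction term that is added back in the example.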

The last two examples suggest a pattern that we now state, without proof, as our general
               formula.

For finite sets $A$, $B$ with $|A| = m$ and $|B| = n$, there are

$$\binom{n}{n}n^m - \binom{n}{n-1}(n-1)^m + \binom{n}{n-2}(n-2)^m - \cdots + (-1)^{n-2}\binom{n}{2}2^m + (-1)^{n-1}\binom{n}{1}1^m$$
$$= \sum_{k=0}^{n-1}(-1)^k\binom{n}{n-k}(n-k)^m = \sum_{k=0}^{n}(-1)^k\binom{n}{n-k}(n-k)^m$$

onto functions from $A$ to $B$.

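EXAMPLE 5.24
Let $A = \{1, 2, 3, 4, 5, 6, 7\}$ and $B = \{w, x, y, z\}$. Applying the general formula with $m = 7$ and $n = 4$, we find that there are

$$\binom{4}{4}4^7 - \binom{4}{3}3^7 + \binom{4}{2}2^7 - \binom{4}{1}1^7 = \sum_{k=0}^{3}(-1)^k\binom{4}{4-k}(4-k)^7 = \sum_{k=0}^{4}(-1)^k\binom{4}{4-k}(4-k)^7 = 8400$$

functions from $A$ onto $B$.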
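
The general formula translates directly into a short computation. The following is a sketch in Python, assuming the standard-library function `math.comb` for the binomial coefficients; the name `onto_count` is ours and does not appear in the text.

```python
from math import comb

def onto_count(m: int, n: int) -> int:
    """Number of onto functions from an m-element set to an n-element set,
    evaluated as sum_{k=0}^{n} (-1)^k C(n, n-k) (n-k)^m."""
    return sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))

# The individual signed terms for m = 7, n = 4, in the order k = 0, 1, 2, 3, 4.
terms = [(-1) ** k * comb(4, 4 - k) * (4 - k) ** 7 for k in range(5)]
print(terms)
print(onto_count(7, 4))
```

The term for k = n is always 0 when m > 0, which is why the sums up to n - 1 and up to n agree.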

The result in Example 5.24 is also the answer to the first three questions proposed at the start of this chapter. Once we remove the unnecessary vocabulary, we recognize that in all three cases we want to distribute seven different objects into four distinct containers with no container left empty. We can do this in terms of onto functions.

For Problem 4 we have a sample space $\mathcal{S}$ consisting of the $4^7 = 16{,}384$ ways in which seven people can each select one of the four floors. (Note that $4^7$ is also the total number of functions $f\colon A \to B$ where $|A| = 7$, $|B| = 4$.) The event that we are concerned with contains 8400 of those selections, so the probability that the elevator must stop at every floor is $8400/16384 \approx 0.5127$, slightly more than half of the time.

Finally, for Problem 5, since $\sum_{k=0}^{n}(-1)^k\binom{n}{n-k}(n-k)^m$ is the number of onto functions $f\colon A \to B$ for $|A| = m$, $|B| = n$, for the case where $m < n$ there are no such functions and the summation is 0.

Problem 6 will be addressed in Section 5.6.

Before going on to anything new, however, we consider one more problem.

EXAMPLE 5.25
At the CH Company, Joan, the supervisor, has a secretary, Teresa, and three other administrative assistants. If seven accounts must be processed, in how many ways can Joan assign the accounts so that each assistant works on at least one account and Teresa's work includes the most expensive account?

First and foremost, the answer is not 8400 as in Example 5.24. Here we must consider two disjoint subcases and then apply the rule of sum.

   a) If Teresa, the secretary, works only on the most expensive account, then the other six accounts can be distributed among the three administrative assistants in $\sum_{k=0}^{3}(-1)^k\binom{3}{3-k}(3-k)^6 = 540$ ways. (540 = the number of onto functions $f\colon A \to B$ with $|A| = 6$, $|B| = 3$.)

   b) If Teresa does more than just the most expensive account, the assignments can be made in $\sum_{k=0}^{4}(-1)^k\binom{4}{4-k}(4-k)^6 = 1560$ ways. (1560 = the number of onto functions $g\colon C \to D$ with $|C| = 6$, $|D| = 4$.)

Consequently, the assignments can be given under the prescribed conditions in $540 + 1560 = 2100$ ways. [We mentioned earlier that the answer would not be 8400, but it is $(1/4)(8400) = (1/|B|)(8400)$, where 8400 is the number of onto functions $f\colon A \to B$, with $|A| = 7$ and $|B| = 4$. This is no coincidence, as we shall learn when we discuss Theorem 5.3.]
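
The two subcases of Example 5.25 can be confirmed by exhaustive enumeration, since there are only 4^7 possible assignments. The sketch below rests on our own labelling: person 0 stands for Teresa, persons 1 to 3 for the other assistants, and account 0 for the most expensive account. None of these labels come from the text.

```python
from itertools import product

NUM_PEOPLE = 4       # person 0 plays the role of Teresa (our labelling)
NUM_ACCOUNTS = 7     # account 0 plays the role of the most expensive account

teresa_only_top = 0  # subcase (a): Teresa handles account 0 and nothing else
teresa_more = 0      # subcase (b): Teresa handles account 0 and at least one more

for assignment in product(range(NUM_PEOPLE), repeat=NUM_ACCOUNTS):
    if assignment[0] != 0:
        continue                      # most expensive account must go to Teresa
    if len(set(assignment)) != NUM_PEOPLE:
        continue                      # everyone must receive at least one account
    if assignment.count(0) == 1:
        teresa_only_top += 1
    else:
        teresa_more += 1

print(teresa_only_top, teresa_more, teresa_only_top + teresa_more)
```

Brute force is practical here only because the sample space is small; the formula for onto functions remains the method that scales.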

We now continue our discussion with the distribution of distinct objects into containers
                 with none left empty, but now the containers become identical.

EXAMPLE 5.26
If $A = \{a, b, c, d\}$ and $B = \{1, 2, 3\}$, then there are 36 onto functions from $A$ to $B$ or, equivalently, 36 ways to distribute four distinct objects into three distinguishable containers, with no container empty (and no regard for the location of objects in a given container). Among these 36 distributions we find the following collection of six (one of six such possible collections of six):

   1) $\{a, b\}_1$  $\{c\}_2$  $\{d\}_3$          2) $\{a, b\}_1$  $\{d\}_2$  $\{c\}_3$
   3) $\{c\}_1$  $\{a, b\}_2$  $\{d\}_3$          4) $\{c\}_1$  $\{d\}_2$  $\{a, b\}_3$
   5) $\{d\}_1$  $\{a, b\}_2$  $\{c\}_3$          6) $\{d\}_1$  $\{c\}_2$  $\{a, b\}_3$

where, for example, the notation $\{c\}_2$ means that $c$ is in the second container. Now if we no longer distinguish the containers, these $6 = 3!$ distributions become identical, so there are $36/(3!) = 6$ ways to distribute the distinct objects $a$, $b$, $c$, $d$ among three identical containers, leaving no container empty.
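
To see the six distributions into identical containers explicitly, one can generate them. The following Python sketch is one possible way to do so, not a method taken from the text: the hypothetical generator `partitions_into_blocks` places the first object either alone in a new container or into one of the containers already formed by the remaining objects.

```python
from math import factorial

def partitions_into_blocks(items, k):
    """Yield every way to split items into exactly k nonempty unlabeled blocks."""
    if not items:
        if k == 0:
            yield []
        return
    first, rest = items[0], items[1:]
    # Case 1: first occupies a block by itself.
    for part in partitions_into_blocks(rest, k - 1):
        yield [[first]] + part
    # Case 2: first joins one of the k blocks formed by the other items.
    for part in partitions_into_blocks(rest, k):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

distributions = list(partitions_into_blocks(["a", "b", "c", "d"], 3))
for d in distributions:
    print(d)
print(len(distributions), len(distributions) * factorial(3))
```

Each listed distribution corresponds to 3! labelled distributions, which recovers the 36 onto functions.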

For $m \ge n$ there are $\sum_{k=0}^{n}(-1)^k\binom{n}{n-k}(n-k)^m$ ways to distribute $m$ distinct objects into $n$ numbered (but otherwise identical) containers with no container left empty. Removing the numbers on the containers, so that they are now identical in appearance, we find that one distribution into these $n$ (nonempty) identical containers corresponds with $n!$ such distributions into the numbered containers. So the number of ways in which it is possible to distribute the $m$ distinct objects into $n$ identical containers, with no container left empty, is

$$\frac{1}{n!}\sum_{k=0}^{n}(-1)^k\binom{n}{n-k}(n-k)^m.$$

This will be denoted by $S(m, n)$ and is called a Stirling number of the second kind.

We note that for $|A| = m \ge n = |B|$, there are $n! \cdot S(m, n)$ onto functions from $A$ to $B$.
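
As a small illustration of this definition, the sketch below evaluates S(m, n) straight from the displayed sum. It assumes Python's `math.comb` and `math.factorial`; the function name `stirling2` is our own label.

```python
from math import comb, factorial

def stirling2(m: int, n: int) -> int:
    """S(m, n) computed as (1/n!) * sum_{k=0}^{n} (-1)^k C(n, n-k) (n-k)^m."""
    total = sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))
    return total // factorial(n)   # the sum counts onto functions, a multiple of n!

print(stirling2(4, 3))                  # the distributions of Example 5.26
print(factorial(4) * stirling2(7, 4))   # the onto functions of Example 5.24
```

Integer division is safe here because the sum equals n! times S(m, n).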

Table 5.1 lists some Stirling numbers of the second kind.

Table 5.1

                          S(m, n)
   m \ n |   1     2     3      4      5     6    7    8
   ------+------------------------------------------------
     1   |   1
     2   |   1     1
     3   |   1     3     1
     4   |   1     7     6      1
     5   |   1    15    25     10      1
     6   |   1    31    90     65     15     1
     7   |   1    63   301    350    140    21    1
     8   |   1   127   966   1701   1050   266   28    1

EXAMPLE 5.27
For $m \ge n$, $\sum_{i=1}^{n} S(m, i)$ is the number of possible ways to distribute $m$ distinct objects into $n$ identical containers with empty containers allowed. From the fourth row of Table 5.1 we see that there are $1 + 7 + 6 = 14$ ways to distribute the objects $a$, $b$, $c$, $d$ among three identical containers, with some container(s) possibly empty.

We continue now with the derivation of an identity involving Stirling numbers of the
                          second kind. The proof is combinatorial in nature.

THEOREM 5.3
Let $m$, $n$ be positive integers with $1 < n \le m$. Then

$$S(m + 1, n) = S(m, n - 1) + nS(m, n).$$

Proof: Let $A = \{a_1, a_2, \ldots, a_m, a_{m+1}\}$. Then $S(m + 1, n)$ counts the number of ways in which the objects of $A$ can be distributed among $n$ identical containers, with no container left empty.

There are $S(m, n - 1)$ ways of distributing $a_1, a_2, \ldots, a_m$ among $n - 1$ identical containers, with none left empty. Then, placing $a_{m+1}$ in the remaining empty container results in $S(m, n - 1)$ of the distributions counted in $S(m + 1, n)$, namely, those distributions where $a_{m+1}$ is in a container by itself. Alternatively, distributing $a_1, a_2, \ldots, a_m$ among the $n$ identical containers with none left empty, we have $S(m, n)$ distributions. Now, however, for each of these $S(m, n)$ distributions the $n$ containers become distinguished by their contents. Selecting one of the $n$ distinct containers for $a_{m+1}$, we have $nS(m, n)$ distributions of the total $S(m + 1, n)$, namely, those where $a_{m+1}$ is in the same container as another object from $A$. The result then follows by the rule of sum.

To illustrate Theorem 5.3 consider the triangle formed in Table 5.1 by the entries 63, 301, and 966. Here the largest number corresponds with $S(m + 1, n)$, for $m = 7$ and $n = 3$, and we see that $S(7 + 1, 3) = 966 = 63 + 3(301) = S(7, 2) + 3S(7, 3)$. The identity in Theorem 5.3 can be used to extend Table 5.1 if necessary.
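
Extending the table by the recurrence is easy to automate. The following Python sketch builds successive rows from S(m + 1, n) = S(m, n - 1) + nS(m, n). Two conventions in it are our own assumptions rather than statements from the text: the missing neighbors S(m, 0) and S(m, m + 1) are treated as 0, so that the boundary entries S(m + 1, 1) and S(m + 1, m + 1) both come out as 1. The name `stirling_table` is hypothetical.

```python
def stirling_table(max_m):
    """Return {m: [S(m, 1), ..., S(m, m)]} for 1 <= m <= max_m."""
    table = {1: [1]}                                  # S(1, 1) = 1
    for m in range(1, max_m):
        prev = table[m]                               # prev[n - 1] holds S(m, n)
        row = []
        for n in range(1, m + 2):
            left = prev[n - 2] if n >= 2 else 0       # S(m, n - 1); taken as 0 when n = 1
            same = prev[n - 1] if n <= m else 0       # S(m, n); taken as 0 when n = m + 1
            row.append(left + n * same)
        table[m + 1] = row
    return table

table = stirling_table(10)
for m in range(1, 9):
    print(m, table[m])
```

Rows 1 through 8 printed by this loop can be compared with Table 5.1, and the sum of the first three entries of row 4 gives the count in Example 5.27.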
If we multiply the result in Theorem 5.3 by $(n - 1)!$ we have

$$\left(\frac{1}{n}\right)[n!\,S(m + 1, n)] = [(n - 1)!\,S(m, n - 1)] + [n!\,S(m, n)].$$

This new form of the equation tells us something about numbers of onto functions. If $A = \{a_1, a_2, \ldots, a_m, a_{m+1}\}$ and $B = \{b_1, b_2, \ldots, b_{n-1}, b_n\}$ with $m \ge n - 1$, then

   $\left(\frac{1}{n}\right)$ (The number of onto functions $h\colon A \to B$)
        = (The number of onto functions $f\colon A - \{a_{m+1}\} \to B - \{b_n\}$)
          + (The number of onto functions $g\colon A - \{a_{m+1}\} \to B$).

Thus the relationship at the end of Example 5.25 is not just a coincidence.
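
As a worked instance of this identity (our own check, using only values from Table 5.1 and Examples 5.24 and 5.25), take m = 6 and n = 4:

```latex
\begin{align*}
\tfrac{1}{4}\bigl[4!\,S(7,4)\bigr] &= \tfrac{1}{4}(24 \cdot 350) = \tfrac{1}{4}(8400) = 2100,\\
3!\,S(6,3) + 4!\,S(6,4) &= 6 \cdot 90 + 24 \cdot 65 = 540 + 1560 = 2100.
\end{align*}
```

The three quantities 8400, 540, and 1560 are exactly the numbers of onto functions that appeared in those examples.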

We close this section with an application that deals with a counting problem in which the
                               Stirling numbers of the second kind are used in conjunction with the Fundamental Theorem
                               of Arithmetic.

EXAMPLE 5.28
Consider the positive integer $30{,}030 = 2 \times 3 \times 5 \times 7 \times 11 \times 13$. Among the unordered factorizations of this number one finds

     i) $30 \times 1001 = (2 \times 3 \times 5)(7 \times 11 \times 13)$
    ii) $110 \times 273 = (2 \times 5 \times 11)(3 \times 7 \times 13)$
   iii) $2310 \times 13 = (2 \times 3 \times 5 \times 7 \times 11)(13)$
    iv) $14 \times 33 \times 65 = (2 \times 7)(3 \times 11)(5 \times 13)$
     v) $22 \times 35 \times 39 = (2 \times 11)(5 \times 7)(3 \times 13)$

The results given in (i), (ii), and (iii) demonstrate three of the ways to distribute the six distinct objects 2, 3, 5, 7, 11, 13 into two identical containers with no container left empty. So these first three examples are three of the $S(6, 2) = 31$ unordered two-factor factorizations of 30,030; that is, there are $S(6, 2)$ ways to factor 30,030 as $mn$ where $m, n \in \mathbf{Z}^+$ for $1 < m, n < 30{,}030$ and where order is not relevant. Likewise, the results in (iv) and (v) are two of the $S(6, 3) = 90$ unordered ways to factor 30,030 into three integer factors, each greater than 1. If we want at least two factors (greater than 1) in each of these unordered factorizations, then we find that there are $\sum_{i=2}^{6} S(6, i) = 202$ such factorizations. If we want to include the one-factor factorization 30,030, where we distribute the six distinct objects 2, 3, 5, 7, 11, 13 into one (identical) container, then we have 203 such factorizations in total.
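
The counts in Example 5.28 can be cross-checked without Stirling numbers by listing the factorizations directly. The sketch below is an independent brute-force check of our own design: the hypothetical generator `factorizations` produces each unordered factorization once by keeping its factors in nondecreasing order.

```python
from collections import Counter

def factorizations(n, smallest=2):
    """Yield each unordered factorization of n as a nondecreasing list of
    factors, every factor at least `smallest` (callers ensure n >= smallest)."""
    yield [n]                          # n itself as a single factor
    d = smallest
    while d * d <= n:                  # d <= n // d keeps the factors nondecreasing
        if n % d == 0:
            for tail in factorizations(n // d, d):
                yield [d] + tail
        d += 1

by_length = Counter(len(f) for f in factorizations(30030))
print(sorted(by_length.items()))
print(sum(by_length.values()))
```

Grouping the results by the number of factors gives one count per container total, which can be compared with the sixth row of Table 5.1.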

1. Give an example of finite sets $A$ and $B$ with $|A|, |B| \ge 4$ and a function $f\colon A \to B$ such that (a) $f$ is neither one-to-one nor onto; (b) $f$ is one-to-one but not onto; (c) $f$ is onto but not one-to-one; (d) $f$ is onto and one-to-one.

2. For each of the following functions $f\colon \mathbf{Z} \to \mathbf{Z}$, determine whether the function is one-to-one and whether it is onto. If the function is not onto, determine the range $f(\mathbf{Z})$.
    a) $f(x) = x + 7$               b) $f(x) = 2x - 3$
    c) $f(x) = -x + 5$              d) $f(x) = x^2$
    e) $f(x) = x^2 + x$             f) $f(x) = x^3$

3. For each of the following functions $g\colon \mathbf{R} \to \mathbf{R}$, determine whether the function is one-to-one and whether it is onto. If the function is not onto, determine the range $g(\mathbf{R})$.
    a) $g(x) = x + 7$               b) $g(x) = 2x - 3$
    c) $g(x) = -x + 5$              d) $g(x) = x^2$
    e) $g(x) = x^2 + x$             f) $g(x) = x^3$

4. Let $A = \{1, 2, 3, 4\}$ and $B = \{1, 2, 3, 4, 5, 6\}$. (a) How many functions are there from $A$ to $B$? How many of these are one-to-one? How many are onto? (b) How many functions are there from $B$ to $A$? How many of these are onto? How many are one-to-one?

5. Verify that $\sum_{k=0}^{n}(-1)^k\binom{n}{n-k}(n-k)^m = 0$ for $n = 5$ and $m = 2, 3, 4$.

6. a) Verify that 5’ = $°°_, F)G)SC, i).                                       or more factors, each greater than 1, where the order of the

b) Provide a combinatorial argument to prove that for all                factors is not relevant?
      mneZt,                                                             14. Write a computer program (or develop an algorithm) to
                                                                         compute the Stirling numbers S(m, n) when 1 < m < 12 and
                            na         my,     :
                        m        D   (7 Jansen.                          l<n<m.

15. A lock has n buttons labeled 1, 2, . .. , 2. To open this lock
7. a) Let A= {1,2,3,4,5,6, 7} and B= {v, w, x, y, Z}.
                                                                         we press each of the n buttons exactly once. If no two or more
    Determine the number of functions f: A — B where (i)
                                                                         buttons may be pressed simultaneously, then there are n! ways
      F(A) = {v, x}; Gi) | f(A)| = 2; (i) f(A) = {w, x, y}s Gv)          to do this. However, if one may press two or more buttons si-
$|f(A)| = 3$; (v) $f(A) = \{v, x, y, z\}$; and (vi) $|f(A)| = 4$.
      b) Let A, B be sets with $|A| = m \ge n = |B|$. If $k \in \mathbf{Z}^+$ with $1 \le k \le n$, how many functions $f: A \to B$ are such that $|f(A)| = k$?

  8. A chemist who has five assistants is engaged in a research project that calls for nine compounds that must be synthesized. In how many ways can the chemist assign these syntheses to the five assistants so that each is working on at least one synthesis?

  9. Use the fact that every polynomial equation having real-number coefficients and odd degree has a real root in order to show that the function $f: \mathbf{R} \to \mathbf{R}$, defined by $f(x) = x^5 - 2x^2 + x$, is an onto function. Is f one-to-one?

10. Suppose we have seven different colored balls and four containers numbered I, II, III, and IV. (a) In how many ways can we distribute the balls so that no container is left empty? (b) In this collection of seven colored balls, one of them is blue. In how many ways can we distribute the balls so that no container is empty and the blue ball is in container II? (c) If we remove the numbers from the containers so that we can no longer distinguish them, in how many ways can we distribute the seven colored balls among the four identical containers, with some container(s) possibly empty?

11. Determine the next two rows (m = 9, 10) of Table 5.1 for the Stirling numbers S(m, n), where $1 \le n \le m$.

12. a) In how many ways can 31,100,905 be factored into three factors, each greater than 1, if the order of the factors is not relevant?
      b) Answer part (a), assuming the order of the three factors is relevant.
      c) In how many ways can one factor 31,100,905 into two or more factors where each factor is greater than 1 and no regard is paid to the order of the factors?
      d) Answer part (c), assuming the order of the factors is to be taken into consideration.

13. a) How many two-factor unordered factorizations, where each factor is greater than 1, are there for 156,009?
      b) In how many ways can 156,009 be factored into two or more factors, each greater than 1, with no regard to the order of the factors?
      c) Let $p_1, p_2, p_3, \ldots, p_n$ be n distinct primes. In how many ways can one factor the product $\prod_{i=1}^{n} p_i$ into two

multaneously, then there are more than n! ways to press all of the buttons. For instance, if n = 3 there are six ways to press the buttons one at a time. But if one may also press two or more buttons simultaneously, then we find 13 cases, namely,

      (1) 1, 2, 3          (2) 1, 3, 2          (3) 2, 1, 3
      (4) 2, 3, 1          (5) 3, 1, 2          (6) 3, 2, 1
      (7) {1, 2}, 3        (8) 3, {1, 2}        (9) {1, 3}, 2
     (10) 2, {1, 3}       (11) {2, 3}, 1       (12) 1, {2, 3}
     (13) {1, 2, 3}.

[Here, for example, case (12) indicates that one presses button 1 first and then buttons 2, 3 (together) second.] (a) How many ways are there to press the buttons when n = 4? n = 5? How many for n in general? (b) Suppose a lock has 15 buttons. To open this lock one must press 12 different buttons (one at a time, or simultaneously in sets of two or more). In how many ways can this be done?

16. At St. Xavier High School ten candidates $C_1, C_2, \ldots, C_{10}$ run for senior class president.
      a) How many outcomes are possible where (i) there are no ties (that is, no two, or more, candidates receive the same number of votes)? (ii) ties are permitted? [Here we may have an outcome such as $\{C_2, C_3, C_7\}$, $\{C_1, C_4, C_9, C_{10}\}$, $\{C_5\}$, $\{C_6, C_8\}$, where $C_2, C_3, C_7$ tie for first place, $C_1, C_4, C_9, C_{10}$ tie for fourth place, $C_5$ is in eighth place, and $C_6, C_8$ are tied for ninth place.] (iii) three candidates tie for first place (and other ties are permitted)?
      b) How many of the outcomes in section (iii) of part (a) have $C_3$ as one of the first-place candidates?
      c) How many outcomes have $C_3$ in first place (alone, or tied with others)?

17. For $m, n, r \in \mathbf{Z}^+$ with $m \ge rn$, let $S_r(m, n)$ denote the number of ways to distribute m distinct objects among n identical containers where each container receives at least r of the objects. Verify that

      $S_r(m + 1, n) = n S_r(m, n) + \binom{m}{r - 1} S_r(m + 1 - r, n - 1).$

18. We use s(m, n) to denote the number of ways to seat m people at n circular tables with at least one person at each table. The arrangements at any one table are not distinguished if one can be rotated into another (as in Example 1.16). The ordering of the tables is not taken into account. For instance, the arrangements in parts (a), (b), (c) of Fig. 5.6 are considered the same; those in parts (a), (d), (e) are distinct (in pairs).
    The numbers s(m, n) are referred to as the Stirling numbers of the first kind.
    a) If n > m, what is s(m, n)?
    b) For $m \ge 1$, what are s(m, m) and s(m, 1)?
    c) Determine s(m, m - 1) for $m \ge 2$.
    d) Show that for $m \ge 3$,

      $s(m, m - 2) = \frac{1}{24}\, m(m - 1)(m - 2)(3m - 1).$

19. As in the previous exercise, s(m, n) denotes a Stirling number of the first kind.
    a) For m > n > 1 prove that

      $s(m, n) = (m - 1)\, s(m - 1, n) + s(m - 1, n - 1).$

    b) Verify that for $m \ge 2$,

      $s(m, 2) = (m - 1)! \sum_{i=1}^{m-1} \frac{1}{i}.$

Figure 5.6 [seating arrangements, parts (a) through (e)]
5.4 Special Functions
In Section 2 of Chapter 3 we mentioned that addition is a closed binary operation on the set $\mathbf{Z}^+$, whereas $\cap$ is a closed binary operation on $\mathcal{P}(\mathcal{U})$ for any given universe $\mathcal{U}$. We also noted in that section that "taking the minus" of an integer is a unary operation on $\mathbf{Z}$. Now it is time to make these notions of (closed) binary and unary operations more precise in terms of functions.

Definition 5.10   For any nonempty sets A, B, any function $f: A \times A \to B$ is called a binary operation on A. If $B \subseteq A$, then the binary operation is said to be closed (on A). (When $B \subseteq A$ we may also say that A is closed under f.)

Definition 5.11   A function $g: A \to A$ is called a unary, or monary, operation on A.

EXAMPLE 5.29
    a) The function $f: \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$, defined by $f(a, b) = a - b$, is a closed binary operation on $\mathbf{Z}$.
    b) If $g: \mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}$ is the function where $g(a, b) = a - b$, then g is a binary operation on $\mathbf{Z}^+$, but it is not closed. For example, we find that $3, 7 \in \mathbf{Z}^+$, but $g(3, 7) = 3 - 7 = -4 \notin \mathbf{Z}^+$.
    c) The function $h: \mathbf{R}^+ \to \mathbf{R}^+$ defined by $h(a) = 1/a$ is a unary operation on $\mathbf{R}^+$.
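
The closure test in parts (a) and (b) can be tried out on a finite sample. The following is a minimal Python sketch, not part of the text: the helper name `closure_counterexample` and the sample ranges are assumptions chosen for illustration, and a search that finds nothing is only evidence of closure, not a proof.

```python
from itertools import product

def closure_counterexample(op, sample, in_set):
    """Return the first pair (a, b) from `sample` with op(a, b) outside the set,
    or None if the sample contains no such pair (evidence only, not a proof)."""
    for a, b in product(sample, repeat=2):
        if not in_set(op(a, b)):
            return (a, b)
    return None

def subtract(a, b):
    return a - b

def is_integer(n):
    return isinstance(n, int)

def is_positive_integer(n):
    return isinstance(n, int) and n > 0

# Part (a): subtraction applied to a sample of Z never leaves Z.
print(closure_counterexample(subtract, range(-5, 6), is_integer))
# Part (b): subtraction applied to a sample of Z+ leaves Z+ at once.
print(closure_counterexample(subtract, range(1, 11), is_positive_integer))
# The pair used in the text: 3 - 7 is not a positive integer.
print(is_positive_integer(subtract(3, 7)))
```

The second call stops at the first offending pair it meets, which need not be the pair (3, 7) quoted in the example.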
268          Chapter 5 Relations and Functions

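EXAMPLE 5.30
Let $\mathcal{U}$ be a universe, and let $A, B \subseteq \mathcal{U}$. (a) If $f: \mathcal{P}(\mathcal{U}) \times \mathcal{P}(\mathcal{U}) \to \mathcal{P}(\mathcal{U})$ is defined by $f(A, B) = A \cup B$, then f is a closed binary operation on $\mathcal{P}(\mathcal{U})$. (b) The function $g: \mathcal{P}(\mathcal{U}) \to \mathcal{P}(\mathcal{U})$ defined by $g(A) = \overline{A}$ is a unary operation on $\mathcal{P}(\mathcal{U})$.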

Definition 5.12   Let $f: A \times A \to B$; that is, f is a binary operation on A.
    a) f is said to be commutative if $f(a, b) = f(b, a)$ for all $(a, b) \in A \times A$.
    b) When $B \subseteq A$ (that is, when f is closed), f is said to be associative if for all $a, b, c \in A$, $f(f(a, b), c) = f(a, f(b, c))$.

EXAMPLE 5.31
The binary operation of Example 5.30 is commutative and associative, whereas the binary operation in part (a) of Example 5.29 is neither.

EXAMPLE 5.32
    a) Define the closed binary operation $f: \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$ by $f(a, b) = a + b - 3ab$. Since both the addition and the multiplication of integers are commutative binary operations, it follows that

          $f(a, b) = a + b - 3ab = b + a - 3ba = f(b, a),$

       so f is commutative.
       To determine whether f is associative, consider $a, b, c \in \mathbf{Z}$. Then $f(a, b) = a + b - 3ab$ and

          $f(f(a, b), c) = f(a, b) + c - 3f(a, b)c$
          $= (a + b - 3ab) + c - 3(a + b - 3ab)c$
          $= a + b + c - 3ab - 3ac - 3bc + 9abc,$

       whereas $f(b, c) = b + c - 3bc$ and

          $f(a, f(b, c)) = a + f(b, c) - 3af(b, c)$
          $= a + (b + c - 3bc) - 3a(b + c - 3bc)$
          $= a + b + c - 3ab - 3ac - 3bc + 9abc.$

       Since $f(f(a, b), c) = f(a, f(b, c))$ for all $a, b, c \in \mathbf{Z}$, the closed binary operation f is associative as well as commutative.
    b) Consider the closed binary operation $h: \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$, where $h(a, b) = a|b|$. Then $h(3, -2) = 3|-2| = 3(2) = 6$, but $h(-2, 3) = -2|3| = -6$. Consequently, h is not commutative. However, with regard to the associative property, if $a, b, c \in \mathbf{Z}$, we find that

          $h(h(a, b), c) = h(a, b)|c| = a|b||c|$     and
          $h(a, h(b, c)) = a|h(b, c)| = a\,|b|c|| = a|b||c|,$

       so the closed binary operation h is associative.
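
The algebra in both parts can be spot-checked numerically. The sketch below is an illustration only: the helper names `is_commutative` and `is_associative` and the sample range are assumptions, and agreement on a finite sample supports, but does not replace, the derivations above.

```python
from itertools import product

def is_commutative(op, sample):
    """True if op(a, b) == op(b, a) for every pair drawn from the sample."""
    return all(op(a, b) == op(b, a) for a, b in product(sample, repeat=2))

def is_associative(op, sample):
    """True if op(op(a, b), c) == op(a, op(b, c)) for every triple in the sample."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(sample, repeat=3))

def f(a, b):
    return a + b - 3 * a * b      # part (a)

def h(a, b):
    return a * abs(b)             # part (b)

sample = range(-6, 7)             # finite stand-in for Z
print(is_commutative(f, sample), is_associative(f, sample))
print(is_commutative(h, sample), is_associative(h, sample))
print(h(3, -2), h(-2, 3))         # the witness pair from the text
```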

EXAMPLE 5.33
If $A = \{a, b, c, d\}$, then $|A \times A| = 16$. Consequently, there are $4^{16}$ functions $f: A \times A \to A$; that is, there are $4^{16}$ closed binary operations on A.
    To determine the number of commutative closed binary operations g on A, we realize that there are four choices for each of the assignments g(a, a), g(b, b), g(c, c), and g(d, d). We are then left with the $4^2 - 4 = 16 - 4 = 12$ other ordered pairs (in $A \times A$) of the form (x, y), $x \ne y$. These 12 ordered pairs must be considered in sets of two in order to ensure commutativity. For example, we need g(a, b) = g(b, a) and may select any one of the four elements of A for g(a, b). But then this choice must also be assigned to g(b, a). Therefore, since there are four choices for each of these 12/2 = 6 sets of two ordered pairs, we find that the number of commutative closed binary operations g on A is $4^4 \cdot 4^6 = 4^{10}$.
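
The same counting argument can be confirmed by exhaustive enumeration on a smaller set. This is a sketch under stated assumptions: the function names are hypothetical, and a three-element set is used because listing all $4^{16}$ operations on a four-element set is impractical.

```python
from itertools import product

def count_operations(elements):
    """Enumerate every closed binary operation on `elements`;
    return (total, number that are commutative)."""
    pairs = list(product(elements, repeat=2))
    total = commutative = 0
    for values in product(elements, repeat=len(pairs)):
        table = dict(zip(pairs, values))
        total += 1
        if all(table[a, b] == table[b, a] for a, b in pairs):
            commutative += 1
    return total, commutative

def commutative_formula(n):
    # n choices for each of the n diagonal pairs, and n choices for each of
    # the (n^2 - n)/2 unordered pairs {(x, y), (y, x)} with x != y.
    return n ** n * n ** ((n * n - n) // 2)

print(count_operations("abc"))
print(commutative_formula(3))
print(commutative_formula(4) == 4 ** 10)
```

For n = 3 the enumeration visits $3^9 = 19{,}683$ tables, so it finishes quickly; the final line checks that the formula reproduces the value $4^{10}$ obtained in the example.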

Definition 5.13   Let $f: A \times A \to B$ be a binary operation on A. An element $x \in A$ is called an identity (or identity element) for f if $f(a, x) = f(x, a) = a$, for all $a \in A$.

EXAMPLE 5.34
    a) Consider the (closed) binary operation $f: \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$, where $f(a, b) = a + b$. Here the integer 0 is an identity since $f(a, 0) = a + 0 = 0 + a = f(0, a) = a$, for each integer a.
    b) We find that there is no identity for the function in part (a) of Example 5.29. For if f had an identity x, then for any $a \in \mathbf{Z}$, $f(a, x) = a \Rightarrow a - x = a \Rightarrow x = 0$. But then $f(x, a) = f(0, a) = 0 - a \ne a$, unless $a = 0$.
    c) Let $A = \{1, 2, 3, 4, 5, 6, 7\}$, and let $g: A \times A \to A$ be the (closed) binary operation defined by $g(a, b) = \min\{a, b\}$, that is, the minimum (or smaller) of a, b. This binary operation is commutative and associative, and for any $a \in A$ we have $g(a, 7) = \min\{a, 7\} = a = \min\{7, a\} = g(7, a)$. So 7 is an identity element for g.
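
A direct search makes the three cases concrete. The helper `identities` below is a hypothetical name, and for parts (a) and (b) the infinite set $\mathbf{Z}$ is replaced by a small sample, so those two results illustrate rather than prove the claims.

```python
def identities(op, elements):
    """Return every x in `elements` with op(a, x) == op(x, a) == a for all a."""
    elements = list(elements)
    return [x for x in elements
            if all(op(a, x) == a and op(x, a) == a for a in elements)]

sample_of_Z = range(-5, 6)

print(identities(lambda a, b: a + b, sample_of_Z))   # part (a): addition
print(identities(lambda a, b: a - b, sample_of_Z))   # part (b): subtraction
print(identities(min, range(1, 8)))                  # part (c): min on {1, ..., 7}
```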

In parts (a) and (c) of Example 5.34 we examined two (closed) binary operations, each
                     of which has an identity. Part (b) of that example showed that such an operation need not
                     have an identity element. Could a binary operation have more than one identity? We find
                     that the answer is no when we consider the following theorem.

THEOREM 5.4   Let $f: A \times A \to B$ be a binary operation. If f has an identity, then that identity is unique.
Proof: If f has more than one identity, let $x_1, x_2 \in A$ with

      $f(a, x_1) = a = f(x_1, a)$,   for all $a \in A$,   and
      $f(a, x_2) = a = f(x_2, a)$,   for all $a \in A$.

Consider $x_1$ as an element of A and $x_2$ as an identity. Then $f(x_1, x_2) = x_1$. Now reverse the roles of $x_1$ and $x_2$; that is, consider $x_2$ as an element of A and $x_1$ as an identity. We find that $f(x_1, x_2) = x_2$. Consequently, $x_1 = x_2$, and f has at most one identity.

Now that we have settled the issue of the uniqueness of the identity element, let us see
                     how this type of element enters into one more enumeration problem.

EXAMPLE 5.35
If $A = \{x, a, b, c, d\}$, how many closed binary operations on A have x as the identity?
    Let $f: A \times A \to A$ with $f(x, y) = y = f(y, x)$ for all $y \in A$. Then we may represent f by a table as in Table 5.2. Here the nine values, where x is the first component, as in (x, c), or the second component, as in (d, x), are determined by the fact that x is the identity element. Each of the 16 remaining (vacant) entries in Table 5.2 can be filled with any one of the five elements in A.

Table 5.2

      f |  x   a   b   c   d
     ---+--------------------
      x |  x   a   b   c   d
      a |  a   -   -   -   -
      b |  b   -   -   -   -
      c |  c   -   -   -   -
      d |  d   -   -   -   -

Hence there are $5^{16}$ closed binary operations on A where x is the identity. Of these, $5^{10} = 5^4 \cdot 5^{(4^2 - 4)/2}$ are commutative. We also realize that there are $5^{16}$ closed binary operations on A where b is the identity. So there are $5^{17} = \binom{5}{1} 5^{16} = \binom{5}{1} 5^{5^2 - [2(5) - 1]} = \binom{5}{1} 5^{(5-1)^2}$ closed binary operations on A that have an identity, and of these $5^{11} = \binom{5}{1} 5^{10} = \binom{5}{1} 5^4 \cdot 5^{(4^2 - 4)/2}$ are commutative.
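
These counts can be checked by brute force on a smaller set. The sketch below makes several assumptions: the names `count_with_identity` and `formula` are hypothetical, and a three-element set stands in for the five-element set A, since $5^{25}$ tables cannot be enumerated. The function `formula(n)` restates the argument of the example for a set of size n.

```python
from itertools import product

def count_with_identity(elements):
    """Brute-force count of closed binary operations on `elements` that have an
    identity; returns (with_identity, commutative_with_identity)."""
    pairs = list(product(elements, repeat=2))
    with_identity = commutative = 0
    for values in product(elements, repeat=len(pairs)):
        t = dict(zip(pairs, values))
        if any(all(t[a, x] == a == t[x, a] for a in elements) for x in elements):
            with_identity += 1
            if all(t[a, b] == t[b, a] for a, b in pairs):
                commutative += 1
    return with_identity, commutative

def formula(n):
    free = n - 1   # elements other than the chosen identity
    with_identity = n * n ** (free * free)
    commutative = n * n ** free * n ** ((free * free - free) // 2)
    return with_identity, commutative

print(count_with_identity("xab"))
print(formula(3))
print(formula(5) == (5 ** 17, 5 ** 11))
```

Because an identity is unique when it exists, the operations counted for different identity elements never overlap, which is why the factor n can simply multiply the count for a fixed identity.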

Having seen several examples of functions (in Examples 5.16(b), 5.29, 5.30, 5.32, 5.33,
                              5.34, and 5.35) where the domain is a cross product of sets, we now investigate functions
                              where the domain is a subset of a cross product.

Definition 5.14   For sets A and B, if $D \subseteq A \times B$, then $\pi_A: D \to A$, defined by $\pi_A(a, b) = a$, is called the projection on the first coordinate. The function $\pi_B: D \to B$, defined by $\pi_B(a, b) = b$, is called the projection on the second coordinate.

We note that if $D = A \times B$ then $\pi_A$ and $\pi_B$ are both onto.

EXAMPLE 5.36
If $A = \{w, x, y\}$ and $B = \{1, 2, 3, 4\}$, let $D = \{(x, 1), (x, 2), (x, 3), (y, 1), (y, 4)\}$. Then the projection $\pi_A: D \to A$ satisfies $\pi_A(x, 1) = \pi_A(x, 2) = \pi_A(x, 3) = x$, and $\pi_A(y, 1) = \pi_A(y, 4) = y$. Since $\pi_A(D) = \{x, y\} \subset A$, this function is not onto.
    For $\pi_B: D \to B$ we find that $\pi_B(x, 1) = \pi_B(y, 1) = 1$, $\pi_B(x, 2) = 2$, $\pi_B(x, 3) = 3$, and $\pi_B(y, 4) = 4$, so $\pi_B(D) = B$ and this projection is an onto function.
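
The two images in this example can be computed directly. In this sketch the function name `projection` and the use of strings for w, x, y are choices made for illustration.

```python
def projection(D, index):
    """Image of the set of tuples D under projection on coordinate `index`."""
    return {pair[index] for pair in D}

A = {"w", "x", "y"}
B = {1, 2, 3, 4}
D = {("x", 1), ("x", 2), ("x", 3), ("y", 1), ("y", 4)}

image_A = projection(D, 0)
image_B = projection(D, 1)
print(sorted(image_A), image_A == A)   # pi_A misses w, so it is not onto
print(sorted(image_B), image_B == B)   # pi_B reaches every element of B
```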

EXAMPLE 5.37
Let $A = B = \mathbf{R}$ and consider the set $D \subseteq A \times B$ where $D = \{(x, y) \mid y = x^2\}$. Then D represents the subset of the Euclidean plane that contains the points on the parabola $y = x^2$.
    Among the infinite number of points in D we find the point (3, 9). Here $\pi_A(3, 9) = 3$, the x-coordinate of (3, 9), whereas $\pi_B(3, 9) = 9$, the y-coordinate of the point.
    For this example, $\pi_A(D) = \mathbf{R} = A$, so $\pi_A$ is onto. (The projection $\pi_A$ is also one-to-one.) However, $\pi_B(D) = [0, +\infty) \subset \mathbf{R}$, so $\pi_B$ is not onto. [Nor is it one-to-one; for example, $\pi_B(2, 4) = 4 = \pi_B(-2, 4)$.]

We now extend the notion of projection as follows. Let $A_1, A_2, \ldots, A_n$ be sets, and $\{i_1, i_2, \ldots, i_m\} \subseteq \{1, 2, \ldots, n\}$ with $i_1 < i_2 < \cdots < i_m$ and $m \le n$. If $D \subseteq A_1 \times A_2 \times \cdots \times A_n = \times_{i=1}^{n} A_i$, then the function $\pi: D \to A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m}$ defined by $\pi(a_1, a_2, \ldots, a_n) = (a_{i_1}, a_{i_2}, \ldots, a_{i_m})$ is the projection of D on the $i_1$th, $i_2$th, ..., $i_m$th coordinates. The elements of D are called (ordered) n-tuples; an element in $\pi(D)$ is an (ordered) m-tuple.

These projections arise in a natural way in the study of relational data bases, a standard
               technique for organizing and describing large quantities of data by modern large-scale
               computing systems. In situations like credit card transactions, not only must existing data
               be organized but new data must be inserted, as when credit cards are processed for new
               cardholders. When bills on existing accounts are paid, or when new purchases are made on
               these accounts, data must be updated. Another example arises when records are searched
               for special considerations, as when a college admissions office searches educational records
               seeking, for its mailing lists, high school students who have demonstrated certain levels of
               mathematical achievement.
                   The following example demonstrates the use of projections in a method for organizing
               and describing data on a somewhat smaller scale.

EXAMPLE 5.38
At a certain university the following sets are related for purposes of registration:
      $A_1$ = the set of course numbers for courses offered in mathematics.
      $A_2$ = the set of course titles offered in mathematics.
      $A_3$ = the set of mathematics faculty.
      $A_4$ = the set of letters of the alphabet.

Consider the table, or relation,* $D \subseteq A_1 \times A_2 \times A_3 \times A_4$ given in Table 5.3.

Table 5.3

      Course Number     Course Title      Professor        Section Letter

      MA 111            Calculus I        P. Z. Chinn      A
      MA 111            Calculus I        V. Larney        B
      MA 112            Calculus II       J. Kinney        A
      MA 112            Calculus II       A. Schmidt       B
      MA 112            Calculus II       R. Mines         C
      MA 113            Calculus III      J. Kinney        A

The sets $A_1, A_2, A_3, A_4$ are called the domains of the relational data base, and table D is said to have degree 4. Each element of D is often called a list.
    The projection of D on $A_1 \times A_3 \times A_4$ is shown in Table 5.4. Table 5.5 shows the results for the projection of D on $A_1 \times A_2$.

Table 5.4

      Course Number     Professor        Section Letter

      MA 111            P. Z. Chinn      A
      MA 111            V. Larney        B
      MA 112            J. Kinney        A
      MA 112            A. Schmidt       B
      MA 112            R. Mines         C
      MA 113            J. Kinney        A

Table 5.5

      Course Number     Course Title

      MA 111            Calculus I
      MA 112            Calculus II
      MA 113            Calculus III

*Here the relation D is not binary. In fact, D is a quaternary relation.

Tables 5.4 and 5.5 are another way of representing the same                data that appear in
                                   Table 5.3. Given Tables 5.4 and 5.5, one can recapture Table 5.3.
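
The claim that Table 5.3 can be recaptured from its two projections can be illustrated in a few lines. This is a sketch, not the book's method: the helper `project` and the dictionary-based recombination (a natural join on the course number, in data base terms) are assumptions, and the title for MA 113 is entered as Calculus III, as listed in Table 5.5.

```python
TABLE_5_3 = [
    ("MA 111", "Calculus I",   "P. Z. Chinn", "A"),
    ("MA 111", "Calculus I",   "V. Larney",   "B"),
    ("MA 112", "Calculus II",  "J. Kinney",   "A"),
    ("MA 112", "Calculus II",  "A. Schmidt",  "B"),
    ("MA 112", "Calculus II",  "R. Mines",    "C"),
    ("MA 113", "Calculus III", "J. Kinney",   "A"),
]

def project(rows, columns):
    """Project each row onto the given column positions, dropping duplicate
    images while keeping first-seen order."""
    seen = []
    for row in rows:
        image = tuple(row[i] for i in columns)
        if image not in seen:
            seen.append(image)
    return seen

table_5_4 = project(TABLE_5_3, (0, 2, 3))   # course number, professor, section
table_5_5 = project(TABLE_5_3, (0, 1))      # course number, course title

# Recombine the two projections on the shared course number.
titles = dict(table_5_5)
rebuilt = [(num, titles[num], prof, sec) for num, prof, sec in table_5_4]
print(len(table_5_4), len(table_5_5))
print(rebuilt == TABLE_5_3)
```

The reconstruction works here because each course number determines exactly one course title; if that were not so, the two projections would not pin down the original table.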

The theory of relational data bases is concerned with representing data in different ways and with the operations, such as projections, needed for such representations. The computer implementation of such techniques is also considered. More on this topic is mentioned in the exercises and chapter references.

EXERCISES 5.4

1. For $A = \{a, b, c\}$, let $f: A \times A \to A$ be the closed binary operation given in Table 5.6. Give an example to show that f is not associative.

      Table 5.6

      f |  a   b   c
     ---+------------
      a |  b   a   c
      b |  a   c   b
      c |  c   b   a

2. Let $f: \mathbf{R} \times \mathbf{R} \to \mathbf{Z}$ be the closed binary operation defined by $f(a, b) = \lceil a + b \rceil$. (a) Is f commutative? (b) Is f associative? (c) Does f have an identity element?

3. Each of the following functions $f: \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$ is a closed binary operation on $\mathbf{Z}$. Determine in each case whether f is commutative and/or associative.
      a) $f(x, y) = x + y - xy$
      b) $f(x, y) = \max\{x, y\}$, the maximum (or larger) of x, y
      c) $f(x, y) = x^y$
      d) $f(x, y) = x + y - 3$

4. Which of the closed binary operations in Exercise 3 have an identity?

5. Let $|A| = 5$. (a) What is $|A \times A|$? (b) How many functions $f: A \times A \to A$ are there? (c) How many closed binary operations are there on A? (d) How many of these closed binary operations are commutative?

6. Let $A = \{x, a, b, c, d\}$.
      a) How many closed binary operations f on A satisfy f(a, b) = c?
      b) How many of the functions f in part (a) have x as an identity?
      c) How many of the functions f in part (a) have an identity?
      d) How many of the functions f in part (c) are commutative?

7. Let $f: \mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}^+$ be the closed binary operation defined by $f(a, b) = \gcd(a, b)$. (a) Is f commutative? (b) Is f associative? (c) Does f have an identity element?

8. Let $A = \{2, 4, 8, 16, 32\}$, and consider the closed binary operation $f: A \times A \to A$ where $f(a, b) = \gcd(a, b)$. Does f have an identity element?

9. For distinct primes p, q let $A = \{p^m q^n \mid 0 \le m \le 31,\ 0 \le n \le 37\}$. (a) What is $|A|$? (b) If $f: A \times A \to A$ is the closed binary operation defined by $f(a, b) = \gcd(a, b)$, does f have an identity element?

10. State a result that generalizes the ideas presented in the previous two exercises.

11. For $\emptyset \ne A \subseteq \mathbf{Z}^+$, let $f, g: A \times A \to A$ be the closed binary operations defined by $f(a, b) = \min\{a, b\}$ and $g(a, b) = \max\{a, b\}$. Does f have an identity element? Does g?

12. Let $A = B = \mathbf{R}$. Determine $\pi_A(D)$ and $\pi_B(D)$ for each of the following sets $D \subseteq A \times B$.
      a) $D = \{(x, y) \mid x = y^2\}$
      b) $D = \{(x, y) \mid y = \sin x\}$
      c) $D = \{(x, y) \mid x^2 + y^2 = 1\}$

13. Let $A_i$, $1 \le i \le 5$, be the domains for a table $D \subseteq A_1 \times A_2 \times A_3 \times A_4 \times A_5$, where $A_1 = \{U, V, W, X, Y, Z\}$ (used as code names for different cereals in a test), and $A_2 = A_3 = A_4 = A_5 = \mathbf{Z}^+$. The table D is given as Table 5.7.
      a) What is the degree of the table?
      b) Find the projection of D on $A_3 \times A_4 \times A_5$.
      c) A domain of a table is called a primary key for the table if its value uniquely identifies each list of D. Determine the primary key(s) for this table.

14. Let $A_i$, $1 \le i \le 5$, be the domains for a table $D \subseteq A_1 \times A_2 \times A_3 \times A_4 \times A_5$, where $A_1 = \{1, 2\}$ (used to identify the daily vitamin capsule produced by two pharmaceutical companies), $A_2 = \{A, D, E\}$, and $A_3 = A_4 = A_5 = \mathbf{Z}^+$. The table D is given as Table 5.8.
      a) What is the degree of the table?
      b) What is the projection of D on $A_1 \times A_2$? on $A_3 \times A_4 \times A_5$?
      c) This table has no primary key. (See Exercise 13.) We can, however, define a composite primary key as the cross product of a minimal number of domains of the table, whose components, taken collectively, uniquely identify each list of D. Determine some composite primary keys for this table.
                                                                              5.5 The Pigeonhole Principle           273

Table 5.7

      Code Name     Grams of Sugar      % of RDA* of Vitamin A     % of RDA of Vitamin C     % of RDA of Protein
      of Cereal     per 1-oz Serving    per 1-oz Serving           per 1-oz Serving          per 1-oz Serving

      U              1                  25                         25                         6
      V              7                  25                          2                         4
      W             12                  25                          2                         4
      X              0                  60                         40                        20
      Y              3                  25                         40                        10
      Z              2                  25                         40                        10

      *RDA = recommended daily allowance

Table 5.8

      Vitamin     Vitamin Present     Amount of Vitamin       Dosage:           No. of Capsules
      Capsule     in Capsule          in Capsule in IU*       Capsules/Day      per Bottle

      1           A                   10,000                  1                 100
      1           D                      400                  1                 100
      1           E                       30                  1                 100
      2           A                    4,000                  1                 250
      2           D                      400                  1                 250
      2           E                       15                  1                 250

      *IU = international units

5.5 The Pigeonhole Principle
            A change of pace is in order as we introduce an interesting distribution principle. This
            principle may seem to have nothing in common with what we have been doing so far, but
            it will prove to be helpful nonetheless.
                In mathematics one sometimes finds that an almost obvious idea, when applied in a
            rather subtle manner, is the key needed to solve a troublesome problem. On the list of such
            obvious ideas many would undoubtedly place the following rule, known as the pigeonhole
            principle.

The Pigeonhole Principle: If m pigeons occupy n pigeonholes and m > n, then at
                 least one pigeonhole has two or more pigeons roosting in it.

One situation for 6 (= m) pigeons and 4 (= n) pigeonholes (actually birdhouses) is shown
            in Fig. 5.7. The general result readily follows by the method of proof by contradiction. If
            the result is not true, then each pigeonhole has at most one pigeon roosting in it—for a
            total of at most n (< m) pigeons. (Somewhere we have lost at least m — n pigeons!)
                But now what can pigeons roosting in pigeonholes have to do with mathematics—
            discrete, combinatorial, or otherwise? Actually, this principle can be applied in various
            problems in which we seek to establish whether a certain situation can actually occur. We

Figure 5.7

illustrate this principle in the following examples and shall find it useful in Section 5.6 and
                             at other points in the text.

EXAMPLE 5.39
An office employs 13 file clerks, so at least two of them must have birthdays during the same month. Here we have 13 pigeons (the file clerks) and 12 pigeonholes (the months of the year).

Here is a second rather immediate application of our principle.

EXAMPLE 5.40
Larry returns from the laundromat with 12 pairs of socks (each pair a different color) in a laundry bag. Drawing the socks from the bag randomly, he'll have to draw at most 13 of them to get a matched pair.

From this point on, application of the pigeonhole principle may be more subtle.

EXAMPLE 5.41
Wilma operates a computer with a magnetic tape drive. One day she is given a tape that
contains 500,000 “words” of four or fewer lowercase letters. (Consecutive words on the
tape are separated by a blank character.) Can it be that the 500,000 words are all distinct?
    From the rules of sum and product, the total number of different possible words, using
four or fewer letters, is

$$26^4 + 26^3 + 26^2 + 26 = 475{,}254.$$

With these 475,254 words as the pigeonholes, and the 500,000 words on the tape as the
pigeons, it follows that at least one word is repeated on the tape.
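
The count in Example 5.41 can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the names `ALPHABET_SIZE`, `MAX_LENGTH`, `word_count`, and `tape_words` are illustrative choices.

```python
# Count the strings of one to four lowercase letters
# (rule of product for each length, rule of sum across lengths).
ALPHABET_SIZE = 26
MAX_LENGTH = 4

word_count = sum(ALPHABET_SIZE ** k for k in range(1, MAX_LENGTH + 1))
tape_words = 500_000

# More pigeons (words on the tape) than pigeonholes (possible words)
# means that some word must be repeated.
repetition_forced = tape_words > word_count
print(word_count, repetition_forced)
```

The comparison `tape_words > word_count` is exactly the hypothesis m > n of the pigeonhole principle.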

EXAMPLE 5.42
Let $S \subseteq \mathbf{Z}^+$, where $|S| = 37$. Then $S$ contains two elements that have the same remainder
upon division by 36.
    Here the pigeons are the 37 positive integers in $S$. We know from the division algorithm
(of Theorem 4.5) that when any positive integer $n$ is divided by 36, there exists a unique
quotient $q$ and unique remainder $r$, where

$$n = 36q + r, \qquad 0 \le r < 36.$$

The 36 possible values of $r$ constitute the pigeonholes, and the result is now established by
the pigeonhole principle.
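
The argument of Example 5.42 translates directly into a search: place each integer in the pigeonhole labeled by its remainder and stop at the first pigeonhole that is already occupied. The sketch below is only an illustration; the function name `same_remainder_pair` and the particular 37 integers in `sample` are assumptions made for the demonstration.

```python
from typing import Iterable, Optional, Tuple


def same_remainder_pair(numbers: Iterable[int], modulus: int) -> Optional[Tuple[int, int]]:
    """Return two of the given integers with equal remainders mod `modulus`, or None."""
    occupant = {}  # remainder (pigeonhole) -> first integer placed in it
    for n in numbers:
        r = n % modulus
        if r in occupant:
            return occupant[r], n
        occupant[r] = n
    return None


# Any 37 distinct positive integers will do; this set is an arbitrary choice.
sample = [7 * k + 1 for k in range(37)]
pair = same_remainder_pair(sample, 36)
print(pair)
```

With 36 or fewer integers the function may return `None`; with 37 distinct integers it cannot.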

EXAMPLE 5.43
Prove that if 101 integers are selected from the set $S = \{1, 2, 3, \ldots, 200\}$, then there are
two integers such that one divides the other.
    For each $x \in S$, we may write $x = 2^k y$, with $k \ge 0$ and $\gcd(2, y) = 1$. (This result
follows from the Fundamental Theorem of Arithmetic.) Then $y$ must be odd, so $y \in
T = \{1, 3, 5, \ldots, 199\}$, where $|T| = 100$. Since 101 integers are selected from $S$, by the
pigeonhole principle there are two distinct integers of the form $a = 2^m y$, $b = 2^n y$ for
some (the same) $y \in T$. If $m < n$, then $a \mid b$; otherwise, we have $m > n$ and then $b \mid a$.
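
The proof of Example 5.43 is constructive: the odd part $y$ of each selected integer serves as its pigeonhole. A minimal Python sketch under that reading follows; the helper names `odd_part` and `dividing_pair`, and the seeded random selection, are assumptions introduced for illustration.

```python
import random


def odd_part(x: int) -> int:
    """Return y where x = 2**k * y with y odd."""
    while x % 2 == 0:
        x //= 2
    return x


def dividing_pair(selection):
    """Return (a, b) from the selection with a < b and a | b, found via odd parts, or None."""
    occupant = {}  # odd part (pigeonhole) -> first selected integer with that odd part
    for x in selection:
        y = odd_part(x)
        if y in occupant:
            a, b = sorted((occupant[y], x))
            return a, b  # same odd part, so the smaller divides the larger
        occupant[y] = x
    return None


random.seed(0)  # fixed seed so that the run is repeatable
chosen = random.sample(range(1, 201), 101)
a, b = dividing_pair(chosen)
print(a, b, b % a == 0)
```

The bound 101 cannot be lowered: the 100 integers 101, 102, ..., 200 have distinct odd parts, and none of them divides another.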

EXAMPLE 5.44
Any subset of size 6 from the set $S = \{1, 2, 3, \ldots, 9\}$ must contain two elements whose
sum is 10.
    Here the pigeons constitute a six-element subset of $\{1, 2, 3, \ldots, 9\}$, and the pigeonholes
are the subsets $\{1, 9\}$, $\{2, 8\}$, $\{3, 7\}$, $\{4, 6\}$, $\{5\}$. When the six pigeons go to their
respective pigeonholes, they must fill at least one of the two-element subsets whose members
sum to 10.
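
Because there are only 84 six-element subsets of $\{1, 2, \ldots, 9\}$, the claim of Example 5.44 can also be confirmed by exhaustive search. The sketch below is an independent check, not the pigeonhole argument itself; the function name `has_pair_summing_to` is an illustrative assumption.

```python
from itertools import combinations


def has_pair_summing_to(subset, target):
    """True if two different members of `subset` add up to `target`."""
    members = set(subset)
    return any(target - x in members and target - x != x for x in members)


universe = range(1, 10)
six_always_works = all(has_pair_summing_to(c, 10) for c in combinations(universe, 6))
five_can_fail = any(not has_pair_summing_to(c, 10) for c in combinations(universe, 5))
print(six_always_works, five_can_fail)
```

The second check shows that size 6 is needed: a five-element subset such as {1, 2, 3, 4, 5} takes one element from each pigeonhole and contains no pair with sum 10.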

EXAMPLE 5.45
Triangle ACE is equilateral with AC = 1. If five points are selected from the interior of
the triangle, there are at least two whose distance apart is less than 1/2.
    For the triangle in Fig. 5.8, the four smaller triangles are congruent equilateral triangles
and AB = 1/2. We break up the interior of triangle ACE into the following four regions,
which are mutually disjoint in pairs:

    $R_1$: the interior of triangle BCD together with the points on the segment BD, excluding
           B and D.
    $R_2$: the interior of triangle ABF.
    $R_3$: the interior of triangle BDF together with the points on the segments BF
           and DF, excluding B, D, and F.
    $R_4$: the interior of triangle FDE.

Figure 5.8

Now we apply the pigeonhole principle. Five points in the interior of triangle ACE must
be such that at least two of them are in one of the four regions $R_i$, $1 \le i \le 4$, where any two
points are separated by a distance less than 1/2.

EXAMPLE 5.46
Let $S$ be a set of six positive integers whose maximum is at most 14. Show that the sums
of the elements in all the nonempty subsets of $S$ cannot all be distinct.
    For each nonempty subset $A$ of $S$, the sum of the elements in $A$, denoted $s_A$, satisfies
$1 \le s_A \le 9 + 10 + \cdots + 14 = 69$, and there are $2^6 - 1 = 63$ nonempty subsets of $S$. We
should like to draw the conclusion from the pigeonhole principle by letting the possible
sums, from 1 to 69, be the pigeonholes, with the 63 nonempty subsets of $S$ as the pigeons,
but then we have too few pigeons.
    So instead of considering all nonempty subsets of $S$, we cut back to those nonempty
subsets $A$ of $S$ where $|A| \le 5$. Then for each such subset $A$ it follows that $1 \le s_A \le 10 +
11 + \cdots + 14 = 60$. There are 62 nonempty subsets $A$ of $S$ with $|A| \le 5$, namely, all the
subsets of $S$ except for $\emptyset$ and the set $S$ itself. With 62 pigeons (the nonempty subsets $A$ of
$S$ where $|A| \le 5$) and 60 pigeonholes (the possible sums $s_A$), it follows by the pigeonhole
principle that the elements of at least two of these 62 subsets must yield the same sum.
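
The restricted count in Example 5.46 suggests a direct search: record the sum of each nonempty subset with at most five elements and stop at the first repeated sum. The following Python sketch is an illustration only; the function name `equal_sum_subsets` and the sample set `example` are hypothetical.

```python
from itertools import combinations


def equal_sum_subsets(s):
    """Return two different nonempty proper subsets of s with equal sums, or None."""
    first_with_sum = {}  # sum (pigeonhole) -> first subset found with that sum
    elements = sorted(s)
    for size in range(1, len(elements)):  # sizes 1, ..., |S| - 1, as in the text
        for subset in combinations(elements, size):
            total = sum(subset)
            if total in first_with_sum:
                return first_with_sum[total], subset
            first_with_sum[total] = subset
    return None


example = {3, 6, 10, 12, 13, 14}  # six positive integers with maximum at most 14
print(equal_sum_subsets(example))
```

Running the function over every six-element subset of {1, 2, ..., 14} never returns `None`, in agreement with the example.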

EXAMPLE 5.47
Let $m \in \mathbf{Z}^+$ with $m$ odd. Prove that there exists a positive integer $n$ such that $m$ divides
$2^n - 1$.
    Consider the $m + 1$ positive integers $2^1 - 1, 2^2 - 1, 2^3 - 1, \ldots, 2^m - 1, 2^{m+1} - 1$.
By the pigeonhole principle and the division algorithm there exist $s, t \in \mathbf{Z}^+$ with $1 \le
s < t \le m + 1$, where $2^s - 1$ and $2^t - 1$ have the same remainder upon division by $m$.
Hence $2^s - 1 = q_1 m + r$ and $2^t - 1 = q_2 m + r$, for $q_1, q_2 \in \mathbf{N}$, and $(2^t - 1) - (2^s - 1) =
(q_2 m + r) - (q_1 m + r)$, so $2^t - 2^s = (q_2 - q_1)m$. But $2^t - 2^s = 2^s(2^{t-s} - 1)$, and since $m$
is odd, we have $\gcd(2^s, m) = 1$. Hence $m \mid (2^{t-s} - 1)$, and the result follows with $n = t - s$.
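
The proof of Example 5.47 also tells us how to find such an $n$: compute the remainders of $2^t - 1$ for $t = 1, 2, \ldots, m + 1$ and take the difference of the first two exponents that share a remainder. Here is a minimal sketch under that reading; the function name `exponent_with_m_dividing` is an assumption, and the value returned need not be the smallest possible $n$ in general.

```python
def exponent_with_m_dividing(m: int) -> int:
    """For odd m >= 1, return some n >= 1 with m | 2**n - 1, following the proof."""
    if m < 1 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    first_exponent = {}  # remainder of 2**s - 1 (pigeonhole) -> exponent s
    for t in range(1, m + 2):  # the m + 1 integers 2**1 - 1, ..., 2**(m+1) - 1
        r = (pow(2, t, m) - 1) % m
        if r in first_exponent:
            return t - first_exponent[r]  # n = t - s
        first_exponent[r] = t
    raise AssertionError("unreachable: m + 1 pigeons occupy only m pigeonholes")


for m in (1, 7, 9, 15, 21):
    n = exponent_with_m_dividing(m)
    print(m, n, (2 ** n - 1) % m == 0)
```

The three-argument form `pow(2, t, m)` keeps the intermediate numbers small, so the search remains practical even when $2^{m+1}$ itself would be enormous.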

EXAMPLE 5.48
While on a four-week vacation, Herbert will play at least one set of tennis each day, but he
won't play more than 40 sets total during this time. Prove that no matter how he distributes
his sets during the four weeks, there is a span of consecutive days during which he will play
exactly 15 sets.
    For $1 \le i \le 28$, let $x_i$ be the total number of sets Herbert will play from the start of
the vacation to the end of the $i$th day. Then $1 \le x_1 < x_2 < \cdots < x_{28} \le 40$, and $x_1 + 15 <
\cdots < x_{28} + 15 \le 55$. We now have the 28 distinct numbers $x_1, x_2, \ldots, x_{28}$ and the 28
distinct numbers $x_1 + 15, x_2 + 15, \ldots, x_{28} + 15$. These 56 numbers can take on only 55
different values, so at least two of them must be equal, and we conclude that there exist
$1 \le j < i \le 28$ with $x_i = x_j + 15$. Hence, from the start of day $j + 1$ to the end of day $i$,
Herbert will play exactly 15 sets of tennis.
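
The running totals $x_i$ of Example 5.48 are easy to compute, and the equality $x_i = x_j + 15$ can then be located by a table lookup. The sketch below assumes a hypothetical schedule; the function name `span_with_total` and the list `schedule` are illustrative and not taken from the text.

```python
from itertools import accumulate


def span_with_total(daily_sets, target=15):
    """Return (first_day, last_day), numbered from 1, over which exactly
    `target` sets are played, found by matching x_i = x_j + target; else None."""
    prefix = list(accumulate(daily_sets))  # x_1, x_2, ..., strictly increasing
    day_reaching = {x: day for day, x in enumerate(prefix, start=1)}
    for j, x_j in enumerate(prefix, start=1):
        i = day_reaching.get(x_j + target)
        if i is not None:
            return j + 1, i  # from the start of day j + 1 to the end of day i
    return None


# A hypothetical schedule: at least one set per day, 40 sets in all.
schedule = [1] * 28
schedule[0] += 5
schedule[10] += 7
first, last = span_with_total(schedule)
print(first, last, sum(schedule[first - 1:last]))
```

Like the proof, the function only looks for spans that begin on day 2 or later; that is already enough to guarantee a span for every admissible schedule.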

Our last example for this section deals with a classic result that was first discovered in
1935 by Paul Erdős and George Szekeres.

EXAMPLE 5.49
Let us start by considering two particular examples:
    1) Note how the sequence 6, 5, 8, 3, 7 (of length 5) contains the decreasing subsequence
       6, 5, 3 (of length 3).
    2) Now note how the sequence 11, 8, 7, 1, 9, 6, 5, 10, 3, 12 (of length 10) contains the
       increasing subsequence 8, 9, 10, 12 (of length 4).

These two instances demonstrate the general result: For each $n \in \mathbf{Z}^+$, a sequence of $n^2 + 1$
distinct real numbers contains a decreasing or increasing subsequence of length $n + 1$.
    To verify this claim let $a_1, a_2, \ldots, a_{n^2+1}$ be a sequence of $n^2 + 1$ distinct real numbers.
For $1 \le k \le n^2 + 1$, let

    $x_k$ = the maximum length of a decreasing subsequence that ends with $a_k$, and
    $y_k$ = the maximum length of an increasing subsequence that ends with $a_k$.

For instance, our second particular example would provide

    k       1    2    3    4    5    6    7    8    9   10
    a_k    11    8    7    1    9    6    5   10    3   12
    x_k     1    2    3    4    2    4    5    2    6    1
    y_k     1    1    1    1    2    2    2    3    2    4

    If, in general, there is no decreasing or increasing subsequence of length $n + 1$, then $1 \le
x_k \le n$ and $1 \le y_k \le n$ for all $1 \le k \le n^2 + 1$. Consequently, there are at most $n^2$ distinct
ordered pairs $(x_k, y_k)$. But we have $n^2 + 1$ ordered pairs $(x_k, y_k)$, since $1 \le k \le n^2 + 1$. So
the pigeonhole principle implies that there are two identical ordered pairs $(x_i, y_i)$, $(x_j, y_j)$,
where $i \ne j$, say $i < j$. Now the real numbers $a_1, a_2, \ldots, a_{n^2+1}$ are distinct, so if $a_i < a_j$
then $y_i < y_j$, while if $a_j < a_i$ then $x_j > x_i$. In either case we no longer have $(x_i, y_i) =
(x_j, y_j)$. This contradiction tells us that $x_k = n + 1$ or $y_k = n + 1$ for some $n + 1 \le k \le
n^2 + 1$; the result then follows.
    For an interesting application of this result, consider $n^2 + 1$ sumo wrestlers facing forward
and standing shoulder to shoulder. (Here no two wrestlers have the same weight.) We
can select $n + 1$ of these wrestlers to take one step forward so that, as they are scanned from
left to right, their successive weights either decrease or increase.
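
The values $x_k$ and $y_k$ used in Example 5.49 can be computed from left to right, since a longest decreasing (or increasing) subsequence ending at $a_k$ extends one ending at some earlier term. The following Python sketch recomputes them for the ten-term sequence of the example; the function name `subsequence_lengths` is an illustrative assumption.

```python
def subsequence_lengths(seq):
    """For each position k return x_k and y_k: the maximum lengths of a decreasing
    and of an increasing subsequence that ends with seq[k] (terms assumed distinct)."""
    x, y = [], []
    for k, a_k in enumerate(seq):
        # Extend the best decreasing subsequence ending at an earlier, larger term.
        x.append(1 + max((x[i] for i in range(k) if seq[i] > a_k), default=0))
        # Extend the best increasing subsequence ending at an earlier, smaller term.
        y.append(1 + max((y[i] for i in range(k) if seq[i] < a_k), default=0))
    return x, y


seq = [11, 8, 7, 1, 9, 6, 5, 10, 3, 12]
x, y = subsequence_lengths(seq)
print(x)
print(y)
print(len(set(zip(x, y))) == len(seq))  # the ordered pairs (x_k, y_k) are all different
```

The last line checks the key step of the proof: no two positions share the same ordered pair $(x_k, y_k)$.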

EXERCISES 5.5

1. In Example 5.40, what plays the roles of the pigeons and of the pigeonholes?

2. Show that if eight people are in a room, at least two of them have birthdays that occur
   on the same day of the week.

3. An auditorium has a seating capacity of 800. How many seats must be occupied to guarantee
   that at least two people seated in the auditorium have the same first and last initials?

4. Let $S = \{3, 7, 11, 15, 19, \ldots, 95, 99, 103\}$. How many elements must we select from $S$
   to insure that there will be at least two whose sum is 110?

5. a) Prove that if 151 integers are selected from $\{1, 2, 3, \ldots, 300\}$, then the selection
      must include two integers $x$, $y$ where $x \mid y$ or $y \mid x$.
   b) Write a statement that generalizes the results of part (a) and Example 5.43.

6. Prove that if we select 101 integers from the set $S = \{1, 2, 3, \ldots, 200\}$, there exist
   $m$, $n$ in the selection where $\gcd(m, n) = 1$.

7. a) Show that if any 14 integers are selected from the set $S = \{1, 2, 3, \ldots, 25\}$, there
      are at least two whose sum is 26.
   b) Write a statement that generalizes the results of part (a) and Example 5.44.

8. a) If $S \subseteq \mathbf{Z}^+$ and $|S| \ge 3$, prove that there exist distinct $x, y \in S$ where
      $x + y$ is even.
   b) Let $S \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+$. Find the minimal value of $|S|$ that guarantees the
      existence of distinct ordered pairs $(x_1, x_2), (y_1, y_2) \in S$ such that $x_1 + y_1$ and
      $x_2 + y_2$ are both even.
   c) Extending the ideas in parts (a) and (b), consider $S \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+ \times \mathbf{Z}^+$.
      What size must $|S|$ be to guarantee the existence of distinct ordered triples
      $(x_1, x_2, x_3), (y_1, y_2, y_3) \in S$ where $x_1 + y_1$, $x_2 + y_2$, and $x_3 + y_3$ are all even?
   d) Generalize the results of parts (a), (b), and (c).
   e) A point $P(x, y)$ in the Cartesian plane is called a lattice point if $x, y \in \mathbf{Z}$. Given
      distinct lattice points $P_1(x_1, y_1), P_2(x_2, y_2), \ldots, P_n(x_n, y_n)$, determine the
      smallest value of $n$ that guarantees the existence of $P_i(x_i, y_i)$, $P_j(x_j, y_j)$,
      $1 \le i < j \le n$, such that the midpoint of the line segment connecting $P_i(x_i, y_i)$ and
      $P_j(x_j, y_j)$ is also a lattice point.

9. a) If 11 integers are selected from $\{1, 2, 3, \ldots, 100\}$, prove that there are at least
      two, say $x$ and $y$, such that $0 < |\sqrt{x} - \sqrt{y}| < 1$.
   b) Write a statement that generalizes the result of part (a).

10. Let triangle ABC be equilateral, with AB = 1. Show that if we select 10 points in the
    interior of this triangle, there must be at least two whose distance apart is less than 1/3.

11. Let ABCD be a square with AB = 1. Show that if we select five points in the interior of
    this square, there are at least two whose distance apart is less than $1/\sqrt{2}$.

12. Let $A \subseteq \{1, 2, 3, \ldots, 25\}$ where $|A| = 9$. For any subset $B$ of $A$ let $s_B$ denote the
    sum of the elements in $B$. Prove that there are distinct subsets $C$, $D$ of $A$ such that
    $|C| = |D| = 5$ and $s_C = s_D$.

13. Let $S$ be a set of five positive integers the maximum of which is at most 9. Prove that
    the sums of the elements in all the nonempty subsets of $S$ cannot all be distinct.

14. During the first six weeks of his senior year in college, Brace sends out at least one
    resumé each day but no more than 60 resumés in total. Show that there is a period of
    consecutive days during which he sends out exactly 23 resumés.

15. Let $S \subseteq \mathbf{Z}^+$ with $|S| = 7$. For $\emptyset \ne A \subseteq S$, let $s_A$ denote the sum of the
    elements in $A$. If $m$ is the maximum element in $S$, find the possible values of $m$ so that
    there will exist distinct subsets $B$, $C$ of $S$ with $s_B = s_C$.

16. Let $k \in \mathbf{Z}^+$. Prove that there exists a positive integer $n$ such that $k \mid n$ and the only
    digits in $n$ are 0’s and 3’s.

17. a) Find a sequence of four distinct real numbers with no decreasing or increasing
       subsequence of length 3.
    b) Find a sequence of nine distinct real numbers with no decreasing or increasing
       subsequence of length 4.
    c) Generalize the results in parts (a) and (b).
    d) What do the preceding parts of this exercise tell us about Example 5.49?

18. The 50 members of Nardine’s aerobics class line up to get their equipment. Assuming that
    no two of these people have the same height, show that eight of them (as the line is
    equipped from first to last) have successive heights that either decrease or increase.

19. For $k, n \in \mathbf{Z}^+$, prove that if $kn + 1$ pigeons occupy $n$ pigeonholes, then at least one
    pigeonhole has $k + 1$ or more pigeons roosting in it.

20. How many times must we roll a single die in order to get the same score (a) at least
    twice? (b) at least three times? (c) at least $n$ times, for $n \ge 4$?

21. a) Let $S \subseteq \mathbf{Z}^+$. What is the smallest value for $|S|$ that guarantees the existence of
       two elements $x, y \in S$ where $x$ and $y$ have the same remainder upon division by 1000?
    b) What is the smallest value of $n$ such that whenever $S \subseteq \mathbf{Z}^+$ and $|S| = n$, then
       there exist three elements $x, y, z \in S$ where all three have the same remainder upon
       division by 1000?
    c) Write a statement that generalizes the results of parts (a) and (b) and Example 5.42.

22. For $m, n \in \mathbf{Z}^+$, prove that if $m$ pigeons occupy $n$ pigeonholes, then at least one
    pigeonhole has $\lfloor (m - 1)/n \rfloor + 1$ or more pigeons roosting in it.

23. Let $p_1, p_2, \ldots, p_n \in \mathbf{Z}^+$. Prove that if $p_1 + p_2 + \cdots + p_n - n + 1$ pigeons
    occupy $n$ pigeonholes, then either the first pigeonhole has $p_1$ or more pigeons roosting
    in it, or the second pigeonhole has $p_2$ or more pigeons roosting in it, ..., or the
    $n$th pigeonhole has $p_n$ or more pigeons roosting in it.

24. Given 8 Perl books, 17 Visual BASIC† books, 6 Java books, 12 SQL books, and 20 C++ books,
    how many of these books must we select to insure that we have 10 books dealing with the
    same computer language?

5.6 Function Composition and Inverse Functions
                                 When computing with the elements of Z, we find that the (closed binary) operation of
                                 addition provides a method for combining two integers, say a and b, into a third integer,
                                 namely a + b. Furthermore, for each integer c there is a second integer d where c + d =
                                 d + c = 0, and we call d the additive inverse of c. (It is also true that c is the additive inverse
                                 of d.)
                                     Turning to the elements of R and the (closed binary) operation of multiplication, we
                                 have a method for combining any $r, s \in \mathbf{R}$ into their product $rs$. And here, for each $t \in \mathbf{R}$,
                                 if $t \ne 0$, then there is a real number $u$ such that $ut = tu = 1$. The real number $u$ is called
                                 the multiplicative inverse of $t$. (The real number $t$ is also the multiplicative inverse of $u$.)
                                    In this section we first study a method for combining two functions into a single function.
                                 Then we develop the concept of the inverse (of a function) for functions with certain
                                 properties. To accomplish these objectives, we need the following preliminary ideas.

†Visual BASIC is a trademark of the Microsoft Corporation.

Having examined functions that are one-to-one and those that are onto, we turn now to
                  functions with both of these properties.

Definition 5.15   If $f\colon A \to B$, then $f$ is said to be bijective, or to be a one-to-one correspondence, if $f$ is
                  both one-to-one and onto.

EXAMPLE 5.50
If $A = \{1, 2, 3, 4\}$ and $B = \{w, x, y, z\}$, then $f = \{(1, w), (2, x), (3, y), (4, z)\}$ is a
one-to-one correspondence from $A$ (on)to $B$, and $g = \{(w, 1), (x, 2), (y, 3), (z, 4)\}$ is a
one-to-one correspondence from $B$ (on)to $A$.

It should be pointed out that whenever the term correspondence was used in Chapter 1
                  and in Examples 3.11 and 4.12, the adjective one-to-one was implied though never stated.
                     For any nonempty set A there is always a very simple but important one-to-one corre-
                  spondence, as seen in the following definition.

Definition 5.16   The function $1_A\colon A \to A$, defined by $1_A(a) = a$ for all $a \in A$, is called the identity function
                  for $A$.

Definition 5.17   If $f, g\colon A \to B$, we say that $f$ and $g$ are equal and write $f = g$, if $f(a) = g(a)$ for all
                  $a \in A$.

A common pitfall in dealing with the equality of functions occurs when f and g are
                  functions with a common domain A and f(a) = g(a) for all a € A. It may not be the case
                  that f = g. The pitfall results from not paying attention to the codomains of the functions.

EXAMPLE 5.51
Let $f\colon \mathbf{Z} \to \mathbf{Z}$, $g\colon \mathbf{Z} \to \mathbf{Q}$ where $f(x) = x = g(x)$, for all $x \in \mathbf{Z}$. Then $f$, $g$ share the
common domain $\mathbf{Z}$, have the same range $\mathbf{Z}$, and act the same on every element of $\mathbf{Z}$. Yet
$f \ne g$! Here $f$ is a one-to-one correspondence, whereas $g$ is one-to-one but not onto; so
the codomains do make a difference.

EXAMPLE 5.52
Consider the functions $f, g\colon \mathbf{R} \to \mathbf{Z}$ defined as follows:

$$f(x) = \begin{cases} x, & \text{if } x \in \mathbf{Z} \\ \lfloor x \rfloor + 1, & \text{if } x \in \mathbf{R} - \mathbf{Z} \end{cases} \qquad\qquad g(x) = \lceil x \rceil, \text{ for all } x \in \mathbf{R}$$

If $x \in \mathbf{Z}$, then $f(x) = x = \lceil x \rceil = g(x)$.
    For $x \in \mathbf{R} - \mathbf{Z}$, write $x = n + r$ where $n \in \mathbf{Z}$ and $0 < r < 1$. (For example, if $x = 2.3$,
we write $2.3 = 2 + 0.3$, with $n = 2$ and $r = 0.3$; for $x = -7.3$ we have $-7.3 = -8 + 0.7$,
with $n = -8$ and $r = 0.7$.) Then

$$f(x) = \lfloor x \rfloor + 1 = n + 1 = \lceil x \rceil = g(x).$$

    Consequently, even though the functions $f$, $g$ are defined by different formulas, we
realize that they are the same function, because they have the same domain and codomain
and $f(x) = g(x)$ for all $x$ in the domain $\mathbf{R}$.

Now that we have dispensed with the necessary preliminaries, it is time to examine an
                              operation for combining two appropriate functions.

Definition 5.18         If $f\colon A \to B$ and $g\colon B \to C$, we define the composite function, which is denoted
                              $g \circ f\colon A \to C$, by $(g \circ f)(a) = g(f(a))$, for each $a \in A$.

EXAMPLE 5.53
Let $A = \{1, 2, 3, 4\}$, $B = \{a, b, c\}$, and $C = \{w, x, y, z\}$ with $f\colon A \to B$ and $g\colon B \to C$
given by $f = \{(1, a), (2, a), (3, b), (4, c)\}$ and $g = \{(a, x), (b, y), (c, z)\}$. For each
element of $A$ we find:

    $(g \circ f)(1) = g(f(1)) = g(a) = x$          $(g \circ f)(3) = g(f(3)) = g(b) = y$
    $(g \circ f)(2) = g(f(2)) = g(a) = x$          $(g \circ f)(4) = g(f(4)) = g(c) = z$

So

$$g \circ f = \{(1, x), (2, x), (3, y), (4, z)\}.$$

Note: The composition $f \circ g$ is not defined.
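
For finite sets, a function can be stored as a table of ordered pairs, and the composite is then obtained by two successive lookups. The sketch below models the functions of Example 5.53 as Python dictionaries; this representation and the helper name `compose` are assumptions made for illustration.

```python
def compose(g, f):
    """Return g o f as a dict; every value of f must be a key (domain element) of g."""
    return {a: g[f[a]] for a in f}


f = {1: "a", 2: "a", 3: "b", 4: "c"}   # f: A -> B
g = {"a": "x", "b": "y", "c": "z"}     # g: B -> C

g_of_f = compose(g, f)
print(g_of_f)

# f o g would need every value of g to lie in the domain of f.
f_after_g_defined = set(g.values()) <= set(f)
print(f_after_g_defined)
```

Calling `compose(f, g)` raises a `KeyError`, which mirrors the note that $f \circ g$ is not defined here.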

EXAMPLE 5.54
Let $f\colon \mathbf{R} \to \mathbf{R}$, $g\colon \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^2$, $g(x) = x + 5$. Then

$$(g \circ f)(x) = g(f(x)) = g(x^2) = x^2 + 5,$$

whereas

$$(f \circ g)(x) = f(g(x)) = f(x + 5) = (x + 5)^2 = x^2 + 10x + 25.$$

    Here $g \circ f\colon \mathbf{R} \to \mathbf{R}$ and $f \circ g\colon \mathbf{R} \to \mathbf{R}$, but $(g \circ f)(1) = 6 \ne 36 = (f \circ g)(1)$, so even
though both composites $f \circ g$ and $g \circ f$ can be formed, we do not have $f \circ g = g \circ f$.
Consequently, the composition of functions is not, in general, a commutative operation.

The definition and examples for composite functions required that the codomain of $f$ =
                              domain of $g$. If range of $f \subseteq$ domain of $g$, this will actually be enough to yield the composite
                              function $g \circ f\colon A \to C$. Also, for any $f\colon A \to B$, we observe that $f \circ 1_A = f = 1_B \circ f$.

An important recurring idea in mathematics is the investigation of whether combining
                              two entities with a common property yields a result with this property. For example, if $A$
                              and $B$ are finite sets, then $A \cap B$ and $A \cup B$ are also finite. However, for infinite sets $A$ and
                              $B$, we have $A \cup B$ infinite but $A \cap B$ could be finite.
                                    For the composition of functions we have the following result.

THEOREM 5.5                   Let $f\colon A \to B$ and $g\colon B \to C$.

                                   a) If $f$ and $g$ are one-to-one, then $g \circ f$ is one-to-one.
                                   b) If $f$ and $g$ are onto, then $g \circ f$ is onto.
                              Proof:
                                   a) To prove that $g \circ f\colon A \to C$ is one-to-one, let $a_1, a_2 \in A$ with $(g \circ f)(a_1) =
                                       (g \circ f)(a_2)$. Then $(g \circ f)(a_1) = (g \circ f)(a_2) \Rightarrow g(f(a_1)) = g(f(a_2)) \Rightarrow f(a_1) =
                                       f(a_2)$, because $g$ is one-to-one. Also, $f(a_1) = f(a_2) \Rightarrow a_1 = a_2$, because $f$ is
                                       one-to-one. Consequently, $g \circ f$ is one-to-one.
                                   b) For $g \circ f\colon A \to C$, let $z \in C$. Since $g$ is onto, there exists $y \in B$ with $g(y) = z$. With
                                       $f$ onto and $y \in B$, there exists $x \in A$ with $f(x) = y$. Hence $z = g(y) = g(f(x)) =
                                       (g \circ f)(x)$, so the range of $g \circ f$ = $C$ = the codomain of $g \circ f$, and $g \circ f$ is onto.

Although function composition is not commutative, if $f\colon A \to B$, $g\colon B \to C$, and $h\colon
                    C \to D$, what can we say about the functions $(h \circ g) \circ f$ and $h \circ (g \circ f)$? Specifically, is
                    $(h \circ g) \circ f = h \circ (g \circ f)$? That is, is function composition associative?
                      Before considering the general result, let us first investigate a particular example.

EXAMPLE 5.55
Let $f, g, h\colon \mathbf{R} \to \mathbf{R}$, where $f(x) = x^2$, $g(x) = x + 5$, and $h(x) = \sqrt{x^2 + 2}$.
    Then $((h \circ g) \circ f)(x) = (h \circ g)(f(x)) = (h \circ g)(x^2) = h(g(x^2)) = h(x^2 + 5) =
\sqrt{(x^2 + 5)^2 + 2} = \sqrt{x^4 + 10x^2 + 27}$.
    On the other hand, we see that $(h \circ (g \circ f))(x) = h((g \circ f)(x)) = h(g(f(x))) =
h(g(x^2)) = h(x^2 + 5) = \sqrt{(x^2 + 5)^2 + 2} = \sqrt{x^4 + 10x^2 + 27}$, as above.
    So in this particular example, $(h \circ g) \circ f$ and $h \circ (g \circ f)$ are two functions with the
same domain and codomain, and for all $x \in \mathbf{R}$, $((h \circ g) \circ f)(x) = \sqrt{x^4 + 10x^2 + 27} =
(h \circ (g \circ f))(x)$. Consequently, $(h \circ g) \circ f = h \circ (g \circ f)$.

We now find that the result in Example 5.55 is true in general.

THEOREM 5.6         If $f\colon A \to B$, $g\colon B \to C$, and $h\colon C \to D$, then $(h \circ g) \circ f = h \circ (g \circ f)$.
                    Proof: Since the two functions have the same domain, $A$, and codomain, $D$, the result
                    will follow by showing that for every $x \in A$, $((h \circ g) \circ f)(x) = (h \circ (g \circ f))(x)$. (See the
                    diagram shown in Fig. 5.9.)

Figure 5.9 (diagram of the composites $(h \circ g) \circ f$ and $h \circ (g \circ f)$)

                        Using the definition of the composite function we know that for each $x \in A$ it takes two
                    steps to determine $(g \circ f)(x)$. First we find $f(x)$, the image of $x$ under $f$. This is an element
                    of $B$. Then we apply the function $g$ to the element $f(x)$ to determine $g(f(x))$, the image
                    of $f(x)$ under $g$. This results in an element of $C$. At this point we apply the function $h$
                    to the element $g(f(x))$ to determine $h(g(f(x))) = h((g \circ f)(x)) = (h \circ (g \circ f))(x)$. This
                    result is an element of $D$. Similarly, starting once again with $x$ in $A$, we have $f(x)$ in $B$,
                    and now we apply the composite function $h \circ g$ to $f(x)$. This gives us $((h \circ g) \circ f)(x) =
                    (h \circ g)(f(x)) = h(g(f(x)))$.
                        Since $((h \circ g) \circ f)(x) = h(g(f(x))) = (h \circ (g \circ f))(x)$, for each $x$ in $A$, it now follows
                    that

$$(h \circ g) \circ f = h \circ (g \circ f).$$

                    Consequently, the composition of functions is an associative operation.

By virtue of the associative property for function composition, we can write $h \circ g \circ f$,
                               $(h \circ g) \circ f$ or $h \circ (g \circ f)$ without any problem of ambiguity. In addition, this property
                               enables us to define powers of functions, where appropriate.

Definition 5.19          If $f\colon A \to A$ we define $f^1 = f$, and for $n \in \mathbf{Z}^+$, $f^{n+1} = f \circ (f^n)$.

This definition is another example wherein the result is defined recursively. With $f^{n+1} =
                               f \circ (f^n)$, we see the dependence of $f^{n+1}$ on a previous power, namely, $f^n$.

EXAMPLE 5.56
With $A = \{1, 2, 3, 4\}$ and $f\colon A \to A$ defined by $f = \{(1, 2), (2, 2), (3, 1), (4, 3)\}$, we have
$f^2 = f \circ f = \{(1, 2), (2, 2), (3, 2), (4, 1)\}$ and $f^3 = f \circ f^2 = f \circ f \circ f = \{(1, 2),
(2, 2), (3, 2), (4, 2)\}$. (What are $f^4$, $f^5$?)
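
The recursive definition of $f^{n+1}$ can be carried out mechanically for a function on a finite set. The following Python sketch stores $f$ as a dictionary and repeats the composition; the helper names `compose` and `power` are illustrative assumptions, not notation from the text.

```python
def compose(g, f):
    """Return g o f as a dict; every value of f must be a key of g."""
    return {a: g[f[a]] for a in f}


def power(f, n):
    """Return f^n for n >= 1, using f^1 = f and f^(n+1) = f o (f^n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = f
    for _ in range(n - 1):
        result = compose(f, result)
    return result


f = {1: 2, 2: 2, 3: 1, 4: 3}
f2 = power(f, 2)
f3 = power(f, 3)
print(f2)
print(f3)
```

Calling `power(f, 4)` and `power(f, 5)` is one way to check an answer to the question posed at the end of Example 5.56.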

We now come to the last new idea for this section: the existence of the invertible function
                               and some of its properties.

Definition 5.20          For sets $A$, $B$, if $R$ is a relation from $A$ to $B$, then the converse of $R$, denoted $R^c$, is the
                               relation from $B$ to $A$ defined by $R^c = \{(b, a) \mid (a, b) \in R\}$.

To get $R^c$ from $R$, we simply interchange the components of each ordered pair in
                               $R$. So if $A = \{1, 2, 3, 4\}$, $B = \{w, x, y\}$, and $R = \{(1, w), (2, w), (3, x)\}$, then $R^c =
                               \{(w, 1), (w, 2), (x, 3)\}$, a relation from $B$ to $A$.
                                   Since a function is a relation we can also form the converse of a function. For the
                               same preceding sets $A$, $B$, let $f\colon A \to B$ where $f = \{(1, w), (2, x), (3, y), (4, x)\}$. Then
                               $f^c = \{(w, 1), (x, 2), (y, 3), (x, 4)\}$, a relation, but not a function, from $B$ to $A$. We wish to
                               investigate when the converse of a function yields a function, but before getting too abstract
                               let us consider the following example.

EXAMPLE 5.57          For $A = \{1, 2, 3\}$ and $B = \{w, x, y\}$, let $f: A \to B$ be given by $f = \{(1, w), (2, x), (3, y)\}$. Then $f^c = \{(w, 1), (x, 2), (y, 3)\}$ is a function from $B$ to $A$, and we observe that $f^c \circ f = 1_A$ and $f \circ f^c = 1_B$.
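The contrast between the two converses just discussed can be checked directly. This is a small sketch, assuming relations are stored as Python sets of ordered pairs; `converse` and `is_function` are hypothetical helper names chosen for illustration.

```python
def converse(relation):
    """Interchange the components of each ordered pair."""
    return {(b, a) for (a, b) in relation}


def is_function(relation, domain):
    """True when each element of `domain` is a first component exactly once."""
    firsts = [a for (a, _) in relation]
    return set(firsts) == set(domain) and len(firsts) == len(set(firsts))


B = {"w", "x", "y"}

# f from Example 5.57: its converse is a function from B to A.
f_bijective = {(1, "w"), (2, "x"), (3, "y")}

# f from the paragraph before Example 5.57: x would need two images.
f_not_injective = {(1, "w"), (2, "x"), (3, "y"), (4, "x")}

print(is_function(converse(f_bijective), B))
print(is_function(converse(f_not_injective), B))
```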

This finite example leads us to the following definition.

Definition 5.21          If $f: A \to B$, then $f$ is said to be invertible if there is a function $g: B \to A$ such that $g \circ f = 1_A$ and $f \circ g = 1_B$.

Note that the function g in Definition 5.21 is also invertible.

EXAMPLE 5.58          Let $f, g: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = 2x + 5$, $g(x) = (1/2)(x - 5)$. Then $(g \circ f)(x) = g(f(x)) = g(2x + 5) = (1/2)[(2x + 5) - 5] = x$, and $(f \circ g)(x) = f(g(x)) = f((1/2)(x - 5)) = 2[(1/2)(x - 5)] + 5 = x$, so $f \circ g = 1_{\mathbf{R}}$ and $g \circ f = 1_{\mathbf{R}}$. Consequently, $f$ and $g$ are both invertible functions.

Having seen some examples of invertible functions, we now wish to show that the
                  function g of Definition 5.21 is unique. Then we shall find the means to identify an invertible
                  function.

THEOREM 5.7       If a function $f: A \to B$ is invertible and a function $g: B \to A$ satisfies $g \circ f = 1_A$ and $f \circ g = 1_B$, then this function $g$ is unique.
                  Proof: If $g$ is not unique, then there is another function $h: B \to A$ with $h \circ f = 1_A$ and $f \circ h = 1_B$. Consequently, $h = h \circ 1_B = h \circ (f \circ g) = (h \circ f) \circ g = 1_A \circ g = g$.

As a result of this theorem we shall call the function $g$ the inverse of $f$ and shall adopt the notation $g = f^{-1}$. Theorem 5.7 also implies that $f^{-1} = f^c$.
   We also see that whenever $f$ is an invertible function, so is the function $f^{-1}$, and $(f^{-1})^{-1} = f$, again by the uniqueness in Theorem 5.7. But we still do not know what conditions on $f$ ensure that $f$ is invertible.
                     Before stating our next theorem we note that the invertible functions of Examples 5.57
                  and 5.58 are all bijective. Consequently, these examples provide some motivation for the
                  following result.

THEOREM 5.8       A function $f: A \to B$ is invertible if and only if it is one-to-one and onto.
                  Proof: Assuming that $f: A \to B$ is invertible, we have a unique function $g: B \to A$ with $g \circ f = 1_A$, $f \circ g = 1_B$. If $a_1, a_2 \in A$ with $f(a_1) = f(a_2)$, then $g(f(a_1)) = g(f(a_2))$, or $(g \circ f)(a_1) = (g \circ f)(a_2)$. With $g \circ f = 1_A$ it follows that $a_1 = a_2$, so $f$ is one-to-one. For the onto property, let $b \in B$. Then $g(b) \in A$, so we can talk about $f(g(b))$. Since $f \circ g = 1_B$, we have $b = 1_B(b) = (f \circ g)(b) = f(g(b))$, so $f$ is onto.
                      Conversely, suppose $f: A \to B$ is bijective. Since $f$ is onto, for each $b \in B$ there is an $a \in A$ with $f(a) = b$. Consequently, we define the function $g: B \to A$ by $g(b) = a$, where $f(a) = b$. This definition yields a unique function. The only problem that could arise is if $g(b) = a_1 \ne a_2 = g(b)$ because $f(a_1) = b = f(a_2)$. However, this situation cannot arise because $f$ is one-to-one. Our definition of $g$ is such that $g \circ f = 1_A$ and $f \circ g = 1_B$, so we find that $f$ is invertible, with $g = f^{-1}$.
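For finite sets, the construction of $g$ in the second half of this proof can be carried out directly. The sketch below assumes a function is stored as a Python dictionary on its domain; `inverse` is a hypothetical helper that returns `None` whenever $f$ fails to be one-to-one or onto.

```python
def inverse(f, codomain):
    """Return f^{-1} as a dict when f is a bijection onto `codomain`, else None."""
    g = {}
    for a, b in f.items():
        if b in g:                 # two domain elements share the image b
            return None            # f is not one-to-one
        g[b] = a                   # define g(b) = a, where f(a) = b
    if set(g) != set(codomain):    # some b in the codomain has no preimage
        return None                # f is not onto
    return g


A = {1, 2, 3}
B = {"w", "x", "y"}
f = {1: "w", 2: "x", 3: "y"}       # the function of Example 5.57
g = inverse(f, B)

# g o f is the identity on A, and f o g is the identity on B.
print(all(g[f[a]] == a for a in A), all(f[g[b]] == b for b in B))
```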

EXAMPLE 5.59          From Theorem 5.8 it follows that the function $f_1: \mathbf{R} \to \mathbf{R}$ defined by $f_1(x) = x^2$ is not invertible (it is neither one-to-one nor onto), but $f_2: [0, +\infty) \to [0, +\infty)$ defined by $f_2(x) = x^2$ is invertible with $f_2^{-1}(x) = \sqrt{x}$.

The next result combines the ideas of function composition and inverse functions. The
                  proof is left to the reader.

THEOREM 5.9                  If $f: A \to B$, $g: B \to C$ are invertible functions, then $g \circ f: A \to C$ is invertible and $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
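Before attempting the proof, it can help to see the order reversal on a small finite case. In this sketch the function `f` is taken from Example 5.57, while the set $C = \{p, q, r\}$ and the bijection `g` are hypothetical choices made only for illustration; `invert` assumes the codomain equals the set of images.

```python
def compose(g, f):
    """Return g o f as a dict, where (g o f)(x) = g(f(x))."""
    return {x: g[f[x]] for x in f}


def invert(f):
    """Interchange keys and values; raise if f is not one-to-one."""
    inv = {b: a for a, b in f.items()}
    if len(inv) != len(f):
        raise ValueError("function is not one-to-one")
    return inv


f = {1: "w", 2: "x", 3: "y"}            # f: A -> B, from Example 5.57
g = {"w": "p", "x": "q", "y": "r"}      # g: B -> C, a hypothetical bijection

lhs = invert(compose(g, f))             # (g o f)^{-1}
rhs = compose(invert(f), invert(g))     # f^{-1} o g^{-1}
print(lhs == rhs)
```

The check passes for this one pair of functions only; it illustrates the statement and is no substitute for the proof.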

Having seen some examples of functions and their inverses, one might wonder whether there is an algebraic method to determine the inverse of an invertible function. If the function is finite, we simply interchange the components of the given ordered pairs. But what if the function is defined by a formula, as in Example 5.59? Fortunately, the algebraic manipulations prove to be little more than a careful analysis of “interchanging the components of the ordered pairs.” This is demonstrated in the following examples.

EXAMPLE 5.60           For $m, b \in \mathbf{R}$, $m \ne 0$, the function $f: \mathbf{R} \to \mathbf{R}$ defined by $f = \{(x, y) \mid y = mx + b\}$ is an invertible function, because it is one-to-one and onto.
   To get $f^{-1}$ we note that

$$f^{-1} = \{(x, y) \mid y = mx + b\}^c = \{(y, x) \mid y = mx + b\} = \{(x, y) \mid x = my + b\} = \{(x, y) \mid y = (1/m)(x - b)\}.$$

The third equality is where we rename the variables (replacing $x$ by $y$ and $y$ by $x$) in order to change the components of the ordered pairs of $f$.

So $f: \mathbf{R} \to \mathbf{R}$ is defined by $f(x) = mx + b$, and $f^{-1}: \mathbf{R} \to \mathbf{R}$ is defined by $f^{-1}(x) = (1/m)(x - b)$.
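The formula for $f^{-1}$ can be spot-checked numerically. The sketch below uses exact rational arithmetic and the values $m = 2$, $b = 5$ of Example 5.58; the helper `make_linear` is a hypothetical name, and checking a few sample points is only a sanity test.

```python
from fractions import Fraction


def make_linear(m, b):
    """Return the pair (f, f_inv) for f(x) = m*x + b with m != 0."""
    if m == 0:
        raise ValueError("m must be nonzero for f to be invertible")

    def f(x):
        return m * x + b

    def f_inv(x):
        return (x - b) / m       # f^{-1}(x) = (1/m)(x - b)

    return f, f_inv


f, f_inv = make_linear(Fraction(2), Fraction(5))

# Both compositions should return the sample point unchanged.
samples = range(-5, 6)
print(all(f_inv(f(x)) == x and f(f_inv(x)) == x for x in samples))
```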

EXAMPLE 5.61           Let $f: \mathbf{R} \to \mathbf{R}^+$ be defined by $f(x) = e^x$, where $e \approx 2.7183$, the base for the natural logarithm. From the graph in Fig. 5.10 we see that $f$ is one-to-one and onto, so $f^{-1}: \mathbf{R}^+ \to \mathbf{R}$ does exist and $f^{-1} = \{(x, y) \mid y = e^x\}^c = \{(x, y) \mid x = e^y\} = \{(x, y) \mid y = \ln x\}$, so $f^{-1}(x) = \ln x$.


Figure 5.10

We should note that what happens in Fig. 5.10 happens in general. That is, the graphs of $f$ and $f^{-1}$ are symmetric about the line $y = x$. For example, the line segment connecting the points $(1, e)$ and $(e, 1)$ would be bisected by the line $y = x$. This is true for any corresponding pair of points $(x, f(x))$ and $(f(x), f^{-1}(f(x)))$.

This example also yields the following formulas:

$x = 1_{\mathbf{R}}(x) = (f^{-1} \circ f)(x) = \ln(e^x)$, for all $x \in \mathbf{R}$.
$x = 1_{\mathbf{R}^+}(x) = (f \circ f^{-1})(x) = e^{\ln x}$, for all $x > 0$.

Even when a function $f: A \to B$ is not invertible, we find use for the symbol $f^{-1}$ in the following sense.

Definition 5.22   If $f: A \to B$ and $B_1 \subseteq B$, then $f^{-1}(B_1) = \{x \in A \mid f(x) \in B_1\}$. The set $f^{-1}(B_1)$ is called the preimage of $B_1$ under $f$.

Be careful! We are now using the symbol $f^{-1}$ in two different ways. Although we have the concept of a preimage for any function, not every function has an inverse function. Consequently, we cannot assume the existence of an inverse for a function $f$ just because we find the symbol $f^{-1}$ being used. A little caution is needed here.

EXAMPLE 5.62      Let $A = \{1, 2, 3, 4, 5, 6\}$ and $B = \{6, 7, 8, 9, 10\}$. If $f: A \to B$ with $f = \{(1, 7), (2, 7), (3, 8), (4, 6), (5, 9), (6, 9)\}$, then the following results are obtained.

  a) For $B_1 = \{6, 8\} \subseteq B$, we have $f^{-1}(B_1) = \{3, 4\}$, since $f(3) = 8$ and $f(4) = 6$, and for any $a \in A$, $f(a) \notin B_1$ unless $a = 3$ or $a = 4$. Here we also note that $|f^{-1}(B_1)| = 2 = |B_1|$.
  b) In the case of $B_2 = \{7, 8\} \subseteq B$, since $f(1) = f(2) = 7$ and $f(3) = 8$, we find that the preimage of $B_2$ under $f$ is $\{1, 2, 3\}$. And here $|f^{-1}(B_2)| = 3 > 2 = |B_2|$.
  c) Now consider the subset $B_3 = \{8, 9\}$ of $B$. For this case it follows that $f^{-1}(B_3) = \{3, 5, 6\}$ because $f(3) = 8$ and $f(5) = f(6) = 9$. Also we find that $|f^{-1}(B_3)| = 3 > 2 = |B_3|$.
  d) If $B_4 = \{8, 9, 10\} \subseteq B$, then with $f(3) = 8$ and $f(5) = f(6) = 9$, we have $f^{-1}(B_4) = \{3, 5, 6\}$. So $f^{-1}(B_4) = f^{-1}(B_3)$ even though $B_4 \supset B_3$. This result follows because there is no element $a$ in the domain $A$ where $f(a) = 10$; that is, $f^{-1}(\{10\}) = \emptyset$.
  e) Finally, when $B_5 = \{8, 10\}$ we find that $f^{-1}(B_5) = \{3\}$ since $f(3) = 8$ and, as in part (d), $f^{-1}(\{10\}) = \emptyset$. In this case $|f^{-1}(B_5)| = 1 < 2 = |B_5|$.

Whenever $f: A \to B$, then for each $b \in B$ we shall write $f^{-1}(b)$ instead of $f^{-1}(\{b\})$. For the function in Example 5.62, we find that

$f^{-1}(6) = \{4\}$     $f^{-1}(7) = \{1, 2\}$     $f^{-1}(8) = \{3\}$     $f^{-1}(9) = \{5, 6\}$     $f^{-1}(10) = \emptyset$.
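The preimages of Example 5.62 follow mechanically from Definition 5.22. Here is a minimal sketch, assuming the function is stored as a Python dictionary; the helper name `preimage` is illustrative. Note that it never requires $f$ to be invertible.

```python
def preimage(f, subset):
    """Return {x in A | f(x) in subset} for a function f given as a dict."""
    return {x for x, y in f.items() if y in subset}


# The function f: A -> B of Example 5.62.
f = {1: 7, 2: 7, 3: 8, 4: 6, 5: 9, 6: 9}

for name, subset in [("B1", {6, 8}), ("B2", {7, 8}), ("B3", {8, 9}),
                     ("B4", {8, 9, 10}), ("B5", {8, 10})]:
    print(name, sorted(preimage(f, subset)))

# Preimages of single elements, written f^{-1}(b) in the text.
singletons = {b: preimage(f, {b}) for b in (6, 7, 8, 9, 10)}
```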
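EXAMPLE 5.63      Let $f: \mathbf{R} \to \mathbf{R}$ be defined by

$$f(x) = \begin{cases} 3x - 5, & x > 0 \\ -3x + 1, & x \le 0. \end{cases}$$

  a) Determine $f(0)$, $f(1)$, $f(-1)$, $f(5/3)$, and $f(-5/3)$.
  b) Find $f^{-1}(0)$, $f^{-1}(1)$, $f^{-1}(-1)$, $f^{-1}(3)$, $f^{-1}(-3)$, and $f^{-1}(-6)$.
  c) What are $f^{-1}([-5, 5])$ and $f^{-1}([-6, 5])$?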

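  a) $f(0) = -3(0) + 1 = 1$
     $f(1) = 3(1) - 5 = -2$
     $f(-1) = -3(-1) + 1 = 4$
     $f(5/3) = 3(5/3) - 5 = 0$
     $f(-5/3) = -3(-5/3) + 1 = 6$
  b) $f^{-1}(0) = \{x \in \mathbf{R} \mid f(x) \in \{0\}\} = \{x \in \mathbf{R} \mid f(x) = 0\}$
     $= \{x \in \mathbf{R} \mid x > 0 \text{ and } 3x - 5 = 0\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } -3x + 1 = 0\}$
     $= \{x \in \mathbf{R} \mid x > 0 \text{ and } x = 5/3\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } x = 1/3\}$
     $= \{5/3\} \cup \emptyset = \{5/3\}$
     [Note how the horizontal line $y = 0$ (that is, the $x$-axis) intersects the graph in Fig. 5.11 only at the point $(5/3, 0)$.]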

Figure 5.11 (graph of $f$; labeled points include $(10/3, 5)$ and $(3, 4)$)

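     $f^{-1}(1) = \{x \in \mathbf{R} \mid f(x) \in \{1\}\} = \{x \in \mathbf{R} \mid f(x) = 1\}$
     $= \{x \in \mathbf{R} \mid x > 0 \text{ and } 3x - 5 = 1\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } -3x + 1 = 1\}$
     $= \{x \in \mathbf{R} \mid x > 0 \text{ and } x = 2\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } x = 0\}$
     $= \{2\} \cup \{0\} = \{0, 2\}$
     [Here we note how the dashed line $y = 1$ intersects the graph in Fig. 5.11 at the points $(0, 1)$ and $(2, 1)$.]
     $f^{-1}(-1) = \{x \in \mathbf{R} \mid x > 0 \text{ and } 3x - 5 = -1\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } -3x + 1 = -1\}$
     $= \{x \in \mathbf{R} \mid x > 0 \text{ and } x = 4/3\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } x = 2/3\}$
     $= \{4/3\} \cup \emptyset = \{4/3\}$

     $f^{-1}(3) = \{-2/3, 8/3\}$     $f^{-1}(-3) = \{2/3\}$
     $f^{-1}(-6) = \{x \in \mathbf{R} \mid x > 0 \text{ and } 3x - 5 = -6\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } -3x + 1 = -6\}$
     $= \{x \in \mathbf{R} \mid x > 0 \text{ and } x = -1/3\} \cup \{x \in \mathbf{R} \mid x \le 0 \text{ and } x = 7/3\}$
     $= \emptyset \cup \emptyset = \emptyset$
  c) $f^{-1}([-5, 5]) = \{x \mid f(x) \in [-5, 5]\} = \{x \mid -5 \le f(x) \le 5\}$.
     (Case 1) $x > 0$:   $-5 \le 3x - 5 \le 5$
                         $0 \le 3x \le 10$
                         $0 \le x \le 10/3$, so we use $0 < x \le 10/3$.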

     (Case 2) $x \le 0$:   $-5 \le -3x + 1 \le 5$
                           $-6 \le -3x \le 4$
                           $2 \ge x \ge -4/3$; here we use $-4/3 \le x \le 0$.
     Hence $f^{-1}([-5, 5]) = \{x \mid -4/3 \le x \le 0 \text{ or } 0 < x \le 10/3\} = [-4/3, 10/3]$.
     Since there are no points $(x, y)$ on the graph (in Fig. 5.11) where $y < -5$, it follows from our prior calculations that $f^{-1}([-6, 5]) = f^{-1}([-5, 5]) = [-4/3, 10/3]$.
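The interval found in part (c) can be sanity-checked by sampling. The sketch below uses exact fractions so that the endpoints $-4/3$ and $10/3$ are tested without rounding; the grid of sample points and the helper names are assumptions made for illustration, and agreement on a grid does not replace the case analysis above.

```python
from fractions import Fraction


def f(x):
    """The piecewise function of Example 5.63."""
    return 3 * x - 5 if x > 0 else -3 * x + 1


def in_preimage(x, low, high):
    """True when f(x) lies in the closed interval [low, high]."""
    return low <= f(x) <= high


def in_claimed_interval(x):
    """True when x lies in [-4/3, 10/3], the answer obtained in part (c)."""
    return Fraction(-4, 3) <= x <= Fraction(10, 3)


# Sample points from -3 to 5 in steps of 1/6 (includes -4/3, 0, and 10/3).
samples = [Fraction(k, 6) for k in range(-18, 31)]

agree_5 = all(in_preimage(x, -5, 5) == in_claimed_interval(x) for x in samples)
agree_6 = all(in_preimage(x, -6, 5) == in_claimed_interval(x) for x in samples)
print(agree_5, agree_6)
```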

EXAMPLE 5.64      a) Let $f: \mathbf{Z} \to \mathbf{R}$ be defined by $f(x) = x^2 + 5$. Table 5.9 lists $f^{-1}(B)$ for various subsets $B$ of the codomain $\mathbf{R}$.
                  b) If $g: \mathbf{R} \to \mathbf{R}$ is defined by $g(x) = x^2 + 5$, the results in Table 5.10 show how a change in domain (from $\mathbf{Z}$ to $\mathbf{R}$) affects the preimages (in Table 5.9).

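Table 5.9

  $B$               $f^{-1}(B)$
  $[6, 7]$          $\{-1, 1\}$
  $[6, 10]$         $\{-2, -1, 1, 2\}$
  $[-4, 5)$         $\emptyset$
  $[-4, 5]$         $\{0\}$
  $[5, +\infty)$    $\mathbf{Z}$

Table 5.10

  $B$               $g^{-1}(B)$
  $[6, 7]$          $[-\sqrt{2}, -1] \cup [1, \sqrt{2}]$
  $[6, 10]$         $[-\sqrt{5}, -1] \cup [1, \sqrt{5}]$
  $[-4, 5)$         $\emptyset$
  $[-4, 5]$         $\{0\}$
  $[5, +\infty)$    $\mathbf{R}$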

The concept of a preimage appears in conjunction with the set operations of intersection, union, and complementation in our next result. The reader should note the difference between part (a) of this theorem and part (b) of Theorem 5.2.

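THEOREM 5.10       If $f: A \to B$ and $B_1, B_2 \subseteq B$, then (a) $f^{-1}(B_1 \cap B_2) = f^{-1}(B_1) \cap f^{-1}(B_2)$; (b) $f^{-1}(B_1 \cup B_2) = f^{-1}(B_1) \cup f^{-1}(B_2)$; and (c) $f^{-1}(\overline{B_1}) = \overline{f^{-1}(B_1)}$.
                   Proof: We prove part (b) and leave parts (a) and (c) for the reader.
                       For $a \in A$, $a \in f^{-1}(B_1 \cup B_2) \iff f(a) \in B_1 \cup B_2 \iff f(a) \in B_1$ or $f(a) \in B_2 \iff a \in f^{-1}(B_1)$ or $a \in f^{-1}(B_2) \iff a \in f^{-1}(B_1) \cup f^{-1}(B_2)$.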
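All three parts of Theorem 5.10 can be tested exhaustively on a small example. The sketch below reuses the function of Example 5.62 and runs through every pair of subsets of $B$; the helper names are illustrative, and the complement in part (c) is taken relative to $B$ on the left and relative to $A$ on the right.

```python
from itertools import combinations


def preimage(f, subset):
    """Return {x in A | f(x) in subset} for a function f given as a dict."""
    return {x for x, y in f.items() if y in subset}


def subsets(s):
    """Return every subset of the finite set s, as a list of sets."""
    items = list(s)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def check_theorem_5_10(f, A, B):
    """Test parts (a), (b), (c) for all subsets B1, B2 of B."""
    for B1 in subsets(B):
        if preimage(f, B - B1) != A - preimage(f, B1):                    # (c)
            return False
        for B2 in subsets(B):
            if preimage(f, B1 & B2) != preimage(f, B1) & preimage(f, B2):  # (a)
                return False
            if preimage(f, B1 | B2) != preimage(f, B1) | preimage(f, B2):  # (b)
                return False
    return True


A = {1, 2, 3, 4, 5, 6}
B = {6, 7, 8, 9, 10}
f = {1: 7, 2: 7, 3: 8, 4: 6, 5: 9, 6: 9}
print(check_theorem_5_10(f, A, B))
```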

Using the notation of the preimage, we see that a function $f: A \to B$ is one-to-one if and only if $|f^{-1}(b)| \le 1$ for each $b \in B$.

Discrete mathematics is primarily concerned with finite sets, and the last result of this
                   section demonstrates how the property of finiteness can yield results that fail to be true in
                   general. In addition, it provides an application of the pigeonhole principle.

THEOREM 5.11       Let $f: A \to B$ for finite sets $A$ and $B$, where $|A| = |B|$. Then the following statements are equivalent: (a) $f$ is one-to-one; (b) $f$ is onto; and (c) $f$ is invertible.
                   Proof: We have already shown in Theorem 5.8 that (c) $\Rightarrow$ (a) and (b), and that together (a), (b) $\Rightarrow$ (c). Consequently, this theorem will follow when we show that for these conditions on $A$, $B$, (a) $\iff$ (b). Assuming (b), if $f$ is not one-to-one, then there are elements $a_1, a_2 \in A$, with $a_1 \ne a_2$, but $f(a_1) = f(a_2)$. Then $|A| > |f(A)| = |B|$, contradicting $|A| = |B|$. Conversely, if $f$ is not onto, then $|f(A)| < |B|$. With $|A| = |B|$ we have $|A| > |f(A)|$, and it follows from the pigeonhole principle that $f$ is not one-to-one.

Using Theorem 5.11 we now verify the combinatorial identity introduced in Problem 6 at the start of this chapter. For if $n \in \mathbf{Z}^+$ and $|A| = |B| = n$, there are $n!$ one-to-one functions from $A$ to $B$ and $\sum_{k=0}^{n} (-1)^k \binom{n}{n-k} (n - k)^n$ onto functions from $A$ to $B$. The equality $n! = \sum_{k=0}^{n} (-1)^k \binom{n}{n-k} (n - k)^n$ is then the numerical equivalent of parts (a) and (b) of Theorem 5.11. [This is also the reason why the diagonal elements $S(n, n)$, $1 \le n \le 8$, shown in Table 5.1 all equal 1.]
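The identity is easy to confirm numerically for small $n$. The following sketch evaluates the alternating sum and, for very small $n$, also counts the one-to-one and the onto functions by brute force; the function names are illustrative, and the range $1 \le n \le 8$ is chosen only to match the range quoted for Table 5.1.

```python
from itertools import product
from math import comb, factorial


def onto_count_formula(n):
    """Evaluate sum_{k=0}^{n} (-1)^k C(n, n-k) (n-k)^n."""
    return sum((-1) ** k * comb(n, n - k) * (n - k) ** n for k in range(n + 1))


def brute_force_counts(n):
    """Count one-to-one and onto functions from an n-set to an n-set."""
    one_to_one = onto = 0
    for images in product(range(n), repeat=n):   # images[a] plays the role of f(a)
        if len(set(images)) == len(images):      # no two inputs share an image
            one_to_one += 1
        if set(images) == set(range(n)):         # every element of B is an image
            onto += 1
    return one_to_one, onto


print(all(onto_count_formula(n) == factorial(n) for n in range(1, 9)))
print(brute_force_counts(3))
```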

EXERCISES 5.6

1. a) For $A = \{1, 2, 3, 4, \ldots, 7\}$, how many bijective functions $f: A \to A$ satisfy $f(1) \ne 1$?
   b) Answer part (a) where $A = \{x \mid x \in \mathbf{Z}^+, 1 \le x \le n\}$, for some fixed $n \in \mathbf{Z}^+$.

2. a) For $A = (-2, 7] \subseteq \mathbf{R}$ define the functions $f, g: A \to \mathbf{R}$ by
      $f(x) = 2x - 4$ and $g(x) = \dfrac{2x^2 - 8}{x + 2}$.
      Verify that $f = g$.
   b) Is the result in part (a) affected if we change $A$ to $[-7, 2)$?

3. Let $f, g: \mathbf{R} \to \mathbf{R}$, where $g(x) = 1 - x + x^2$ and $f(x) = ax + b$. If $(g \circ f)(x) = 9x^2 - 9x + 3$, determine $a$, $b$.

4. Let $g: \mathbf{N} \to \mathbf{N}$ be defined by $g(n) = 2n$. If $A = \{1, 2, 3, 4\}$ and $f: A \to \mathbf{N}$ is given by $f = \{(1, 2), (2, 3), (3, 5), (4, 7)\}$, find $g \circ f$.

5. If $\mathcal{U}$ is a given universe with (fixed) $S, T \subseteq \mathcal{U}$, define $g: \mathcal{P}(\mathcal{U}) \to \mathcal{P}(\mathcal{U})$ by $g(A) = T \cap (S \cup A)$ for $A \subseteq \mathcal{U}$. Prove that $g^2 = g$.

6. Let $f, g: \mathbf{R} \to \mathbf{R}$ where $f(x) = ax + b$ and $g(x) = cx + d$ for all $x \in \mathbf{R}$, with $a$, $b$, $c$, $d$ real constants. What relationship(s) must be satisfied by $a$, $b$, $c$, $d$ if $(f \circ g)(x) = (g \circ f)(x)$ for all $x \in \mathbf{R}$?

7. Let $f, g, h: \mathbf{Z} \to \mathbf{Z}$ be defined by $f(x) = x - 1$, $g(x) = 3x$,
      $$h(x) = \begin{cases} 0, & x \text{ even} \\ 1, & x \text{ odd.} \end{cases}$$
   Determine (a) $f \circ g$, $g \circ f$, $g \circ h$, $h \circ g$, $f \circ (g \circ h)$, $(f \circ g) \circ h$; (b) $f^2$, $f^3$, $g^2$, $g^3$, $h^2$, $h^3$, $h^{500}$.

8. Let $f: A \to B$, $g: B \to C$. Prove that (a) if $g \circ f: A \to C$ is onto, then $g$ is onto; and (b) if $g \circ f: A \to C$ is one-to-one, then $f$ is one-to-one.

9. a) Find the inverse of the function $f: \mathbf{R} \to \mathbf{R}^+$ defined by $f(x) = e^{2x+5}$.
   b) Show that $f \circ f^{-1} = 1_{\mathbf{R}^+}$ and $f^{-1} \circ f = 1_{\mathbf{R}}$.

10. For each of the following functions $f: \mathbf{R} \to \mathbf{R}$, determine whether $f$ is invertible, and, if so, determine $f^{-1}$.
   a) $f = \{(x, y) \mid 2x + 3y = 7\}$
   b) $f = \{(x, y) \mid ax + by = c, \; b \ne 0\}$
   c) $f = \{(x, y) \mid y = x^3\}$
   d) $f = \{(x, y) \mid y = x^4 + x\}$

11. Prove Theorem 5.9.

12. If $A = \{1, 2, 3, 4, 5, 6, 7\}$, $B = \{2, 4, 6, 8, 10, 12\}$, and $f: A \to B$ where $f = \{(1, 2), (2, 6), (3, 6), (4, 8), (5, 6), (6, 8), (7, 12)\}$, determine the preimage of $B_1$ under $f$ in each of the following cases.
   a) $B_1 = \{2\}$                    b) $B_1 = \{6\}$
   c) $B_1 = \{6, 8\}$                 d) $B_1 = \{6, 8, 10\}$
   e) $B_1 = \{6, 8, 10, 12\}$         f) $B_1 = \{10, 12\}$

13. Let $f: \mathbf{R} \to \mathbf{R}$ be defined by
      $$f(x) = \begin{cases} x + 7, & x \le 0 \\ -2x + 5, & 0 < x < 3 \\ x - 1, & 3 \le x. \end{cases}$$
   a) Find $f^{-1}(-10)$, $f^{-1}(0)$, $f^{-1}(4)$, $f^{-1}(6)$, $f^{-1}(7)$, and $f^{-1}(8)$.
   b) Determine the preimage under $f$ for each of the intervals (i) $[-5, -1]$; (ii) $[-5, 0]$; (iii) $[-2, 4]$; (iv) $(5, 10)$; and (v) $[11, 17)$.

14. Let $f: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^2$. For each of the following subsets $B$ of $\mathbf{R}$, find $f^{-1}(B)$.
   a) $B = \{0, 1\}$                   b) $B = \{-1, 0, 1\}$
   c) $B = [0, 1]$                     d) $B = [0, 1)$
   e) $B = [0, 4]$                     f) $B = (0, 1] \cup (4, 9)$

15. Let $A = \{1, 2, 3, 4, 5\}$ and $B = \{6, 7, 8, 9, 10, 11, 12\}$. How many functions $f: A \to B$ are such that $f^{-1}(\{6, 7, 8\}) = \{1, 2\}$?

16. Let $f: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = \lfloor x \rfloor$, the greatest integer in $x$. Find $f^{-1}(B)$ for each of the following subsets $B$ of $\mathbf{R}$.
   a) $B = \{0, 1\}$                   b) $B = \{-1, 0, 1\}$
   c) $B = [0, 1)$                     d) $B = [0, 2)$
   e) $B = [-1, 2]$                    f) $B = [-1, 0) \cup (1, 3]$

17. Let $f, g: \mathbf{Z}^+ \to \mathbf{Z}^+$ where for all $x \in \mathbf{Z}^+$, $f(x) = x + 1$ and $g(x) = \max\{1, x - 1\}$, the maximum of 1 and $x - 1$.
   a) What is the range of $f$?
   b) Is $f$ an onto function?
   c) Is the function $f$ one-to-one?
   d) What is the range of $g$?
   e) Is $g$ an onto function?
   f) Is the function $g$ one-to-one?
   g) Show that $g \circ f = 1_{\mathbf{Z}^+}$.
   h) Determine $(f \circ g)(x)$ for $x = 2, 3, 4, 7, 12$, and 25.
   i) Do the answers for parts (b), (g), and (h) contradict the result in Theorem 5.8?

18. Let $f$, $g$, $h$ denote the following closed binary operations on $\mathcal{P}(\mathbf{Z}^+)$. For $A, B \subseteq \mathbf{Z}^+$, $f(A, B) = A \cap B$, $g(A, B) = A \cup B$, $h(A, B) = A \,\triangle\, B$.
   a) Are any of the functions one-to-one?
   b) Are any of $f$, $g$, and $h$ onto functions?
   c) Is any one of the given functions invertible?
   d) Are any of the following sets infinite?
      (1) $f^{-1}(\emptyset)$          (2) $g^{-1}(\emptyset)$
      (3) $h^{-1}(\emptyset)$          (4) $f^{-1}(\{1\})$
      (5) $g^{-1}(\{2\})$              (6) $h^{-1}(\{3\})$
      (7) $f^{-1}(\{4, 7\})$           (8) $g^{-1}(\{8, 12\})$
      (9) $h^{-1}(\{5, 9\})$
   e) Determine the number of elements in each of the finite sets in part (d).

19. Prove parts (a) and (c) of Theorem 5.10.

20. a) Give an example of a function $f: \mathbf{Z} \to \mathbf{Z}$ where (i) $f$ is one-to-one but not onto; and (ii) $f$ is onto but not one-to-one.
   b) Do the examples in part (a) contradict Theorem 5.11?

21. Let $f: \mathbf{Z} \to \mathbf{N}$ be defined by
      $$f(x) = \begin{cases} 2x - 1, & \text{if } x > 0 \\ -2x, & \text{for } x \le 0. \end{cases}$$
   a) Prove that $f$ is one-to-one and onto.
   b) Determine $f^{-1}$.

22. If $|A| = |B| = 5$, how many functions $f: A \to B$ are invertible?

23. Let $f, g, h, k: \mathbf{N} \to \mathbf{N}$ where $f(n) = 3n$, $g(n) = \lfloor n/3 \rfloor$, $h(n) = \lfloor (n + 1)/3 \rfloor$, and $k(n) = \lfloor (n + 2)/3 \rfloor$, for each $n \in \mathbf{N}$. (a) For each $n \in \mathbf{N}$ what are $(g \circ f)(n)$, $(h \circ f)(n)$, and $(k \circ f)(n)$? (b) Do the results in part (a) contradict Theorem 5.7?

5.7 Computational Complexity†

In Section 4.4 we introduced the concept of an algorithm, following the examples set forth by the division algorithm (of Section 4.3) and the Euclidean algorithm (of Section 4.4). At that time we were concerned with certain properties of a general algorithm:
   • The precision of the individual step-by-step instructions
   • The input provided to the algorithm, and the output the algorithm then provides
   • The ability of the algorithm to solve a certain type of problem, not just specific instances of the problem
   • The uniqueness of the intermediate and final results, based on the input
   • The finite nature of the algorithm in that it terminates after the execution of a finite number of instructions

†The material in Sections 5.7 and 5.8 may be skipped at this point. It will not be used very much until Chapter 10. The only place where this material appears before Chapter 10 is in Example 7.13, but that example can be omitted without any loss of continuity.

When an algorithm correctly solves a certain type of problem and satisfies these five
                              conditions, then we may find ourselves examining it further in the following ways.

1) Can we somehow measure how long it takes the algorithm to solve a problem of a certain size? Whether we can may very well depend, for example, on the compiler being used, so we want to develop a measure that doesn’t actually depend on such considerations as compilers, execution speeds, or other characteristics of a given computer.
       For example, if we want to compute $a^n$ for $a \in \mathbf{R}$ and $n \in \mathbf{Z}^+$, is there some “function of $n$” that can describe how fast a given algorithm for such exponentiation accomplishes this?
2) Suppose we can answer questions such as the one set forth at the start of item 1. Then if we have two (or more) algorithms that solve a given problem, is there perhaps a way to determine whether one algorithm is “better” than another?
    In particular, suppose we consider the problem of determining whether a certain real number $x$ is present in the list of $n$ real numbers $a_1, a_2, \ldots, a_n$. Here we have a problem of size $n$.
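As a rough illustration of a machine-independent measure for this search problem, one can count comparisons instead of timing the run. The sketch below assumes a simple left-to-right scan, which the text has not fixed as the algorithm, and the choice of "number of comparisons" as the unit of work is likewise an assumption made for illustration.

```python
def linear_search(values, x):
    """Scan `values` from left to right; return (found, comparisons made)."""
    comparisons = 0
    for a in values:
        comparisons += 1          # one comparison of x with a list entry
        if a == x:
            return True, comparisons
    return False, comparisons


data = [3.0, 1.5, 4.0, 1.0, 5.5]
print(linear_search(data, 4.0))   # x is found at the third entry
print(linear_search(data, 2.0))   # x is absent, so all n entries are examined
```

Under these assumptions the count never exceeds $n$, the size of the problem, whatever computer or compiler is used.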
    If there is an algorithm that solves this problem, how long does it take to do so? To measure this we seek a function $f(n)$, called the time-complexity function‡ of the algorithm. We expect (both here and in general) that the value of $f(n)$ will increase as $n$ increases. Also, our major concern in dealing with any algorithm is how the algorithm performs for large values of $n$.
    In order to study what has now been described in a somewhat informal manner, we need to introduce the following fundamental idea.

Definition 5.23         Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$. We say that $g$ dominates $f$ (or $f$ is dominated by $g$) if there exist constants $m \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ such that $|f(n)| \le m|g(n)|$ for all $n \in \mathbf{Z}^+$, where $n \ge k$.

Note that as we consider the values of $f(1)$, $g(1)$, $f(2)$, $g(2)$, $\ldots$, there is a point (namely, $k$) after which the size of $f(n)$ is bounded above by a positive multiple ($m$) of the size of $g(n)$. Also, when $g$ dominates $f$, then $|f(n)/g(n)| \le m$ [that is, the size of the quotient $f(n)/g(n)$ is bounded by $m$], for those $n \in \mathbf{Z}^+$ where $n \ge k$ and $g(n) \ne 0$.
   When $f$ is dominated by $g$ we say that $f$ is of order (at most) $g$ and we use what is called “big-Oh” notation to designate this. We write $f \in O(g)$, where $O(g)$ is read “order $g$” or “big-Oh of $g$.” As suggested by the notation “$f \in O(g)$,” $O(g)$ represents the set of all functions with domain $\mathbf{Z}^+$ and codomain $\mathbf{R}$ that are dominated by $g$. These ideas are demonstrated in the following examples.

Let f, g:Z* > R be given by f(n) = 5n, g(n) =n’, for n € Z*. If we compute f(n)
      EXAMPLE 5.65
                              and g(n)    for 1 <n     <4,    we   find that f(1) =5,        g(1) = 1; f(2) = 10, g2)=4            f@B)=

‘We could also study the space-complexity function of an algorithm, which we need when we attempt to
                              measure the amount of memory required for the execution of an algorithm on a problem of size n. In this text,
                              however, we limit our study to the time-complexity function.
                                                                         5.7 Computational Complexity          291

15, g(3) =9;       and   f(4) = 20, 2(4) = 16.    However,     n>5=3n?>5n,           and   we   have
               | f (2)| = 5n <n? = |g(n)|. So with m = 1 and k =5, we find that for n > k, | f()| <
               m|g(n)|. Consequently, g dominates f and f € O(g). [Note that | f()/2(n)| is bounded
               by | forall n > 5.]
We also realize that for all \(n \in \mathbf{Z}^+\), \(|f(n)| = 5n \le 5n^2 = 5|g(n)|\). So the dominance of \(f\) by \(g\) is shown here with \(k = 1\) and \(m = 5\). This is enough to demonstrate that the constants \(k\) and \(m\) of Definition 5.23 need not be unique.

Furthermore, we can generalize this result if we now consider functions \(f_1, g_1: \mathbf{Z}^+ \to \mathbf{R}\) defined by \(f_1(n) = an\), \(g_1(n) = bn^2\), where \(a, b\) are nonzero real numbers. For if \(m \in \mathbf{R}^+\) with \(m|b| \ge |a|\), then for all \(n \ge 1\) \((= k)\), \(|f_1(n)| = |an| = |a|n \le m|b|n \le m|b|n^2 = m|bn^2| = m|g_1(n)|\), and so \(f_1 \in O(g_1)\).
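The two witness pairs found in this example can be spot-checked numerically. The following is a minimal Python sketch; the helper name `holds_on_range` and the cutoff `upto` are assumptions made for illustration. A finite check of this kind is only evidence; the inequalities in the text are what establish dominance.

```python
from typing import Callable


def holds_on_range(f: Callable[[int], float], g: Callable[[int], float],
                   m: float, k: int, upto: int) -> bool:
    """Return True if |f(n)| <= m*|g(n)| for every integer n with k <= n <= upto."""
    return all(abs(f(n)) <= m * abs(g(n)) for n in range(k, upto + 1))


def f(n: int) -> int:
    return 5 * n


def g(n: int) -> int:
    return n * n


# Both witness pairs from the example pass the finite check.
print(holds_on_range(f, g, m=1, k=5, upto=10_000))  # True
print(holds_on_range(f, g, m=5, k=1, upto=10_000))  # True
# m = 1 with k = 1 fails, because 5n > n^2 for n = 1, 2, 3, 4.
print(holds_on_range(f, g, m=1, k=1, upto=10_000))  # False
```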

In Example 5.65 we observed that \(f \in O(g)\). Taking a second look at the functions \(f\) and \(g\), we now want to show that \(g \notin O(f)\).

EXAMPLE 5.66
Once again let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) be defined by \(f(n) = 5n\), \(g(n) = n^2\), for \(n \in \mathbf{Z}^+\). If \(g \in O(f)\), then in terms of quantifiers, we would have

\[ \exists m \in \mathbf{R}^+ \ \exists k \in \mathbf{Z}^+ \ \forall n \in \mathbf{Z}^+ \ [(n \ge k) \Rightarrow |g(n)| \le m|f(n)|]. \]

Consequently, to show that \(g \notin O(f)\), we need to verify that

\[ \forall m \in \mathbf{R}^+ \ \forall k \in \mathbf{Z}^+ \ \exists n \in \mathbf{Z}^+ \ [(n \ge k) \wedge (|g(n)| > m|f(n)|)]. \]

To accomplish this, we first should realize that \(m\) and \(k\) are arbitrary, so we have no control over their values. The only number over which we have control is the positive integer \(n\) that we select. Now no matter what the values of \(m\) and \(k\) happen to be, we can select \(n \in \mathbf{Z}^+\) such that \(n > \max\{5m, k\}\). Then \(n \ge k\) (actually \(n > k\)) and \(n > 5m \Rightarrow n^2 > 5mn\), so \(|g(n)| = n^2 > 5mn = m|5n| = m|f(n)|\) and \(g \notin O(f)\).

For those who prefer the method of proof by contradiction, we present a second approach. If \(g \in O(f)\), then we would have

\[ n^2 = |g(n)| \le m|f(n)| = 5mn \]

for all \(n \ge k\), where \(k\) is some fixed positive integer and \(m\) is a (real) constant. But then from \(n^2 \le 5mn\) we deduce that \(n \le 5m\). This is impossible because \(n\) \((\in \mathbf{Z}^+)\) is a variable that can increase without bound while \(m\) is still a constant.
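The choice of \(n\) in the first argument is constructive, so it can be sketched in Python. The function name `witness` is hypothetical; it returns one integer exceeding \(\max\{5m, k\}\), which is all the argument requires.

```python
import math


def witness(m: float, k: int) -> int:
    """Return an integer n > max{5m, k}, as chosen in the example."""
    return math.floor(max(5 * m, k)) + 1


for m, k in [(1, 1), (10, 3), (0.5, 100), (1e6, 7)]:
    n = witness(m, k)
    # n is at least k, yet |g(n)| = n^2 exceeds m*|f(n)| = m*5n.
    assert n >= k and n * n > m * 5 * n
    print(f"m = {m}, k = {k}: n = {n} gives n^2 = {n * n} > m*5n = {m * 5 * n}")
```

Whatever pair \((m, k)\) is proposed, the sketch produces an \(n\) that defeats it, which is the content of the quantified statement being verified.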

EXAMPLE 5.67
a) Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) with \(f(n) = 5n^2 + 3n + 1\) and \(g(n) = n^2\). Then \(|f(n)| = |5n^2 + 3n + 1| = 5n^2 + 3n + 1 \le 5n^2 + 3n^2 + n^2 = 9n^2 = 9|g(n)|\). Hence for all \(n \ge 1\) \((= k)\), \(|f(n)| \le m|g(n)|\) for any \(m \ge 9\), and \(f \in O(g)\). We can also write \(f \in O(n^2)\) in this case.

In addition, \(|g(n)| = n^2 \le 5n^2 \le 5n^2 + 3n + 1 = |f(n)|\) for all \(n \ge 1\). So \(|g(n)| \le m|f(n)|\) for any \(m \ge 1\) and all \(n \ge k \ge 1\). Consequently \(g \in O(f)\). [In fact, \(O(g) = O(f)\); that is, any function from \(\mathbf{Z}^+\) to \(\mathbf{R}\) that is dominated by one of \(f\), \(g\) is also dominated by the other. We shall examine this result for the general case in the Section Exercises.]

b) Now consider \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) with \(f(n) = 3n^3 + 7n^2 - 4n + 2\) and \(g(n) = n^3\). Here we have \(|f(n)| = |3n^3 + 7n^2 - 4n + 2| \le |3n^3| + |7n^2| + |-4n| + |2| \le 3n^3 + 7n^3 + 4n^3 + 2n^3 = 16n^3 = 16|g(n)|\), for all \(n \ge 1\). So with \(m = 16\) and \(k = 1\), we find that \(f\) is dominated by \(g\), and \(f \in O(g)\), or \(f \in O(n^3)\).

Since \(7n - 4 > 0\) for all \(n \ge 1\), we can write \(n^3 \le 3n^3 \le 3n^3 + (7n - 4)n + 2\) whenever \(n \ge 1\). Then \(|g(n)| \le |f(n)|\) for all \(n \ge 1\), and \(g \in O(f)\). [As in part (a), we also have \(O(f) = O(g) = O(n^3)\) in this case.]

We generalize the results of Example 5.67 as follows. Let \(f: \mathbf{Z}^+ \to \mathbf{R}\) be the polynomial function where \(f(n) = a_t n^t + a_{t-1} n^{t-1} + \cdots + a_2 n^2 + a_1 n + a_0\), for \(a_t, a_{t-1}, \ldots, a_2, a_1, a_0 \in \mathbf{R}\), \(a_t \ne 0\), \(t \in \mathbf{N}\). Then

\[
\begin{aligned}
|f(n)| &= |a_t n^t + a_{t-1} n^{t-1} + \cdots + a_2 n^2 + a_1 n + a_0| \\
&\le |a_t n^t| + |a_{t-1} n^{t-1}| + \cdots + |a_2 n^2| + |a_1 n| + |a_0| \\
&= |a_t| n^t + |a_{t-1}| n^{t-1} + \cdots + |a_2| n^2 + |a_1| n + |a_0| \\
&\le |a_t| n^t + |a_{t-1}| n^t + \cdots + |a_2| n^t + |a_1| n^t + |a_0| n^t \\
&= (|a_t| + |a_{t-1}| + \cdots + |a_2| + |a_1| + |a_0|) n^t.
\end{aligned}
\]

In Definition 5.23, let \(m = |a_t| + |a_{t-1}| + \cdots + |a_2| + |a_1| + |a_0|\) and \(k = 1\), and let \(g: \mathbf{Z}^+ \to \mathbf{R}\) be given by \(g(n) = n^t\). Then \(|f(n)| \le m|g(n)|\) for all \(n \ge k\), so \(f\) is dominated by \(g\), or \(f \in O(n^t)\).

It is also true that \(g \in O(f)\) and that \(O(f) = O(g) = O(n^t)\).
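The constant \(m\) in this derivation is simply the sum of the absolute values of the coefficients, so it is easy to compute. The Python sketch below uses hypothetical helper names (`poly`, `bound_constant`) and takes the coefficients of Example 5.67(b) as its test case; the finite range checked is an arbitrary choice.

```python
def poly(coeffs: list[float], n: int) -> float:
    """Evaluate a_t n^t + ... + a_1 n + a_0; coeffs lists a_t first (Horner's rule)."""
    value = 0
    for a in coeffs:
        value = value * n + a
    return value


def bound_constant(coeffs: list[float]) -> float:
    """The constant m = |a_t| + ... + |a_1| + |a_0| from the derivation."""
    return sum(abs(a) for a in coeffs)


coeffs = [3, 7, -4, 2]          # f(n) = 3n^3 + 7n^2 - 4n + 2, from Example 5.67(b)
t = len(coeffs) - 1
m = bound_constant(coeffs)      # 16, matching the example
ok = all(abs(poly(coeffs, n)) <= m * n**t for n in range(1, 2001))
print(m, ok)  # 16 True
```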
                                 This generalization provides the following special results on summations.

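EXAMPLE 5.68
a) Let \(f: \mathbf{Z}^+ \to \mathbf{R}\) be given by \(f(n) = 1 + 2 + 3 + \cdots + n\). Then (from Examples 1.40 and 4.1) \(f(n) = \left(\tfrac{1}{2}\right)(n)(n + 1) = \left(\tfrac{1}{2}\right)n^2 + \left(\tfrac{1}{2}\right)n\), so \(f \in O(n^2)\).
b) If \(g: \mathbf{Z}^+ \to \mathbf{R}\) with \(g(n) = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \left(\tfrac{1}{6}\right)(n)(n + 1)(2n + 1)\) (from Example 4.4), then \(g(n) = \left(\tfrac{1}{3}\right)n^3 + \left(\tfrac{1}{2}\right)n^2 + \left(\tfrac{1}{6}\right)n\) and \(g \in O(n^3)\).
c) If \(t \in \mathbf{Z}^+\), and \(h: \mathbf{Z}^+ \to \mathbf{R}\) is defined by \(h(n) = \sum_{i=1}^{n} i^t\), then \(h(n) = 1^t + 2^t + 3^t + \cdots + n^t \le n^t + n^t + n^t + \cdots + n^t = n(n^t) = n^{t+1}\), so \(h \in O(n^{t+1})\).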
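The closed forms in parts (a) and (b) and the bound in part (c) can be confirmed for small values by direct summation. This is a minimal Python sketch; `power_sum` is a hypothetical helper, and the ranges tested are arbitrary.

```python
def power_sum(n: int, t: int) -> int:
    """Return 1^t + 2^t + ... + n^t by direct summation."""
    return sum(i**t for i in range(1, n + 1))


for n in range(1, 201):
    # Parts (a) and (b): closed forms, multiplied through to stay in integers.
    assert 2 * power_sum(n, 1) == n * (n + 1)
    assert 6 * power_sum(n, 2) == n * (n + 1) * (2 * n + 1)
    # Part (c): each of the n terms is at most n^t.
    for t in range(1, 6):
        assert power_sum(n, t) <= n ** (t + 1)

print(power_sum(10, 1), power_sum(10, 2))  # 55 385
```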

Now that we have examined several examples of function dominance, we shall close this
                             section with two final observations. In the next section we shall apply the idea of function
                             dominance in the analysis of algorithms.

1) When dealing with the concept of function dominance, we seek the best (or tightest) bound in the following sense. Suppose that \(f, g, h: \mathbf{Z}^+ \to \mathbf{R}\), where \(f \in O(g)\) and \(g \in O(h)\). Then we also have \(f \in O(h)\). (A proof for this is requested in the Section Exercises.) If \(h \notin O(g)\), however, the statement \(f \in O(g)\) provides a “better” bound on \(|f(n)|\) than the statement \(f \in O(h)\). For example, if \(f(n) = 5\), \(g(n) = 5n\), and \(h(n) = n^2\), for all \(n \in \mathbf{Z}^+\), then \(f \in O(g)\), \(g \in O(h)\), and \(f \in O(h)\), but \(h \notin O(g)\). Therefore, we are provided with more information by the statement \(f \in O(g)\) than by the statement \(f \in O(h)\).
2) Certain orders, such as \(O(n)\) and \(O(n^2)\), often occur when we deal with function dominance. Therefore they have come to be designated by special names. Some of the most important of these orders are listed in Table 5.11.

Table 5.11

Big-Oh Form                               Name
\(O(1)\)                                  Constant
\(O(\log_2 n)\)                           Logarithmic
\(O(n)\)                                  Linear
\(O(n \log_2 n)\)                         \(n \log_2 n\)
\(O(n^2)\)                                Quadratic
\(O(n^3)\)                                Cubic
\(O(n^m)\), \(m = 0, 1, 2, 3, \ldots\)    Polynomial
\(O(c^n)\), \(c > 1\)                     Exponential
\(O(n!)\)                                 Factorial

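EXERCISES 5.7

1. Use the results of Table 5.11 to determine the best “big-Oh” form for each of the following functions \(f: \mathbf{Z}^+ \to \mathbf{R}\).
    a) \(f(n) = 3n + 7\)
    b) \(f(n) = 3 + \sin(1/n)\)
    c) \(f(n) = n^3 - 5n^2 + 25n - 165\)
    d) \(f(n) = 5n^2 + 3n \log_2 n\)
    e) \(f(n) = n^2 + (n - 1)^3\)
    f) \(f(n) = \dfrac{n(n + 1)(n + 2)}{n + 3}\)
    g) \(f(n) = 2 + 4 + 6 + \cdots + 2n\)

2. Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\), where \(f(n) = n\) and \(g(n) = n + (1/n)\), for \(n \in \mathbf{Z}^+\). Use Definition 5.23 to show that \(f \in O(g)\) and \(g \in O(f)\).

3. In each of the following, \(f, g: \mathbf{Z}^+ \to \mathbf{R}\). Use Definition 5.23 to show that \(g\) dominates \(f\).
    a) \(f(n) = 100 \log_2 n\), \(g(n) = \left(\tfrac{1}{2}\right)n\)
    b) \(f(n) = 2^n\), \(g(n) = 2^{2n} - 1000\)
    c) \(f(n) = 3n^2\), \(g(n) = 2^n + 2n\)

4. Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) be defined by \(f(n) = n + 100\), \(g(n) = n^2\). Use Definition 5.23 to show that \(f \in O(g)\) but \(g \notin O(f)\).

5. Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\), where \(f(n) = n^2 + n\) and \(g(n) = \left(\tfrac{1}{2}\right)n^3\), for \(n \in \mathbf{Z}^+\). Use Definition 5.23 to show that \(f \in O(g)\) but \(g \notin O(f)\).

6. Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) be defined as follows:

\[
f(n) = \begin{cases} n, & \text{for } n \text{ odd} \\ 1, & \text{for } n \text{ even} \end{cases}
\qquad
g(n) = \begin{cases} 1, & \text{for } n \text{ odd} \\ n, & \text{for } n \text{ even} \end{cases}
\]

Verify that \(f \notin O(g)\) and \(g \notin O(f)\).

7. Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) where \(f(n) = n\) and \(g(n) = \log_2 n\), for \(n \in \mathbf{Z}^+\). Show that \(g \in O(f)\) but \(f \notin O(g)\). (Hint: \(\lim_{n \to \infty} \dfrac{n}{\log_2 n} = +\infty\). This requires the use of calculus.)

8. Let \(f, g, h: \mathbf{Z}^+ \to \mathbf{R}\) where \(f \in O(g)\) and \(g \in O(h)\). Prove that \(f \in O(h)\).

9. If \(g: \mathbf{Z}^+ \to \mathbf{R}\) and \(c \in \mathbf{R}\), we define the function \(cg: \mathbf{Z}^+ \to \mathbf{R}\) by \((cg)(n) = c(g(n))\), for each \(n \in \mathbf{Z}^+\). Prove that if \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) with \(f \in O(g)\), then \(f \in O(cg)\) for all \(c \in \mathbf{R}\), \(c \ne 0\).

10. a) Prove that \(f \in O(f)\) for all \(f: \mathbf{Z}^+ \to \mathbf{R}\).
    b) Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\). If \(f \in O(g)\) and \(g \in O(f)\), prove that \(O(f) = O(g)\). That is, prove that for all \(h: \mathbf{Z}^+ \to \mathbf{R}\), if \(h\) is dominated by \(f\), then \(h\) is dominated by \(g\), and conversely.
    c) If \(f, g: \mathbf{Z}^+ \to \mathbf{R}\), prove that if \(O(f) = O(g)\), then \(f \in O(g)\) and \(g \in O(f)\).

11. The following is analogous to the “big-Oh” notation introduced in conjunction with Definition 5.23.
    For \(f, g: \mathbf{Z}^+ \to \mathbf{R}\) we say that \(f\) is of order at least \(g\) if there exist constants \(M \in \mathbf{R}^+\) and \(k \in \mathbf{Z}^+\) such that \(|f(n)| \ge M|g(n)|\) for all \(n \in \mathbf{Z}^+\), where \(n \ge k\). In this case we write \(f \in \Omega(g)\) and say that \(f\) is “big Omega of \(g\).” So \(\Omega(g)\) represents the set of all functions with domain \(\mathbf{Z}^+\) and codomain \(\mathbf{R}\) that dominate \(g\).
    Suppose that \(f, g, h: \mathbf{Z}^+ \to \mathbf{R}\), where \(f(n) = 5n^2 + 3n\), \(g(n) = n^2\), \(h(n) = n\), for all \(n \in \mathbf{Z}^+\). Prove that (a) \(f \in \Omega(g)\); (b) \(g \in \Omega(f)\); (c) \(f \in \Omega(h)\); and (d) \(h \notin \Omega(f)\), that is, \(h\) is not “big Omega of \(f\).”

12. Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\). Prove that \(f \in \Omega(g)\) if and only if \(g \in O(f)\).

13. a) Let \(f: \mathbf{Z}^+ \to \mathbf{R}\) where \(f(n) = \sum_{i=1}^{n} i\). When \(n = 4\), for example, we have \(f(n) = f(4) = 1 + 2 + 3 + 4 > 2 + 3 + 4 > 2 + 2 + 2 = 3 \cdot 2 = \lceil (4 + 1)/2 \rceil 2 = 6 > (4/2)^2 = (n/2)^2\). For \(n = 5\), we find \(f(n) = f(5) = 1 + 2 + 3 + 4 + 5 > 3 + 4 + 5 > 3 + 3 + 3 = 3 \cdot 3 = \lceil (5 + 1)/2 \rceil 3 = 9 > (5/2)^2 = (n/2)^2\). In general, \(f(n) = 1 + 2 + \cdots + n \ge \lceil n/2 \rceil + \cdots + n \ge \lceil n/2 \rceil + \cdots + \lceil n/2 \rceil = \lceil (n + 1)/2 \rceil \lceil n/2 \rceil \ge n^2/4\). Consequently, \(f \in \Omega(n^2)\).
       Use
       \[ \sum_{i=1}^{n} i = \frac{n(n + 1)}{2} \]
       to provide an alternative proof that \(f \in \Omega(n^2)\).
    b) Let \(g: \mathbf{Z}^+ \to \mathbf{R}\) where \(g(n) = \sum_{i=1}^{n} i^2\). Prove that \(g \in \Omega(n^3)\).
    c) For \(t \in \mathbf{Z}^+\), let \(h: \mathbf{Z}^+ \to \mathbf{R}\) where \(h(n) = \sum_{i=1}^{n} i^t\). Prove that \(h \in \Omega(n^{t+1})\).

14. For \(f, g: \mathbf{Z}^+ \to \mathbf{R}\), we say that \(f\) is “big Theta of \(g\),” and write \(f \in \Theta(g)\), when there exist constants \(m_1, m_2 \in \mathbf{R}^+\) and \(k \in \mathbf{Z}^+\) such that \(m_1|g(n)| \le |f(n)| \le m_2|g(n)|\), for all \(n \in \mathbf{Z}^+\), where \(n \ge k\). Prove that \(f \in \Theta(g)\) if and only if \(f \in \Omega(g)\) and \(f \in O(g)\).

15. Let \(f, g: \mathbf{Z}^+ \to \mathbf{R}\). Prove that \(f \in \Theta(g)\) if and only if \(g \in \Theta(f)\).

16. a) Let \(f: \mathbf{Z}^+ \to \mathbf{R}\) where \(f(n) = \sum_{i=1}^{n} i\). Prove that \(f \in \Theta(n^2)\).
    b) Let \(g: \mathbf{Z}^+ \to \mathbf{R}\) where \(g(n) = \sum_{i=1}^{n} i^2\). Prove that \(g \in \Theta(n^3)\).
    c) For \(t \in \mathbf{Z}^+\), let \(h: \mathbf{Z}^+ \to \mathbf{R}\) where \(h(n) = \sum_{i=1}^{n} i^t\). Prove that \(h \in \Theta(n^{t+1})\).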

5.8 Analysis of Algorithms
                                  Now that the reader has been introduced to the concept of function dominance, it is time to
                                  see how this idea is used in the study of algorithms. In this section we present our algorithms
                                  as pseudocode procedures. (We shall also present algorithms as lists of instructions. The
                                  reader will find this to be the case in later chapters.)
                                     We start with a procedure to determine the balance in a savings account.

EXAMPLE 5.69
In Fig. 5.12 we have a procedure (written in pseudocode) for computing the balance in a savings account \(n\) months (for \(n \in \mathbf{Z}^+\)) after it has been opened. (This balance is the procedure’s output.) Here the user supplies the value of \(n\), the input for the program. The variables deposit, balance, and rate are real variables, while i is an integer variable. (The annual interest rate is 0.06.)

procedure AccountBalance(n: integer)
begin
  deposit := 50.00              {The monthly deposit}
  i := 1                        {Initializes the counter}
  rate := 0.005                 {The monthly interest rate}
  balance := 100.00             {Initializes the balance}
  while i ≤ n do
    begin
      balance := deposit + balance + balance * rate
      i := i + 1
    end
end

Figure 5.12

Consider the following specific situation. Nathan puts $100.00 in a new account on January 1. Each month the bank adds the interest (balance * rate) to Nathan’s account on the first of the month. In addition, Nathan deposits an additional $50.00 on the first of each month (starting on February 1). This program tells Nathan the balance in his account after \(n\) months have gone by (assuming that the interest rate does not change). [Note: After one month, \(n = 1\) and the balance is $50.00 (new deposit) + $100.00 (initial deposit) + ($100.00)(0.005) (the interest) = $150.50. When \(n = 2\) the new balance is $50.00 (new deposit) + $150.50 (previous balance) + ($150.50)(0.005) (new interest) = $201.25.]

Our objective is to count (measure) the total number of operations (such as assignments, additions, multiplications, and comparisons) this program segment takes to compute the balance in Nathan’s account \(n\) months after he opened it. We shall let \(f(n)\) denote the total number of these operations. [Then \(f: \mathbf{Z}^+ \to \mathbf{R}\). (Actually, \(f(\mathbf{Z}^+) \subseteq \mathbf{Z}^+\).)]

The program segment begins with four assignment statements, where the integer variable i and the real variable balance are initialized, and the values of the real variables deposit and rate are declared. Then the while loop is executed \(n\) times. Each execution of the loop involves the following seven operations:
   1) Comparing the present value of the counter i with \(n\).
   2) Increasing the present value of balance to deposit + balance + balance * rate; this involves one multiplication, two additions, and one assignment.
   3) Incrementing the value of the counter by 1; this involves one addition and one assignment.
Finally, there is one more comparison. This is made when \(i = n + 1\), so the while loop is terminated and the other six operations (in steps 2 and 3) are not performed.

Therefore, \(f(n) = 4 + 7n + 1 = 7n + 5 \in O(n)\). Consequently, we say that \(f \in O(n)\). For as \(n\) gets larger, the “order of magnitude” of \(7n + 5\) depends primarily on the value \(n\), the number of times the while loop is executed. Therefore, we could have obtained \(f \in O(n)\) by simply counting the number of times the while loop was executed. Such shortcuts will be used in our calculations for the remaining examples.
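The operation count \(7n + 5\) can be reproduced by instrumenting a direct translation of Fig. 5.12. The Python sketch below is one possible rendering; the counter `ops` and the function name `account_balance` are assumptions added for illustration, and the counting convention is the one described in the text.

```python
def account_balance(n: int) -> tuple[float, int]:
    """Return (balance after n months, number of operations counted as in the text)."""
    ops = 0
    deposit = 50.00
    i = 1
    rate = 0.005
    balance = 100.00
    ops += 4                      # four initial assignments
    while True:
        ops += 1                  # comparison of i with n
        if i > n:
            break                 # the final, failing comparison ends the loop
        balance = deposit + balance + balance * rate
        ops += 4                  # one multiplication, two additions, one assignment
        i = i + 1
        ops += 2                  # one addition, one assignment
    return balance, ops


for n in (1, 2, 12):
    balance, ops = account_balance(n)
    print(n, round(balance, 2), ops, 7 * n + 5)
```

For \(n = 1\) and \(n = 2\) the balances agree with the $150.50 and $201.25 computed in the note above, and in each row the last two columns coincide.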

Our next example introduces us to a situation where three types of complexity are
                 determined. These measures are called the best-case complexity, the worst-case complexity,
                 and the average-case complexity.

EXAMPLE 5.70
In this example we examine a typical searching process. Here an array of \(n\) \((\ge 1)\) integers \(a_1, a_2, a_3, \ldots, a_n\) is to be searched for the presence of an integer called key. If the integer is found, the value of location indicates its first location in the array; if it is not found the value of location is 0, indicating an unsuccessful search.

We cannot assume that the entries in the array are in any particular order. (If they were, the problem would be easier and a more efficient algorithm could be developed.) The input for this algorithm consists of the array (which is read in by the user or provided, perhaps, as a file from an external source), along with the number \(n\) of elements in the array, and the value of the integer key.

The algorithm is provided in the pseudocode procedure in Fig. 5.13.

procedure LinearSearch(key, n: integer; a_1, a_2, a_3, ..., a_n: integers)
begin
  i := 1                                  {initializes the counter}
  while (i ≤ n and key ≠ a_i) do
    i := i + 1
  if i ≤ n then location := i             {successful search}
  else location := 0                      {unsuccessful search}
end   {location is the subscript of the first array entry that equals key;
       location is 0 if key is not found}

Figure 5.13

We shall define the complexity function \(f(n)\) for this algorithm to be the number of elements in the array that are examined until the value key is found (for the first time) or the array is exhausted (that is, the number of times the while loop is executed).

What is the best thing that can happen in our search for key? If key = \(a_1\), we find that key is the first entry of the array, and we had to compare key with only one element of the array. In this case we have \(f(n) = 1\), and we say that the best-case complexity for our algorithm is \(O(1)\) (that is, it is constant and independent of the size of the array). Unfortunately, we cannot expect such a situation to occur very often.

From the best situation we turn now to the worst. We see that we have to examine all \(n\) entries of the array if (1) the first occurrence of key is \(a_n\) or (2) key is not found in the array. In either case we have \(f(n) = n\), and the worst-case complexity here is \(O(n)\). (The worst-case complexity will typically be considered throughout the text.)
Finally, we wish to obtain an estimate of the average number of array entries examined. We shall assume that the \(n\) entries of the array are distinct and are all equally likely (with probability \(p\)) to contain the value key, and that the probability that key is not in the array is equal to \(q\). Consequently, we have \(np + q = 1\) and \(p = (1 - q)/n\).

For each \(1 \le i \le n\), if key equals \(a_i\), then \(i\) elements of the array have been examined. If key is not in the array, then all \(n\) array elements are examined. Therefore, the average-case complexity is determined by the average number of array elements examined, which is

\[
f(n) = (1 \cdot p + 2 \cdot p + 3 \cdot p + \cdots + n \cdot p) + n \cdot q = p(1 + 2 + 3 + \cdots + n) + nq = \frac{pn(n + 1)}{2} + nq.
\]

If \(q = 0\), then key is in the array, \(p = 1/n\) and \(f(n) = (n + 1)/2 \in O(n)\). For \(q = 1/2\), we have an even chance that key is in the array and \(f(n) = (1/(2n))[n(n + 1)/2] + (n/2) = (n + 1)/4 + (n/2) \in O(n)\). [In general, for all \(0 \le q \le 1\), we have \(f(n) \in O(n)\).]
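The best, worst, and average cases can all be observed by running the search and counting the entries examined. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the array is taken to be \(1, 2, \ldots, n\), and exact fractions are used so that the average can be compared with the formula without rounding.

```python
from fractions import Fraction


def linear_search(key: int, a: list[int]) -> tuple[int, int]:
    """Return (location, examined): location is 1-based, or 0 if key is absent;
    examined is the number of array entries compared with key."""
    examined = 0
    for position, entry in enumerate(a, start=1):
        examined += 1
        if entry == key:
            return position, examined
    return 0, examined


def average_examined(n: int, q: Fraction) -> Fraction:
    """Weighted average over the n 'found at position i' cases and the 'absent' case."""
    a = list(range(1, n + 1))                 # n distinct entries
    p = (1 - q) / n
    total = sum(p * linear_search(key, a)[1] for key in a)
    total += q * linear_search(0, a)[1]       # 0 is not in the array
    return total


n = 10
print(linear_search(1, list(range(1, n + 1))))    # (1, 1): best case
print(linear_search(n, list(range(1, n + 1))))    # (10, 10): worst case
print(average_examined(n, Fraction(0)))           # 11/2
print(average_examined(n, Fraction(1, 2)))        # 31/4
```

With \(n = 10\) the last two lines reproduce \((n + 1)/2\) and \((n + 1)/4 + n/2\), the values obtained above for \(q = 0\) and \(q = 1/2\).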

EXAMPLE 5.71†
The result in Example 5.70 for the average number of array elements examined in the linear search algorithm may also be calculated using the idea of the random variable. When the algorithm is applied to the array \(a_1, a_2, a_3, \ldots, a_n\) (of \(n\) distinct integers), we let the discrete random variable \(X\) count the number of array elements examined in the search for the integer key. Here the sample space can be considered as \(\{1, 2, 3, \ldots, n, n^*\}\), where for \(1 \le i \le n\), we have the case where key is found to be \(a_i\), so that the \(i\) elements \(a_1, a_2, a_3, \ldots, a_i\) have been examined. The entry \(n^*\) denotes the situation where all \(n\) elements are examined but key is not found among any of the array elements \(a_1, a_2, a_3, \ldots, a_n\).

Once again we assume that each array entry has the same probability \(p\) of containing the value key and that \(q\) is the probability that key is not in the array. Then \(np + q = 1\) and we have \(Pr(X = i) = p\), for \(1 \le i \le n\), and \(Pr(X = n^*) = q\). Consequently, the average number of array elements examined during the execution of the linear search algorithm is

\[
E(X) = \sum_{i=1}^{n} i \, Pr(X = i) + n \, Pr(X = n^*) = \sum_{i=1}^{n} ip + nq = p(1 + 2 + 3 + \cdots + n) + nq = \frac{pn(n + 1)}{2} + nq.
\]

† This example uses the concept of the discrete random variable which was introduced in the optional material in Section 3.7. It may be skipped without loss of continuity.
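The expected value \(E(X)\) can also be estimated by simulation. The sketch below is a Monte Carlo illustration in Python, not part of the original argument; the function name, the number of trials, and the fixed seed are all assumptions chosen for reproducibility.

```python
import random


def simulate_mean(n: int, q: float, trials: int, seed: int = 0) -> float:
    """Estimate E(X) by sampling: with probability q the key is absent (X = n),
    otherwise it is equally likely to sit in any of the n positions (X = position)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        if rng.random() < q:
            total += n                      # outcome n*: all n entries examined
        else:
            total += rng.randint(1, n)      # outcome i: i entries examined
    return total / trials


n, q = 10, 0.5
p = (1 - q) / n
exact = p * n * (n + 1) / 2 + n * q         # the closed form from the text
print(exact, simulate_mean(n, q, trials=100_000))
```

The estimate should lie close to the closed-form value, with the gap shrinking as the number of trials grows.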

Early in the discussion of the previous section, we mentioned how we might want to compare two algorithms that both correctly solve a given type of problem. Such a comparison can be accomplished by using the time-complexity functions for the algorithms. We demonstrate this in the next two examples.

EXAMPLE 5.72
The algorithm implemented in the pseudocode procedure of Fig. 5.14 outputs the value of \(a^n\) for the input \(a\), \(n\), where \(a\) is a real number and \(n\) is a positive integer. The real variable x is initialized as 1.0 and then used to store the values \(a, a^2, a^3, \ldots, a^n\) during execution of the for loop. Here we define the time-complexity function \(f(n)\) for the algorithm as the number of multiplications that occur in the for loop. Consequently, we have \(f(n) = n \in O(n)\).

procedure Power1(a: real; n: positive integer)
begin
  x := 1.0
  for i := 1 to n do
    x := x * a
end

Figure 5.14

EXAMPLE 5.73
In Fig. 5.15 we have a second pseudocode procedure for evaluating \(a^n\) for all \(a \in \mathbf{R}\), \(n \in \mathbf{Z}^+\). Recall that \(\lfloor i/2 \rfloor\) is the greatest integer in (or the floor of) \(i/2\).

procedure Power2(a: real; n: positive integer)
begin
  x := 1.0
  i := n
  while i > 0 do
    begin
      if i ≠ 2 * ⌊i/2⌋ then      {i is odd}
        x := x * a
      i := ⌊i/2⌋
      if i > 0 then
        a := a * a
    end
end

Figure 5.15

For this procedure the real variable x is initialized as 1.0 and then used to store the appropriate powers of \(a\) until it contains the value of \(a^n\). The results shown in Fig. 5.16 demonstrate what is happening to x (and a) for the cases where \(n = 7\) and 8. The numbers 1, 2, 3, and 4 indicate the first, second, third, and fourth times the statements in the while loop (in particular, the statement i := ⌊i/2⌋) are executed. If \(n = 7\), then because \(2^2 < 7 < 2^3\), we have \(2 < \log_2 7 < 3\). Here the while loop is executed three times and

\[ 3 = \lfloor \log_2 7 \rfloor + 1 < \log_2 7 + 1, \]

where \(\lfloor \log_2 7 \rfloor\) denotes the greatest integer in \(\log_2 7\), which is 2. Also, when \(n = 8\), the number of times the while loop is executed is

\[ 4 = \lfloor \log_2 8 \rfloor + 1 = \log_2 8 + 1, \]

since \(\log_2 8 = 3\).

n = 7

      x := 1.0
      i := 7

      x := x * a        {x = a}
  1   i := 3
      a := a * a

      x := x * a        {x = a^3}
  2   i := 1
      a := a * a

      x := x * a        {x = a^7}
  3   i := 0

      [x = a^7 = a · a^2 · a^4]

n = 8

      x := 1.0
      i := 8

  1   i := 4
      a := a * a

  2   i := 2
      a := a * a

  3   i := 1
      a := a * a

      x := x * a        {x = a^8}
  4   i := 0

      [x = a^8 = (((a)^2)^2)^2]

Figure 5.16

We shall define the time-complexity function \(g(n)\) for (the implementation of) this exponentiation algorithm as the number of times the while loop is executed. This is also the number of times the statement i := ⌊i/2⌋ is executed. (Here we assume that the time interval for the computation of each \(\lfloor i/2 \rfloor\) is independent of the magnitude of \(i\).) On the basis of the foregoing two observations, we want to establish that for all \(n \ge 1\), \(g(n) \le \log_2 n + 1 \in O(\log_2 n)\). We shall establish this by the Principle of Mathematical Induction (the alternative form, Theorem 4.2) on the value of \(n\).

When \(n = 1\), we see in Fig. 5.15 that \(i\) is odd, x is assigned the value of \(a = a^1\), and \(a^1\) is determined after only \(1 = \log_2 1 + 1\) execution of the while loop. So \(g(1) = 1 \le \log_2 1 + 1\).

Now assume that for all \(1 \le n \le k\), \(g(n) \le \log_2 n + 1\). Then for \(n = k + 1\), during the first pass through the while loop the value of \(i\) is changed to \(\left\lfloor \frac{k+1}{2} \right\rfloor\). Since \(1 \le \left\lfloor \frac{k+1}{2} \right\rfloor \le k\), by the induction hypothesis we shall execute the while loop \(g\left(\left\lfloor \frac{k+1}{2} \right\rfloor\right)\) more times, where \(g\left(\left\lfloor \frac{k+1}{2} \right\rfloor\right) \le \log_2 \left\lfloor \frac{k+1}{2} \right\rfloor + 1\).

Therefore

\[
\begin{aligned}
g(k + 1) &\le 1 + \left[\log_2 \left\lfloor \frac{k+1}{2} \right\rfloor + 1\right] \le 1 + \left[\log_2 \left(\frac{k+1}{2}\right) + 1\right] \\
&= 1 + [\log_2(k + 1) - \log_2 2 + 1] = \log_2(k + 1) + 1.
\end{aligned}
\]

For the time-complexity function of Example 5.72, we found that \(f(n) \in O(n)\). Here we have \(g(n) \in O(\log_2 n)\). It can be verified that \(g\) is dominated by \(f\) but \(f\) is not dominated by \(g\). Therefore, for large \(n\), this second algorithm is considered more efficient than the first algorithm (of Example 5.72). (However, note how much easier the pseudocode in Fig. 5.14 is than that of the procedure in Fig. 5.15.)
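The two exponentiation procedures can be compared by running them side by side and counting loop executions. The Python sketch below translates Figs. 5.14 and 5.15 directly; the counters `mults` and `passes` are additions made for illustration, and Python's `int.bit_length` is used as a convenient way to obtain \(\lfloor \log_2 n \rfloor + 1\).

```python
import math


def power1(a: float, n: int) -> tuple[float, int]:
    """Repeated multiplication (Fig. 5.14); returns (a**n, multiplications in the loop)."""
    x = 1.0
    mults = 0
    for _ in range(n):
        x = x * a
        mults += 1
    return x, mults


def power2(a: float, n: int) -> tuple[float, int]:
    """Repeated squaring (Fig. 5.15); returns (a**n, executions of the while loop)."""
    x = 1.0
    i = n
    passes = 0
    while i > 0:
        passes += 1
        if i % 2 == 1:        # i is odd
            x = x * a
        i = i // 2
        if i > 0:
            a = a * a
    return x, passes


print(power1(2.0, 7), power2(2.0, 7))   # (128.0, 7) (128.0, 3)
print(power1(2.0, 8), power2(2.0, 8))   # (256.0, 8) (256.0, 4)
for n in range(1, 5001):
    _, passes = power2(1.0, n)
    assert passes == n.bit_length()          # floor(log2 n) + 1
    assert passes <= math.log2(n) + 1        # the bound proved by induction
```

The loop counts 3 and 4 for \(n = 7\) and \(n = 8\) match the trace in Fig. 5.16, and the assertions check the bound \(g(n) \le \log_2 n + 1\) over a finite range only.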

In closing this section, we shall summarize what we have learned by making the following
observations.

1) The results we established in Examples 5.69, 5.70, 5.72, and 5.73 are useful when we are dealing with moderate to large values of $n$. For small values of $n$, such considerations about time-complexity functions have little purpose.
2) Suppose that algorithms $A_1$ and $A_2$ have time-complexity functions $f(n)$ and $g(n)$, respectively, where $f(n) \in O(n)$ and $g(n) \in O(n^2)$. We must be cautious here. We might expect an algorithm with linear complexity to be “perhaps more efficient” than one with quadratic complexity. But we really need more information. If $f(n) = 1000n$ and $g(n) = n^2$, then algorithm $A_2$ is fine until the problem size $n$ exceeds 1000. If the problem size is such that we never exceed 1000, then algorithm $A_2$ is the better choice. However, as we mentioned in observation 1, as $n$ grows larger, the algorithm of linear complexity becomes the better alternative.
3) In Fig. 5.17 we have graphed a log-linear plot for the functions associated with some of the orders given in Table 5.11. [Here we have replaced the (discrete) integer variable $n$ by the (continuous) real variable $n$.] This should help us to develop some feeling for their relative growth rates (especially for large values of $n$).
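The crossover described in observation 2 can be located directly. The short Python sketch below uses the two functions named there, $f(n) = 1000n$ and $g(n) = n^2$; the helper name and the search limit are our own choices for illustration.

```python
def f(n):
    """Operation count of the linear algorithm A_1 in observation 2."""
    return 1000 * n


def g(n):
    """Operation count of the quadratic algorithm A_2 in observation 2."""
    return n * n


def first_n_where_linear_wins(limit=10_000):
    """Smallest n in 1..limit with f(n) < g(n), or None if there is none."""
    for n in range(1, limit + 1):
        if f(n) < g(n):
            return n
    return None


if __name__ == "__main__":
    print(first_n_where_linear_wins())
```

For every $n$ below 1000 the quadratic algorithm performs fewer operations, the two counts agree at $n = 1000$, and the linear algorithm is ahead from $n = 1001$ on.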

Figure 5.17  [log-linear plot of $f(n)$; curve labels include $f(n) = \log n$]

The data in Table 5.12 provide estimates of the running times of algorithms for certain orders of complexity. Here we have the problem sizes $n = 2$, 16, and 64, and we assume that the computer can perform one operation every $10^{-6}$ second = 1 microsecond (on the average). The entries in the table then estimate the running times in microseconds. For example, when the problem size is 16 and the order of complexity is $n \log_2 n$, then the running time is a very brief $16 \log_2 16 = 16 \cdot 4 = 64$ microseconds; for the order of complexity $2^n$, the running time is $6.5 \times 10^4$ microseconds = 0.065 seconds. Since both of these time intervals are so short, it is difficult for a human to observe much of a difference in execution times. Results appear to be instantaneous in either case.

Table 5.12

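Order of Complexity

Problem size $n$ | $\log_2 n$ | $n$ | $n \log_2 n$ | $n^2$ | $2^n$                 | $n!$
2                | 1          | 2   | 2            | 4     | 4                     | 2
16               | 4          | 16  | 64           | 256   | $6.5 \times 10^4$     | $2.1 \times 10^{13}$
64               | 6          | 64  | 384          | 4096  | $1.84 \times 10^{19}$ | $> 10^{89}$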
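The entries of Table 5.12 can be recomputed with a few lines of Python. This is a minimal sketch under the table's own assumption of one operation per microsecond; the function name and the dictionary keys are ours.

```python
import math


def running_times_us(n):
    """Estimated running times in microseconds for problem size n,
    one entry for each order of complexity listed in Table 5.12."""
    return {
        "log2 n": math.log2(n),
        "n": n,
        "n log2 n": n * math.log2(n),
        "n^2": n ** 2,
        "2^n": 2 ** n,
        "n!": math.factorial(n),
    }


if __name__ == "__main__":
    for n in (2, 16, 64):
        row = running_times_us(n)
        # Three significant digits are enough to compare with the table.
        print(n, {order: f"{value:.3g}" for order, value in row.items()})
```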

However, such estimates can grow rather rapidly. For instance, suppose we run a program
                                for which the input is an array A of n different integers. The results from this program are
                                generated in two parts:

1) First the program implements an algorithm that determines the subsets of $A$ of size 1. There are $n$ such subsets.
2) Then a second algorithm is implemented to determine all the subsets of $A$. There are $2^n$ such subsets.

Let us assume that we have a computer that can determine each subset of $A$ in a microsecond. For the case where $|A| = 64$, the first part of the output is executed almost instantaneously, in approximately 64 microseconds. For the second part, however, Table 5.12 indicates that the amount of time needed to determine all the subsets of $A$ will be about $1.84 \times 10^{19}$ microseconds. We cannot be too content with this result, however, since
$$1.84 \times 10^{19} \text{ microseconds} \approx 2.14 \times 10^{8} \text{ days} \approx 5845 \text{ centuries}.$$
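The unit conversion above is easy to verify. In the following Python sketch the function name is ours, and the century length of 36,525 days (a 365.25-day year) is our assumption; it is the value that reproduces the figure of 5845 centuries.

```python
MICROSECONDS_PER_DAY = 24 * 60 * 60 * 10 ** 6
DAYS_PER_CENTURY = 36_525  # assumption: 100 years of 365.25 days each


def subset_enumeration_time(n):
    """Time needed to list all 2**n subsets at one subset per microsecond.

    Returns (microseconds, days, centuries).
    """
    microseconds = 2 ** n
    days = microseconds / MICROSECONDS_PER_DAY
    centuries = days / DAYS_PER_CENTURY
    return microseconds, days, centuries


if __name__ == "__main__":
    us, days, centuries = subset_enumeration_time(64)
    print(f"{us:.3g} microseconds = {days:.3g} days = {centuries:.0f} centuries")
```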

EXERCISES 5.8

1. In each of the following pseudocode program segments, the integer variables i, j, n, and sum are declared earlier in the program. The value of n (a positive integer) is supplied by the user prior to execution of the segment. In each case we define the time-complexity function $f(n)$ to be the number of times the statement sum := sum + 1 is executed. Determine the best “big-Oh” form for $f$.

   a) begin
         sum := 0
         for i := 1 to n do
            for j := 1 to n do
               sum := sum + 1
      end

   b) begin
         sum := 0
         for i := 1 to n do
            for j := 1 to n * n do
               sum := sum + 1
      end

   c) begin
         sum := 0
         for i := 1 to n do
            for j := i to n do
               sum := sum + 1
      end

   d) begin
         sum := 0
         i := n
         while i > 0 do
            begin
               sum := sum + 1
               i := ⌊i/2⌋
            end
      end

   e) begin
         sum := 0
         for i := 1 to n do
            begin
               j := n
               while j > 0 do
                  begin
                     sum := sum + 1
                     j := ⌊j/2⌋
                  end
            end
      end

2. The following pseudocode procedure implements an algorithm for determining the maximum value in an array $a_1, a_2, a_3, \ldots, a_n$ of integers. Here $n \ge 2$ and the entries in the array need not be distinct.

      procedure Maximum (n: integer; a_1, a_2, a_3, ..., a_n: integers)
      begin
         max := a_1
         for i := 2 to n do
            if a_i > max then
               max := a_i
      end

   a) If the worst-case complexity function $f(n)$ for this segment is determined by the number of times the comparison $a_i > max$ is executed, find the appropriate “big-Oh” form for $f$.
   b) What can we say about the best-case and average-case complexities for this implementation?

3. a) Write a computer program (or develop an algorithm) to locate the first occurrence of the maximum value in an array $a_1, a_2, a_3, \ldots, a_n$ of integers. (Here $n \in \mathbf{Z}^+$ and the entries in the array need not be distinct.)
   b) Determine the worst-case complexity function for the implementation developed in part (a).

4. a) Write a computer program (or develop an algorithm) to determine the minimum and maximum values in an array $a_1, a_2, a_3, \ldots, a_n$ of integers. (Here $n \in \mathbf{Z}^+$ with $n \ge 2$, and the entries in the array need not be distinct.)
   b) Determine the worst-case complexity function for the implementation developed in part (a).

5. The following pseudocode procedure can be used to evaluate the polynomial
      $$8 - 10x + 7x^2 - 2x^3 + 3x^4 + 12x^5,$$
   when $x$ is replaced by an arbitrary (but fixed) real number $r$. For this particular instance, $n = 5$ and $a_0 = 8$, $a_1 = -10$, $a_2 = 7$, $a_3 = -2$, $a_4 = 3$, and $a_5 = 12$.

      procedure PolynomialEvaluation1 (n: nonnegative integer; r, a_0, a_1, a_2, ..., a_n: real)
      begin
         product := 1.0
         value := a_0
         for i := 1 to n do
            begin
               product := product * r
               value := value + a_i * product
            end
      end

   a) How many additions take place in the evaluation of the given polynomial? (Do not include the $n - 1$ additions needed to increment the loop variable $i$.) How many multiplications?
   b) Answer the questions in part (a) for the general polynomial
      $$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_{n-1} x^{n-1} + a_n x^n,$$
   where $a_0, a_1, a_2, a_3, \ldots, a_{n-1}, a_n$ are real numbers and $n$ is a positive integer.

6. We first note how the polynomial in the previous exercise can be written in the nested multiplication method:
      $$8 + x(-10 + x(7 + x(-2 + x(3 + 12x)))).$$
   Using this representation, the following pseudocode procedure (implementing Horner’s method) can be used to evaluate the given polynomial.

      procedure PolynomialEvaluation2 (n: nonnegative integer; r, a_0, a_1, a_2, ..., a_n: real)
      begin
         value := a_n
         for j := n - 1 downto 0 do
            value := a_j + r * value
      end

   Answer the questions in parts (a) and (b) of Exercise 5 for the new procedure given here.

7. Let $a_1, a_2, a_3, \ldots$ be the integer sequence defined recursively by
      1) $a_1 = 0$; and
      2) For $n > 1$, $a_n = 1 + a_{\lfloor n/2 \rfloor}$.
   Prove that $a_n = \lfloor \log_2 n \rfloor$ for all $n \in \mathbf{Z}^+$.

8. Let $a_1, a_2, a_3, \ldots$ be the integer sequence defined recursively by
      1) $a_1 = 0$; and
      2) For $n > 1$, $a_n = 1 + a_{\lceil n/2 \rceil}$.
   Find an explicit formula for $a_n$ and prove that your formula is correct.

9. Suppose the probability that the integer key is in the array $a_1, a_2, a_3, \ldots, a_n$ (of $n$ distinct integers) is 3/4 and that each array element has the same probability of containing this value. If the linear search algorithm of Example 5.70 is applied to this array and value of key, what is the average number of array elements that are examined?

10. When the linear search algorithm is applied to the array $a_1, a_2, a_3, \ldots, a_n$ (of $n$ distinct integers) for the integer key, suppose the probability that key has the value $a_i$ is $i/[n(n + 1)]$, for $1 \le i \le n$. Under these circumstances, what is the average number of array elements examined?

11. a) Write a computer program (or develop an algorithm) to determine the location of the first entry in an array $a_1, a_2, a_3, \ldots, a_n$ of integers that repeats a previous entry in the array.
    b) Determine the worst-case complexity for the implementation developed in part (a).

12. a) Write a computer program (or develop an algorithm) to determine the location of the first entry $a_i$ in an array $a_1, a_2, a_3, \ldots, a_n$ of integers, where $a_i < a_{i-1}$.
    b) Determine the worst-case complexity for the implementation developed in part (a).

5.9 Summary and Historical Review
                                In this chapter we developed the function concept, which is of great importance in all areas
                                of mathematics. Although we were primarily concerned with finite functions, the definition
                                applies equally well to infinite sets and includes the functions of trigonometry and calculus.
                                However, we did emphasize the role of a finite function when we transformed a finite set
                                into a finite set. In this setting, computer output (that terminates) can be thought of as a
                                function of computer input, and a compiler can be regarded as a function that transforms a
                                (source) program into a set of machine-language instructions (object program).
                                    The actual word function, in its Latin form, was introduced in 1694 by Gottfried Wil-
                                helm Leibniz (1646-1716) to denote a quantity associated with a curve (such as the slope
                                of the curve or the coordinates of a point of the curve). By 1718, under the direction of
                                Johann Bernoulli (1667-1748), a function was regarded as an algebraic expression made
                                up of constants and a variable. Equations or formulas involving constants and variables

Gottfried Wilhelm Leibniz (1646-1716)

came later with Leonhard Euler (1707-1783). His is the definition of “function” generally
found in high school mathematics. Also, in about 1734, we find in the work of Euler and
Alexis Clairaut (1713-1765) the notation f(x), which is still in use today.
    Euler’s idea remained intact until the time of Jean Baptiste Joseph Fourier (1768-1830),
who found the need for a more general type of function in his investigation of trigonometric
series. In 1837, Peter Gustav Lejeune Dirichlet (1805-1859) set down a more rigorous
formulation of the concepts of variable, function, and the correspondence between the
independent variable x and the dependent variable y, when y = f(x). Dirichlet’s work
emphasized the relationship between two sets of numbers and did not call for the existence
of a formula or expression connecting the two sets. With the developments in set theory
during the nineteenth and twentieth centuries came the generalization of the function as a
particular type of relation.

Peter Gustav Lejeune Dirichlet (1805-1859)

In addition to his fundamental work on the definition of a function, Dirichlet was also
quite active in applied mathematics and in number theory, where he found need for, and
was the first to formally state, the pigeonhole principle. Consequently, this principle is
sometimes referred to as the Dirichlet drawer principle or the Dirichlet box principle.
    The nineteenth and twentieth centuries saw the use of the special function, one-to-one
correspondence, in the study of the infinite. In about 1888, Richard Dedekind (1831-1916)
defined an infinite set as one that can be placed into a one-to-one correspondence with a
proper subset of itself. [Galileo (1564-1642) had observed this for the set Z⁺.] Two infinite
sets that could be placed in a one-to-one correspondence with each other were said to have
the same transfinite cardinal number. In a series of articles, Georg Cantor (1845-1918)
developed the idea of levels of infinity and showed that |Z| = |Q| but |Z| < |R|. A set A
with |A| = |Z| is called countable, or denumerable, and we write |Z| = ℵ₀ as Cantor did,
using the Hebrew letter aleph, with the subscripted 0, to denote the first level of infinity. To
show that |Z| < |R|, or that the real numbers were uncountable, Cantor devised a technique
now referred to as the Cantor diagonal method. (More about the theory of countable and
uncountable sets can be found in Appendix 3.)
    The Stirling numbers of the second kind (in Section 5.3) are named in honor of James
Stirling (1692-1770), a pioneer in the development of generating functions, a topic we
will investigate later in the text. These numbers appear in his work Methodus Differentialis,
published in London in 1730. Stirling was an associate of Sir Isaac Newton (1642-1727) and was using the Maclaurin series in his work 25 years before Colin Maclaurin (1698-1746). However, although his name is not attached to this series, it appears in the approximation known as Stirling’s formula: $n! \approx (2\pi n)^{1/2} e^{-n} n^n$, which, as justice would have it, was actually developed by Abraham DeMoivre (1667-1754).
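    To get a feeling for the quality of Stirling’s formula, we can compare it with the exact value of $n!$ for a few values of $n$. The Python sketch below is illustrative only; the function name is ours, and the floating-point arithmetic limits it to moderate values of $n$ (roughly $n \le 140$).

```python
import math


def stirling(n):
    """Stirling's approximation (2*pi*n)**(1/2) * e**(-n) * n**n for n!."""
    return math.sqrt(2 * math.pi * n) * math.exp(-n) * n ** n


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        exact = math.factorial(n)
        ratio = stirling(n) / exact
        print(f"n = {n}: approximation / n! = {ratio:.5f}")
```

    The ratio of the approximation to $n!$ increases toward 1 as $n$ grows, so the relative error shrinks even though the absolute error does not.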
                            Using the counting principles developed in Section 5.3, the results in Table 5.13 extend
                       the ideas that were summarized in Table 1.11. Here we count the number of ways it is
                       possible to distribute m objects into n containers, under the conditions prescribed in the
                       first three columns of the table. (The cases wherein neither the objects nor the containers
                       are distinct will be covered in Chapter 9.)

Table 5.13

Objects Are Distinct | Containers Are Distinct | Some Container(s) May Be Empty | Number of Distributions
Yes                  | Yes                     | Yes                            | $n^m$
Yes                  | Yes                     | No                             | $n!\,S(m, n)$
Yes                  | No                      | Yes                            | $S(m, 1) + S(m, 2) + \cdots + S(m, n)$
Yes                  | No                      | No                             | $S(m, n)$
No                   | Yes                     | Yes                            | $\binom{n+m-1}{m}$
No                   | Yes                     | No                             | $\binom{n+(m-n)-1}{m-n} = \binom{m-1}{m-n} = \binom{m-1}{n-1}$
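The six rows of Table 5.13 translate directly into a small function. The following Python sketch is offered only as an illustration: the names `stirling2` and `distributions` are ours, we assume $m, n \ge 1$, and $S(m, n)$ is computed from the alternating-sum formula for the number of onto functions divided by $n!$.

```python
from math import comb, factorial


def stirling2(m, n):
    """Stirling number of the second kind S(m, n).

    Uses S(m, n) = (1/n!) * sum_{k=0}^{n} (-1)^k C(n, k) (n - k)^m.
    """
    onto = sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))
    return onto // factorial(n)


def distributions(m, n, objects_distinct, containers_distinct, empty_allowed):
    """Number of ways to distribute m objects into n containers (m, n >= 1),
    following the six rows of Table 5.13."""
    if objects_distinct and containers_distinct:
        return n ** m if empty_allowed else factorial(n) * stirling2(m, n)
    if objects_distinct:
        if empty_allowed:
            return sum(stirling2(m, k) for k in range(1, n + 1))
        return stirling2(m, n)
    if containers_distinct:
        if empty_allowed:
            return comb(n + m - 1, m)
        return comb(m - 1, n - 1) if m >= n else 0
    # Identical objects and identical containers are not in the table.
    raise NotImplementedError("this case is deferred to Chapter 9")


if __name__ == "__main__":
    for case in [(True, True, True), (True, True, False), (True, False, True),
                 (True, False, False), (False, True, True), (False, True, False)]:
        print(case, distributions(4, 3, *case))
```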

Finally, the “big-Oh” notation of Section 5.7 was introduced by Paul Gustav Heinrich
                       Bachmann (1837-1920) in his book Analytische Zahlentheorie, an important work on num-
                       ber theory, published in 1892. This notation has become prominent in approximation theory,
                       in such areas as numerical analysis and the analysis of algorithms. In general, the notation
                       f € O(g) denotes that we do not know the function f explicitly but do know an upper
                       bound on its order of magnitude. The “big-Oh” symbol is sometimes referred to as the Lan-
                       dau symbol, in honor of Edmund Landau (1877-1938), who used this notation throughout
                       his work.
                           Further properties of the Stirling numbers of the second kind are given in Chapter 4
                       of D. I. A. Cohen    [3] and in Chapter 6 of the text by R. L. Graham,             D. E. Knuth,   and
                       O. Patashnik [7]. The article by D. J. Velleman and G. S. Call [11] provides a very interesting
                       introduction to the Stirling numbers of the second kind —as well as the Eulerian numbers
                       introduced in Example 4.21. For more on infinite sets and the work of Georg Cantor,
                       consult Chapter 8 of H. Eves and C. V. Newsom [6] or Chapter IV of R. L. Wilder [12].
                       The presentation in the book by J. W. Dauben [5] covers the controversy surrounding set
                       theory at the turn of the century and shows how certain aspects of Cantor’s personal life
                       played such an integral part in his understanding and defense of set theory.
                          More examples that demonstrate how to apply the pigeonhole principle are given in
                       the articles by K. R. Rebman    [9] and A. Soifer and E. Lozansky      [10]. Further results and
extensions on problems arising from the principle are covered in the article by D. S. Clark
                                 and J. T. Lewis [2]. During the twentieth century a great deal of research has been de-
                                 voted to generalizations of the pigeonhole principle, culminating in the subject of Ramsey
                                 theory, named for Frank Plumpton Ramsey (1903-1930). An interesting introduction to
                                 Ramsey theory can be found in Chapter 5 of D. I. A. Cohen [3]. The text by R. L. Graham,
                                 B. L. Rothschild, and J. H. Spencer [8] provides further worthwhile information.
                                    Extensive coverage on the topic of relational data bases can be found in the work of
                                 C. J. Date [4]. Finally, the text by S. Baase and A. Van Gelder [1] is an excellent place to
                                 continue the study of the analysis of algorithms.

REFERENCES

1. Baase, Sara, and Van Gelder, Allen. Computer Algorithms: Introduction to Design & Analysis, 3rd ed. Reading, Mass.: Addison-Wesley, 2000.
2. Clark, Dean S., and Lewis, James T. “Herbert and the Hungarian Mathematician: Avoiding Certain Subsequence Sums.” The College Mathematics Journal 21 (March 1990): pp. 100-104.
3. Cohen, Daniel I. A. Basic Techniques of Combinatorial Theory. New York: Wiley, 1978.
4. Date, C. J. An Introduction to Database Systems, 7th ed. Boston, Mass.: Addison-Wesley, 2002.
5. Dauben, Joseph Warren. Georg Cantor: His Mathematics and Philosophy of the Infinite. Lawrenceville, N. J.: Princeton University Press, 1990.
6. Eves, Howard, and Newsom, Carroll V. An Introduction to the Foundations and Fundamental Concepts of Mathematics, rev. ed. New York: Holt, 1965.
7. Graham, Ronald L., Knuth, Donald E., and Patashnik, Oren. Concrete Mathematics, 2nd ed. Reading, Mass.: Addison-Wesley, 1994.
8. Graham, Ronald L., Rothschild, Bruce L., and Spencer, Joel H. Ramsey Theory, 2nd ed. New York: Wiley, 1980.
9. Rebman, Kenneth R. “The Pigeonhole Principle (What it is, how it works, and how it applies to map coloring).” The Two-Year College Mathematics Journal, vol. 10, no. 1 (January 1979): pp. 3-13.
10. Soifer, Alexander, and Lozansky, Edward. “Pigeons in Every Pigeonhole.” Quantum (January 1990): pp. 25-26, 32.
11. Velleman, Daniel J., and Call, Gregory S. “Permutations and Combination Locks.” Mathematics Magazine 68 (October 1995): pp. 243-253.
12. Wilder, Raymond L. Introduction to the Foundations of Mathematics, 2nd ed. New York: Wiley, 1965.

SUPPLEMENTARY EXERCISES

1. Let $A, B \subseteq \mathcal{U}$. Prove that
    a) $(A \times B) \cap (B \times A) = (A \cap B) \times (A \cap B)$; and
    b) $(A \times B) \cup (B \times A) \subseteq (A \cup B) \times (A \cup B)$.

2. Determine whether each of the following statements is true or false. For each false statement give a counterexample.
    a) If $f\colon A \to B$ and $(a, b), (a, c) \in f$, then $b = c$.
    b) If $f\colon A \to B$ is a one-to-one correspondence and $A$, $B$ are finite, then $A = B$.
    c) If $f\colon A \to B$ is one-to-one, then $f$ is invertible.
    d) If $f\colon A \to B$ is invertible, then $f$ is one-to-one.
    e) If $f\colon A \to B$ is one-to-one and $g, h\colon B \to C$ with $g \circ f = h \circ f$, then $g = h$.
    f) If $f\colon A \to B$ and $A_1, A_2 \subseteq A$, then $f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$.
    g) If $f\colon A \to B$ and $B_1, B_2 \subseteq B$, then $f^{-1}(B_1 \cap B_2) = f^{-1}(B_1) \cap f^{-1}(B_2)$.

3. Let f: RR      where f(ab)=af(b)+ bf(a), for all                 the chores be assigned if Thomas, as the eldest, must mow the
a,b ER. (a) What is f(1)? (b) What is f(0)? (c) Ifne Zt,             lawn (one of the ten weekly chores) and no one is allowed to
a €R, prove that f(a") = na" f(a).                                   be idle?
4. Let A, B CN with 1 < |A| < |B|. If there are 262,144 re-         17. Letn €N, n > 2. Show that S(n, 2) = 2”-! — 1.
lations from A to B, determine all possibilities for | A| and |B.    18. Mrs. Blasi has five sons (Michael, Rick, David, Kenneth,
  5. If U,, UW, are universal sets with A, BC U,, and C, DC          and Donald) who enjoy reading books about sports. With Christ-
Us, prove that (AM B) X (CN D) = (A X C)N(B X D).                    mas approaching, she visits a bookstore where she finds 12 dif-
6. Let A = {1, 2,3, 4,5}       and B= (1, 2, 3,4, 5, 6}. How        ferent books on sports.
many one-to-one functions $f: A \to B$ satisfy (a) $f(1) = 3$? (b) $f(1) = 3$, $f(2) = 6$?

7. Determine all real numbers $x$ for which
$$x^2 - |x| = 1/2.$$

8. Let $\mathcal{R} \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+$ be the relation given by the following recursive definition.
      1) $(1, 1) \in \mathcal{R}$; and
      2) For all $(a, b) \in \mathcal{R}$, the three ordered pairs $(a + 1, b)$, $(a + 1, b + 1)$, and $(a + 1, b + 2)$ are also in $\mathcal{R}$.
Prove that $2a \ge b$ for all $(a, b) \in \mathcal{R}$.

9. Let $a, b$ denote fixed real numbers and suppose that $f: \mathbf{R} \to \mathbf{R}$ is defined by $f(x) = a(x + b) - b$, $x \in \mathbf{R}$. (a) Determine $f^2(x)$ and $f^3(x)$. (b) Conjecture a formula for $f^n(x)$, where $n \in \mathbf{Z}^+$. Now establish the validity of your conjecture.

10. Let $A_1$, $A$ and $B$ be sets with $\{1, 2, 3, 4, 5\} = A_1 \subset A$, $B = \{s, t, u, v, w, x\}$, and $f: A_1 \to B$. If $f$ can be extended to $A$ in 216 ways, what is $|A|$?

11. Let $A = \{1, 2, 3, 4, 5\}$ and $B = \{t, u, v, w, x, y, z\}$. (a) If a function $f: A \to B$ is randomly generated, what is the probability that it is one-to-one? (b) Write a computer program (or develop an algorithm) to generate random functions $f: A \to B$ and have the program print out how many functions it generates until it generates one that is one-to-one.

12. Let $S$ be a set of seven positive integers the maximum of which is at most 24. Prove that the sums of the elements in all the nonempty subsets of $S$ cannot be distinct.

13. In a ten-day period Ms. Rosatone typed 84 letters to different clients. She typed 12 of these letters on the first day, seven on the second day, and three on the ninth day, and she finished the last eight on the tenth day. Show that for a period of three consecutive days Ms. Rosatone typed at least 25 letters.

14. If $\{x_1, x_2, \ldots, x_7\} \subseteq \mathbf{Z}^+$, show that for some $i \ne j$, either $x_i + x_j$ or $x_i - x_j$ is divisible by 10.

15. Let $n \in \mathbf{Z}^+$, $n$ odd. If $i_1, i_2, \ldots, i_n$ is a permutation of the integers $1, 2, \ldots, n$, prove that $(1 - i_1)(2 - i_2) \cdots (n - i_n)$ is an even integer. (Which counting principle is at work here?)

16. With both of their parents working, Thomas, Stuart, and Craig must handle ten weekly chores among themselves. (a) In how many ways can they divide up the work so that everyone is responsible for at least one chore? (b) In how many ways can

      a) In how many ways can she select nine of these books?
      b) Having made her purchase, in how many ways can she distribute the books among her sons so that each of them gets at least one book?
      c) Two of the nine books Mrs. Blasi purchased deal with basketball, Donald's favorite sport. In how many ways can she distribute the books among her sons so that Donald gets at least the two books on basketball?

19. Let $m, n \in \mathbf{Z}^+$ with $n \ge m$. (a) In how many ways can one distribute $n$ distinct objects among $m$ different containers with no container left empty? (b) In the expansion of $(x_1 + x_2 + \cdots + x_m)^n$, what is the sum of all the multinomial coefficients $\binom{n}{n_1, n_2, \ldots, n_m}$ where $n_1 + n_2 + \cdots + n_m = n$ and $n_i > 0$ for all $1 \le i \le m$?

20. If $n \in \mathbf{Z}^+$ with $n \ge 4$, verify that $S(n, n - 2) = \binom{n}{3} + 3\binom{n}{4}$.

21. If $f: A \to A$, prove that for all $m, n \in \mathbf{Z}^+$, $f^m \circ f^n = f^n \circ f^m$. (First let $m = 1$ and induct on $n$. Then induct on $m$. This technique is known as double induction.)

22. Let $f: X \to Y$, and for each $i \in I$, let $A_i \subseteq X$. Prove that
      a) $f\left(\bigcup_{i \in I} A_i\right) = \bigcup_{i \in I} f(A_i)$.
      b) $f\left(\bigcap_{i \in I} A_i\right) \subseteq \bigcap_{i \in I} f(A_i)$.
      c) $f\left(\bigcap_{i \in I} A_i\right) = \bigcap_{i \in I} f(A_i)$, for $f$ one-to-one.

23. Given a nonempty set $A$, let $f: A \to A$ and $g: A \to A$ where
$$f(a) = g(f(f(a))) \quad \text{and} \quad g(a) = f(g(f(a)))$$
for all $a$ in $A$. Prove that $f = g$.

24. Let $A$ be a set with $|A| = n$.
      a) How many closed binary operations are there on $A$?
      b) A closed ternary (3-ary) operation on $A$ is a function $f: A \times A \times A \to A$. How many closed ternary operations are there on $A$?
      c) A closed $k$-ary operation on $A$ is a function $f: A_1 \times A_2 \times \cdots \times A_k \to A$, where $A_i = A$, for all $1 \le i \le k$. How many closed $k$-ary operations are there on $A$?
      d) A closed $k$-ary operation for $A$ is called commutative if
$$f(a_1, a_2, \ldots, a_k) = f(\pi(a_1), \pi(a_2), \ldots, \pi(a_k)),$$
      where $a_1, a_2, \ldots, a_k \in A$ (repetitions allowed), and

      $\pi(a_1), \pi(a_2), \ldots, \pi(a_k)$ is any rearrangement of $a_1, a_2, \ldots, a_k$. How many of the closed $k$-ary operations on $A$ are commutative?

25. a) Let $S = \{2, 16, 128, 1024, 8192, 65536\}$. If four numbers are selected from $S$, prove that two of them must have the product 131072.
      b) Generalize the result in part (a).

26. If $\mathcal{U}$ is a universe and $A \subseteq \mathcal{U}$, we define the characteristic function of $A$ by $\chi_A: \mathcal{U} \to \{0, 1\}$, where
$$\chi_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \notin A \end{cases}$$
For sets $A, B \subseteq \mathcal{U}$, prove each of the following:
      a) $\chi_{A \cap B} = \chi_A \cdot \chi_B$, where $(\chi_A \cdot \chi_B)(x) = \chi_A(x) \cdot \chi_B(x)$
      b) $\chi_{A \cup B} = \chi_A + \chi_B - \chi_{A \cap B}$
      c) $\chi_{\overline{A}} = 1 - \chi_A$, where $(1 - \chi_A)(x) = 1(x) - \chi_A(x) = 1 - \chi_A(x)$
(For $\mathcal{U}$ finite, placing the elements of $\mathcal{U}$ in a fixed order results in a one-to-one correspondence between subsets $A$ of $\mathcal{U}$ and the arrays of 0's and 1's obtained as the images of $\mathcal{U}$ under $\chi_A$. These arrays can then be used for the computer storage and manipulation of certain subsets of $\mathcal{U}$.)

27. With $A = \{x, y, z\}$, let $f, g: A \to A$ be given by $f = \{(x, y), (y, z), (z, x)\}$, $g = \{(x, y), (y, x), (z, z)\}$. Determine each of the following: $f \circ g$, $g \circ f$, $f^{-1}$, $g^{-1}$, $(g \circ f)^{-1}$, $f^{-1} \circ g^{-1}$, and $g^{-1} \circ f^{-1}$.

28. a) If $f: \mathbf{R} \to \mathbf{R}$ is defined by $f(x) = 5x + 3$, find $f^{-1}(8)$.
      b) If $g: \mathbf{R} \to \mathbf{R}$, where $g(x) = |x^2 + 3x + 1|$, find $g^{-1}(1)$.
      c) For $h: \mathbf{R} \to \mathbf{R}$, given by
$$h(x) = \left|\frac{x}{x + 2}\right|,$$
      find $h^{-1}(4)$.

29. If $A = \{1, 2, 3, \ldots, 10\}$, how many functions $f: A \to A$ (simultaneously) satisfy $f^{-1}(\{1, 2, 3\}) = \emptyset$, $f^{-1}(\{4, 5\}) = \{1, 3, 7\}$, and $f^{-1}(\{8, 10\}) = \{8, 10\}$?

30. Let $f: A \to A$ be an invertible function. For $n \in \mathbf{Z}^+$ prove that $(f^n)^{-1} = (f^{-1})^n$. [This result can be used to define $f^{-n}$ as either $(f^n)^{-1}$ or $(f^{-1})^n$.]

31. In certain programming languages, the functions pred and succ (for predecessor and successor, respectively) are functions from $\mathbf{Z}$ to $\mathbf{Z}$ where $\text{pred}(x) = \pi(x) = x - 1$ and $\text{succ}(x) = \sigma(x) = x + 1$.
      a) Determine $(\pi \circ \sigma)(x)$, $(\sigma \circ \pi)(x)$.
      b) Determine $\pi^2$, $\pi^3$, $\pi^n$ $(n \ge 2)$, $\sigma^2$, $\sigma^3$, $\sigma^n$ $(n \ge 2)$.
      c) Determine $\pi^{-2}$, $\pi^{-3}$, $\pi^{-n}$ $(n \ge 2)$, $\sigma^{-2}$, $\sigma^{-3}$, $\sigma^{-n}$ $(n \ge 2)$, where, for example, $\sigma^{-2} = \sigma^{-1} \circ \sigma^{-1} = (\sigma \circ \sigma)^{-1} = (\sigma^2)^{-1}$. (See Supplementary Exercise 30.)

32. For $n \in \mathbf{Z}^+$, define $\tau: \mathbf{Z}^+ \to \mathbf{Z}^+$ by $\tau(n)$ = the number of positive-integer divisors of $n$.
      a) Let $n = p_1^{e_1} p_2^{e_2} p_3^{e_3} \cdots p_k^{e_k}$, where $p_1, p_2, p_3, \ldots, p_k$ are distinct primes and $e_i$ is a positive integer for all $1 \le i \le k$. What is $\tau(n)$?
      b) Determine the three smallest values of $n \in \mathbf{Z}^+$ for which $\tau(n) = k$, where $k = 2, 3, 4, 5, 6$.
      c) For all $k \in \mathbf{Z}^+$, $k > 1$, prove that $\tau^{-1}(k)$ is infinite.
      d) If $a, b \in \mathbf{Z}^+$ with $\gcd(a, b) = 1$, prove that $\tau(ab) = \tau(a)\tau(b)$.

33. a) How many subsets $A = \{a, b, c, d\} \subseteq \mathbf{Z}^+$, where $a, b, c, d > 1$, satisfy the property $a \cdot b \cdot c \cdot d = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19$?
      b) How many subsets $A = \{a_1, a_2, \ldots, a_m\} \subseteq \mathbf{Z}^+$, where $a_i > 1$, $1 \le i \le m$, satisfy the property $\prod_{i=1}^{m} a_i = \prod_{j=1}^{n} p_j$, where the $p_j$, $1 \le j \le n$, are distinct primes and $n \ge m$?

34. Give an example of a function $f: \mathbf{Z}^+ \to \mathbf{R}$ where $f \in O(1)$ and $f$ is one-to-one. (Hence $f$ is not constant.)

35. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ where
$$f(n) = \begin{cases} 2, & \text{for } n \text{ even} \\ 1, & \text{for } n \text{ odd} \end{cases} \qquad g(n) = \begin{cases} 3, & \text{for } n \text{ even} \\ 4, & \text{for } n \text{ odd} \end{cases}$$
Prove or disprove each of the following: (a) $f \in O(g)$; and (b) $g \in O(f)$.

36. For $f, g: \mathbf{Z}^+ \to \mathbf{R}$ we define $f + g: \mathbf{Z}^+ \to \mathbf{R}$ by $(f + g)(n) = f(n) + g(n)$, for $n \in \mathbf{Z}^+$. [Note: The plus sign in $f + g$ is for the addition of the functions $f$ and $g$, while the plus sign in $f(n) + g(n)$ is for the addition of the real numbers $f(n)$ and $g(n)$.]
      a) Let $f_1, g_1: \mathbf{Z}^+ \to \mathbf{R}$ with $f \in O(f_1)$ and $g \in O(g_1)$. If $f_1(n) \ge 0$, $g_1(n) \ge 0$, for all $n \in \mathbf{Z}^+$, prove that $(f + g) \in O(f_1 + g_1)$.
      b) If the conditions $f_1(n) \ge 0$, $g_1(n) \ge 0$, for all $n \in \mathbf{Z}^+$, are not satisfied, as in part (a), provide a counterexample to show that
$$f \in O(f_1),\ g \in O(g_1) \not\Rightarrow (f + g) \in O(f_1 + g_1).$$

37. Let $a, b \in \mathbf{R}^+$, with $a, b > 1$. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ be defined by $f(n) = \log_a n$, $g(n) = \log_b n$. Prove that $f \in O(g)$ and $g \in O(f)$. [Hence $O(\log_a n) = O(\log_b n)$.]

6
Languages: Finite State Machines

In this era of computers and telecommunications, we find ourselves confronted every day
                      with input-output situations. For example, in purchasing a package of chewing gum from
                  a vending machine, we input some coins and then press a button to get our expected output,
                  the package of chewing gum we desire. The first coin that we input sets the machine in
                  motion. Although we usually don’t care about what happens inside the machine (unless
                  some kind of breakdown occurs and we suffer a loss), we should realize that somehow the
                  machine is keeping track of the coins we insert, until the correct total has been inserted.
                  Only then, and not before, does the vending machine output the desired package of chewing
                  gum. Consequently, for the vendor to make the expected profit per package of chewing gum,
                  the machine must internally remember, as each coin is inserted, what sum of money has
                  been deposited.
                       A computer is another example of an input-output device. Here the input is generally some
                  type of information, and the output is the result obtained after processing this information.
                  How the input is processed depends on the internal workings of the computer; it must
                  have the ability to remember past information as it works on the information it is currently
                  processing.
                       Using the concepts we developed earlier on sets and functions, in this chapter we shall
                  investigate an abstract model called a finite state machine, or sequential circuit. Such circuits
                  are one of two basic types of control circuits found in digital computers. (The other type is
                  a combinational circuit or gating network, which is examined in Chapter 15.) They are also
                  found in other systems such as our vending machine, as well as in the controls for elevators
                  and in traffic-light systems.
                      As the name indicates, a finite state machine has a finite number of internal states where
                  the machine remembers certain information when it is in a particular state. However, before
                  getting into this concept we need some set-theoretic material in order to talk about what
                  constitutes valid input for such a machine.

6.1
Language: The Set Theory of Strings
                  Sequences of symbols, or characters, play a key role in the processing of information by a
                  computer. Inasmuch as computer programs are representable in terms of finite sequences
                  of characters, some algebraic way is needed for handling such finite sequences, or strings.
                      Throughout this section we use $\Sigma$ to denote a nonempty finite set of symbols, collectively
                  called an alphabet. For example, we may have $\Sigma = \{0, 1\}$ or $\Sigma = \{a, b, c, d, e\}$.


In any alphabet $\Sigma$, we do not list elements that can be formed from other elements of $\Sigma$ by juxtaposition (that is, if $a, b \in \Sigma$, then the string $ab$ is the juxtaposition of the symbols $a$ and $b$). As a result of this convention, alphabets such as $\Sigma = \{0, 1, 2, 11, 12\}$ and $\Sigma = \{a, b, c, ba, aa\}$ are not considered. (In addition, this convention will help us later in Definition 6.5, when we talk about the length of a string.)
Using an alphabet $\Sigma$ as the starting place, we can construct strings from the symbols of $\Sigma$ in a systematic manner by using the following idea.

Definition 6.1          If $\Sigma$ is an alphabet and $n \in \mathbf{Z}^+$, we define the powers of $\Sigma$ recursively as follows:

1) $\Sigma^1 = \Sigma$; and
2) $\Sigma^{n+1} = \{xy \mid x \in \Sigma,\ y \in \Sigma^n\}$, where $xy$ denotes the juxtaposition of $x$ and $y$.

EXAMPLE 6.1             Let $\Sigma$ be an alphabet.
If $n = 2$, then $\Sigma^2 = \{xy \mid x, y \in \Sigma\}$. For instance, with $\Sigma = \{0, 1\}$ we find $\Sigma^2 = \{00, 01, 10, 11\}$.
When $n = 3$, the elements of $\Sigma^3$ have the form $uv$, where $u \in \Sigma$ and $v \in \Sigma^2$. But since we know the form of the elements in $\Sigma^2$, we may also regard the strings in $\Sigma^3$ as sequences of the form $uxy$, where $u, x, y \in \Sigma$. As an example for this case, suppose that $\Sigma = \{a, b, c, d, e\}$. Then $\Sigma^3$ would contain $5^3 = 125$ three-symbol strings, among them $aaa$, $acb$, $ace$, $cdd$, and $eda$.
In general, for all $n \in \mathbf{Z}^+$ we find that $|\Sigma^n| = |\Sigma|^n$ because we are dealing with arrangements (of size $n$) where we are allowed to repeat any of the $|\Sigma|$ objects.
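
The recursion in Definition 6.1 can be tried out directly. The following Python sketch is only an illustration, under the assumption that every symbol of the alphabet is a single character, so that a Python string can stand for a string over $\Sigma$; the name `alphabet_power` is hypothetical and does not come from the text.

```python
def alphabet_power(sigma, n):
    """Return Sigma^n (n >= 1) as a set of strings, following Definition 6.1.

    `sigma` is any iterable of single-character symbols (an assumption of
    this sketch, made so that juxtaposition is ordinary string joining).
    """
    if n < 1:
        raise ValueError("Definition 6.1 covers only n >= 1")
    symbols = set(sigma)
    if n == 1:
        return symbols                      # Sigma^1 = Sigma
    smaller = alphabet_power(symbols, n - 1)
    # Sigma^n = { xy : x in Sigma, y in Sigma^(n-1) }
    return {x + y for x in symbols for y in smaller}


binary_pairs = alphabet_power("01", 2)      # the four strings of length 2
triples = alphabet_power("abcde", 3)        # should hold 5**3 strings
```

Counting the elements of `triples` gives one way to check the formula $|\Sigma^n| = |\Sigma|^n$ on a small case.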

Now that we have examined $\Sigma^n$ for $n \in \mathbf{Z}^+$, we shall look into one more power of $\Sigma$.

Definition 6.2          For an alphabet $\Sigma$ we define $\Sigma^0 = \{\lambda\}$, where $\lambda$ denotes the empty string, that is, the string consisting of no symbols taken from $\Sigma$.

The symbol $\lambda$ is never an element in our alphabet $\Sigma$, and we should not mistake it for the blank (space) that is found in many alphabets.
However, although $\lambda \notin \Sigma$, we do have $\emptyset \subseteq \Sigma$, so we need to be cautious here. We observe that (1) $\{\lambda\} \not\subseteq \Sigma$ since $\lambda \notin \Sigma$; and (2) $\{\lambda\} \ne \emptyset$ because $|\{\lambda\}| = 1 \ne 0 = |\emptyset|$.

In order to speak collectively about the sets $\Sigma^0, \Sigma^1, \Sigma^2, \ldots$, we introduce the following notation for unions of such sets.

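Definition 6.3          If $\Sigma$ is an alphabet, then

a) $\Sigma^+ = \bigcup_{n=1}^{\infty} \Sigma^n = \bigcup_{n \in \mathbf{Z}^+} \Sigma^n$; and      b) $\Sigma^* = \bigcup_{n=0}^{\infty} \Sigma^n$.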

We see that the only difference between the sets $\Sigma^+$ and $\Sigma^*$ is the presence of the element $\lambda$ because $\lambda \in \Sigma^n$ only when $n = 0$. Also $\Sigma^* = \Sigma^+ \cup \Sigma^0$.
In addition to using the term string, we shall also refer to the elements of $\Sigma^+$ or $\Sigma^*$ as words and sometimes as sentences. For $\Sigma = \{0, 1, 2\}$, we find such words as 0, 01, 102, and 1112 in both $\Sigma^+$ and $\Sigma^*$.
Finally, we note that even though the sets $\Sigma^+$ and $\Sigma^*$ are infinite, the elements of these sets are finite strings of symbols.

EXAMPLE 6.2      For $\Sigma = \{0, 1\}$ the set $\Sigma^*$ consists of all finite strings of 0's and 1's together with the empty string. For $n$ reasonably small, we could actually list all strings in $\Sigma^n$.
If $\Sigma = \{\text{ƀ}, 0, 1, 2, \ldots, 9, +, -, \times, /, (, )\}$, where ƀ denotes the blank (or space), it is harder to describe $\Sigma^*$ and, for $n > 2$, there are too many strings to list in $\Sigma^n$. Here in $\Sigma^*$ we find familiar arithmetic expressions such as $(7 + 5)/(2 \times (3 - 10))$ as well as gibberish such as $+)((7/\times + 3/($.
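
Although $\Sigma^*$ is infinite, its elements can be listed one length at a time. The sketch below is one possible way to do so in Python, assuming single-character symbols and using the empty Python string `""` to stand for $\lambda$; the generator name `sigma_star` is hypothetical.

```python
from itertools import count, islice, product


def sigma_star(sigma):
    """Yield the strings of Sigma^* in order of increasing length.

    For each n = 0, 1, 2, ... the strings of Sigma^n are produced, so the
    generator never terminates; callers take as many strings as they need.
    """
    for n in count(0):
        for symbols in product(sigma, repeat=n):
            yield "".join(symbols)


# The first seven strings over {0, 1}: lambda, then Sigma^1, then Sigma^2.
first_seven = list(islice(sigma_star("01"), 7))

# Dropping the first item (lambda) gives the start of Sigma^+.
first_of_plus = list(islice(sigma_star("01"), 1, 7))
```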

We are now confronted with a familiar situation. As with statements (Chapter 2), sets (Chapter 3), and functions (Chapter 5), once again we need to be able to decide when two objects under study, in this case strings, are to be considered the same. We investigate this issue next.

Definition 6.4   If $w_1, w_2 \in \Sigma^+$, then we may write $w_1 = x_1 x_2 \cdots x_m$ and $w_2 = y_1 y_2 \cdots y_n$, for $m, n \in \mathbf{Z}^+$, and $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_n \in \Sigma$. We say that the strings $w_1$ and $w_2$ are equal, and we write $w_1 = w_2$, if $m = n$, and $x_i = y_i$ for all $1 \le i \le m$.

It follows from this definition that two strings in $\Sigma^+$ are equal only when each is formed from the same number of symbols from $\Sigma$ and the corresponding symbols in the two strings match identically.
The number of symbols in a string is also needed to define another property.

Definition 6.5   Let $w = x_1 x_2 \cdots x_n \in \Sigma^+$, where $x_i \in \Sigma$ for each $1 \le i \le n$. We define the length of $w$, which is denoted by $\|w\|$, as the value $n$. For the case of $\lambda$, we have $\|\lambda\| = 0$.

As a result of Definition 6.5, we find that for any alphabet $\Sigma$, if $w \in \Sigma^*$ and $\|w\| \ge 1$, then $w \in \Sigma^+$, and conversely. Also, for all $y \in \Sigma^*$, $\|y\| = 1$ if and only if $y \in \Sigma$. Should $\Sigma$ contain the symbol ƀ (for the blank), it is still the case that $\|\text{ƀ}\| = 1$.
If we use a particular alphabet, say $\Sigma = \{0, 1, 2\}$, and examine the elements $x = 01$, $y = 212$, and $z = 01212$ (in $\Sigma^*$), we find that

$$\|z\| = \|01212\| = 5 = 2 + 3 = \|01\| + \|212\| = \|x\| + \|y\|.$$
                    In order to continue our study of the properties of strings and alphabets, we need to
                 extend the idea of juxtaposition a little further.

Definition 6.6   Let $x, y \in \Sigma^+$ with $x = x_1 x_2 \cdots x_m$ and $y = y_1 y_2 \cdots y_n$, so that each $x_i$, for $1 \le i \le m$, and each $y_j$, for $1 \le j \le n$, is in $\Sigma$. The concatenation of $x$ and $y$, which we write as $xy$, is the string $x_1 x_2 \cdots x_m y_1 y_2 \cdots y_n$.
The concatenation of $x$ and $\lambda$ is $x\lambda = x_1 x_2 \cdots x_m \lambda = x_1 x_2 \cdots x_m = x$, and the concatenation of $\lambda$ and $x$ is $\lambda x = \lambda x_1 x_2 \cdots x_m = x_1 x_2 \cdots x_m = x$. Finally, the concatenation of $\lambda$ and $\lambda$ is $\lambda\lambda = \lambda$.

Here we have defined a closed binary operation on $\Sigma^*$ (and $\Sigma^+$). This operation is associative but not commutative (unless $|\Sigma| = 1$), and since $x\lambda = \lambda x = x$ for all $x \in \Sigma^*$, the element $\lambda \in \Sigma^*$ is the identity for the operation of concatenation. The ideas embodied in the last two definitions (the length of a string and the operation of concatenation) are interrelated in the result

$$\|xy\| = \|x\| + \|y\|, \quad \text{for all } x, y \in \Sigma^*,$$

from which we obtain the special case

$$\|x\| = \|x\| + 0 = \|x\| + \|\lambda\| = \|x\lambda\| \ (= \|\lambda x\|).$$

Finally, for each $z \in \Sigma$, we have $\|z\| = \|z\lambda\| = \|\lambda z\| = 1$, whereas $\|zz\| = 2$.
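
These properties of concatenation are easy to test on particular strings. In the following Python sketch (an illustration only), Python's `+` on strings plays the role of concatenation, `len` plays the role of $\|\cdot\|$, and the constant `LAMBDA` and the function `concat_facts` are hypothetical names introduced here. A check on a few strings is of course not a proof of the general statements.

```python
LAMBDA = ""  # the empty Python string stands in for the empty string lambda


def concat_facts(x, y, z):
    """Report, for three given strings, which concatenation laws hold."""
    return {
        "length_adds": len(x + y) == len(x) + len(y),   # ||xy|| = ||x|| + ||y||
        "identity": x + LAMBDA == x == LAMBDA + x,      # x lambda = lambda x = x
        "associative": (x + y) + z == x + (y + z),      # (xy)z = x(yz)
        "commutes": x + y == y + x,                     # fails in general
    }


facts = concat_facts("01", "212", "0")
one_symbol_facts = concat_facts("aa", "a", "aaa")
```

For the strings 01 and 212 the last entry is false, since 01212 and 21201 differ; over a one-symbol alphabet, as in the second call, every pair of strings commutes.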
                                  The closed binary operation of concatenation now leads us to another recursive definition.
                              Earlier we looked at powers of an alphabet &. Now we examine powers of strings.

Definition 6.7          For each $x \in \Sigma^*$, we define the powers of $x$ by $x^0 = \lambda$, $x^1 = x$, $x^2 = xx$, $x^3 = xx^2$, $\ldots$, $x^{n+1} = xx^n$, $\ldots$, where $n \in \mathbf{N}$.

This definition is another illustration of how a mathematical entity is given in a recursive manner: The mathematical entity we presently seek is derived from previously derived entities. The definition provides a way for us to deal with the $n$-fold concatenation [an $(n + 1)$st power] of a string as the concatenation of the string with its $(n - 1)$-fold concatenation (an $n$th power). In so doing, the definition includes the special case where the string is just one symbol.

EXAMPLE 6.3             If $\Sigma = \{0, 1\}$ and $x = 01$, then $x^0 = \lambda$, $x^1 = 01$, $x^2 = 0101$, and $x^3 = 010101$. For all $n > 0$, $x^n$ consists of a string of $n$ 0's and $n$ 1's where the first symbol is 0 and the symbols alternate. Here $\|x^2\| = 4 = 2\|x\|$, $\|x^3\| = 6 = 3\|x\|$, and, for all $n \in \mathbf{N}$, $\|x^n\| = n\|x\|$.
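
Definition 6.7 translates almost word for word into a recursive function. The sketch below assumes, as before, that Python strings model strings over $\Sigma$ and that `""` models $\lambda$; the function name `power` is hypothetical.

```python
def power(x, n):
    """Return x^n using the recursion x^0 = lambda, x^(n+1) = x x^n."""
    if n < 0:
        raise ValueError("powers are defined only for n in N")
    if n == 0:
        return ""                # x^0 = lambda
    return x + power(x, n - 1)   # x^n = x x^(n-1)


cube = power("01", 3)
```

The built-in expression `x * n` gives the same string in Python; the recursive form is shown only because it mirrors the definition.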

We are just about ready to tackle the major theme of this section, the concept of a
                              language. Before we do so, however, we need to inquire about three other ideas. These
                              ideas involve special subsections of strings.

Definition 6.8          If $x, y \in \Sigma^*$ and $w = xy$, then the string $x$ is called a prefix of $w$, and if $y \ne \lambda$, then $x$ is said to be a proper prefix. Similarly, the string $y$ is called a suffix of $w$; it is a proper suffix when $x \ne \lambda$.

EXAMPLE 6.4             Let $\Sigma = \{a, b, c\}$, and consider the string $w = abbcc$. Then each of the strings $\lambda$, $a$, $ab$, $abb$, $abbc$, and $abbcc$ is a prefix of $w$, and except for $abbcc$ itself, each is a proper prefix. On the other hand, each of the strings $\lambda$, $c$, $cc$, $bcc$, $bbcc$, and $abbcc$ is a suffix of $w$, where the first five strings are proper suffixes.
In general, for an alphabet $\Sigma$, if $n \in \mathbf{Z}^+$ and $x_i \in \Sigma$, for all $1 \le i \le n$, then each of $\lambda$, $x_1$, $x_1 x_2$, $x_1 x_2 x_3$, $\ldots$, and $x_1 x_2 x_3 \cdots x_n$ is a prefix of the string $x = x_1 x_2 x_3 \cdots x_n$. And $\lambda$, $x_n$, $x_{n-1} x_n$, $x_{n-2} x_{n-1} x_n$, $\ldots$, and $x_1 x_2 x_3 \cdots x_n$ are all suffixes of $x$. So $x$ has $n + 1$ prefixes, $n$ of which are proper, and the situation is the same for suffixes.
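
The lists in this example can be generated mechanically by slicing. The following Python sketch is offered only as an illustration; the function names are hypothetical, and `""` again stands for $\lambda$.

```python
def prefixes(w):
    """All prefixes of w, from lambda up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def suffixes(w):
    """All suffixes of w, from lambda up to w itself."""
    return [w[i:] for i in range(len(w), -1, -1)]


def proper_prefixes(w):
    """Prefixes x with w = xy and y != lambda, that is, every prefix except w."""
    return prefixes(w)[:-1]


def proper_suffixes(w):
    """Suffixes y with w = xy and x != lambda, that is, every suffix except w."""
    return suffixes(w)[:-1]


w = "abbcc"
w_prefixes = prefixes(w)
w_suffixes = suffixes(w)
```

A string of length $n$ yields lists of $n + 1$ prefixes and $n + 1$ suffixes, of which $n$ are proper in each case.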

EXAMPLE 6.5             If $\|x\| = 5$, $\|y\| = 4$, and $w = xy$, then $w$ has $x$ as a proper prefix and $y$ as a proper suffix. In all, $w$ has nine proper prefixes and nine proper suffixes because $\lambda$ is both a proper prefix and a proper suffix for every string in $\Sigma^+$. Here $xy$ is both a prefix and a suffix, but in neither case is it proper.

EXAMPLE 6.6       For a given alphabet $\Sigma$, let $w, a, b, c, d \in \Sigma^*$. If $w = ab = cd$, then
                     1) $a$ is a prefix of $c$, or $c$ is a prefix of $a$; and
                     2) $b$ is a suffix of $d$, or $d$ is a suffix of $b$.

Definition 6.9    If $x, y, z \in \Sigma^*$ and $w = xyz$, then $y$ is called a substring of $w$. When at least one of $x$ and $z$ is different from $\lambda$ (so that $y$ is different from $w$), we call $y$ a proper substring.

EXAMPLE 6.7       For $\Sigma = \{0, 1\}$, let $w = 00101110 \in \Sigma^*$. We find the following substrings in $w$:
                     1) 1011: This arises in only one way, when $w = xyz$, with $x = 00$, $y = 1011$, and $z = 10$.
                     2) 10: This example comes about in two ways:
                        a) $w = xyz$ where $x = 00$, $y = 10$, and $z = 1110$; and
                        b) $w = xyz$ for $x = 001011$, $y = 10$, and $z = \lambda$.

In case (b) the substring is also a (proper) suffix of $w$.
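
The decompositions $w = xyz$ in this example can be found by sliding $y$ along $w$. The Python sketch below is one simple way to do this; the function name `decompositions` is hypothetical, and `""` stands for $\lambda$.

```python
def decompositions(w, y):
    """Return every triple (x, y, z) of strings with w == x + y + z."""
    k = len(y)
    return [
        (w[:i], y, w[i + k:])
        for i in range(len(w) - k + 1)
        if w[i:i + k] == y
    ]


w = "00101110"
ways_1011 = decompositions(w, "1011")
ways_10 = decompositions(w, "10")
```

Here `ways_1011` holds a single triple and `ways_10` holds two, the second of which has $z = \lambda$, so that 10 is also a suffix of $w$.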

Now that we are familiar with the necessary definitions, it is time to think about the
                  concept of language. When we consider the standard alphabet, including the blank, many
                  strings such as qxio, the wxxy red atzl, and aeyt! do not represent words or parts of sentences
                  that appear in the English language, even though they are elements of $\Sigma^*$. Consequently, in
                  order to consider only those words and expressions that make sense in the English language,
                  we concentrate on a subset of $\Sigma^*$. This leads us to the following generalization.

Definition 6.10   For a given alphabet $\Sigma$, any subset of $\Sigma^*$ is called a language over $\Sigma$. This includes the subset $\emptyset$, which we call the empty language.

EXAMPLE 6.8       With $\Sigma = \{0, 1\}$, the sets $A = \{0, 01, 001\}$ and $B = \{0, 01, 001, 0001, \ldots\}$ are examples of languages over $\Sigma$.

EXAMPLE 6.9       With $\Sigma$ the alphabet of 26 letters, 10 digits, and the special symbols used in a given implementation of C++, the collection of executable programs for that implementation constitutes a language. In the same situation, each executable program could be considered a language, as could a particular set of such programs.

Since languages are sets, we can form the union, intersection, and symmetric difference
                  of two languages. However, for the work here, an extension of the closed binary operation
                  defined (in Definition 6.6) for strings is more useful.

Definition 6.11   For an alphabet $\Sigma$ and languages $A, B \subseteq \Sigma^*$, the concatenation of $A$ and $B$, denoted $AB$, is $\{ab \mid a \in A,\ b \in B\}$.

We might compare concatenation with the cross product. We shall see that just as $A \times B \ne B \times A$ in general, we also have $AB \ne BA$ in general. For $A$, $B$ finite we did have $|A \times B| = |B \times A|$, but here $|AB| \ne |BA|$ is possible for finite languages.

EXAMPLE 6.10          Let $\Sigma = \{x, y, z\}$, and let $A$, $B$ be the finite languages $A = \{x, xy, z\}$, $B = \{\lambda, y\}$. Then $AB = \{x, xy, z, xyy, zy\}$ and $BA = \{x, xy, z, yx, yxy, yz\}$, so
                                 1) $|AB| = 5 \ne 6 = |BA|$; and
                                 2) $|AB| = 5 \ne 6 = 3 \cdot 2 = |A||B|$.

The differences arise because there are two ways to represent $xy$: (1) $xy$ for $x \in A$, $y \in B$ and (2) $xy\lambda$ where $xy \in A$ and $\lambda \in B$. [The concept of uniqueness of representation is something we cannot take for granted. Although it does not hold here, it is a key to the success of many mathematical ideas. We saw this, for example, in the Fundamental Theorem of Arithmetic (Theorem 4.11).]
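
For finite languages, Definition 6.11 is a one-line set comprehension. The Python sketch below reproduces the two concatenations of this example under the assumption that a finite language is modeled as a Python set of strings, with `""` for $\lambda$; the function name `concat` is hypothetical.

```python
def concat(A, B):
    """Concatenation of two finite languages: { ab : a in A, b in B }."""
    return {a + b for a in A for b in B}


A = {"x", "xy", "z"}
B = {"", "y"}          # "" plays the role of lambda

AB = concat(A, B)
BA = concat(B, A)
```

Because a set stores each string once, the two representations of $xy$ collapse into a single element of `AB`, which is why `len(AB)` falls short of `len(A) * len(B)`.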

The preceding example suggests that for finite languages $A$ and $B$, $|AB| \le |A||B|$. This can be shown to be true in general.

The following theorem deals with some of the properties satisfied by the concatenation
                             of languages.

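THEOREM 6.1                  For an alphabet $\Sigma$, let $A, B, C \subseteq \Sigma^*$. Then

a) $A\{\lambda\} = \{\lambda\}A = A$
b) $(AB)C = A(BC)$
c) $A(B \cup C) = AB \cup AC$
d) $(B \cup C)A = BA \cup CA$
e) $A(B \cap C) \subseteq AB \cap AC$
f) $(B \cap C)A \subseteq BA \cap CA$

Proof: We prove parts (d) and (f) and leave the other parts for the reader.
(d) Since we are trying to prove that two sets are equal, once again we use the idea of set equality that we first found in Definition 3.2. Starting with $x$ in $\Sigma^*$ we find that $x \in (B \cup C)A \Rightarrow x = yz$ for $y \in B \cup C$ and $z \in A \Rightarrow$ ($x = yz$ for $y \in B$, $z \in A$) or ($x = yz$ for $y \in C$, $z \in A$) $\Rightarrow x \in BA$ or $x \in CA \Rightarrow x \in BA \cup CA$, so $(B \cup C)A \subseteq BA \cup CA$. Conversely, it follows that $x \in BA \cup CA \Rightarrow x \in BA$ or $x \in CA \Rightarrow$ ($x = ba_1$ where $b \in B$ and $a_1 \in A$) or ($x = ca_2$ where $c \in C$ and $a_2 \in A$). Assume $x = ba_1$ for $b \in B$, $a_1 \in A$. Since $B \subseteq B \cup C$, we have $x = ba_1$, where $b \in B \cup C$ and $a_1 \in A$. Then $x \in (B \cup C)A$, so $BA \cup CA \subseteq (B \cup C)A$. (The argument is similar if $x = ca_2$.) With both inclusions established, it follows that $(B \cup C)A = BA \cup CA$.
(f) For $x \in \Sigma^*$, we see that $x \in (B \cap C)A \Rightarrow x = yz$ where $y \in B \cap C$ and $z \in A \Rightarrow$ ($x = yz$ for $y \in B$ and $z \in A$) and ($x = yz$ for $y \in C$ and $z \in A$) $\Rightarrow x \in BA$ and $x \in CA \Rightarrow x \in BA \cap CA$, so $(B \cap C)A \subseteq BA \cap CA$.
With $\Sigma = \{x, y, z\}$, let $B = \{x, xx, y\}$, $C = \{y, xy\}$, and $A = \{y, yy\}$. Then $xyy \in BA \cap CA$, but $xyy \notin (B \cap C)A$. Consequently, $(B \cap C)A \subset BA \cap CA$ for these particular languages.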
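
The particular languages used at the end of this proof are small enough to check by direct computation. The following Python sketch does so, modeling finite languages as sets of strings; the helper `concat` is a hypothetical name, and the check covers only these three languages, not the theorem in general.

```python
def concat(A, B):
    """Concatenation of two finite languages: { ab : a in A, b in B }."""
    return {a + b for a in A for b in B}


A = {"y", "yy"}
B = {"x", "xx", "y"}
C = {"y", "xy"}

left = concat(B & C, A)                 # (B ∩ C)A
right = concat(B, A) & concat(C, A)     # BA ∩ CA

# Part (d) holds with equality for these languages.
union_left = concat(B | C, A)           # (B ∪ C)A
union_right = concat(B, A) | concat(C, A)
```

With these sets, `left` is $\{yy, yyy\}$ while `right` also contains $xyy$, so the inclusion in part (f) is proper here.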

Comparable to the concepts of $\Sigma^n$, $\Sigma^*$, $\Sigma^+$, the following definitions are given for an arbitrary language $A \subseteq \Sigma^*$.

Definition 6.12   For a given language $A \subseteq \Sigma^*$ we can construct other languages as follows:
                       a) $A^0 = \{\lambda\}$, $A^1 = A$, and for all $n \in \mathbf{Z}^+$, $A^{n+1} = \{ab \mid a \in A,\ b \in A^n\}$.
                       b) $A^+ = \bigcup_{n \in \mathbf{Z}^+} A^n$, the positive closure of $A$.
                       c) $A^* = A^+ \cup \{\lambda\}$. The language $A^*$ is called the Kleene closure of $A$, in honor of the American logician Stephen Cole Kleene (1909-1994).

EXAMPLE 6.11      If $\Sigma = \{x, y, z\}$ and $A = \{x\}$, then (1) $A^0 = \{\lambda\}$; (2) $A^n = \{x^n\}$, for each $n \in \mathbf{N}$; (3) $A^+ = \{x^n \mid n \ge 1\}$; and (4) $A^* = \{x^n \mid n \ge 0\}$.

EXAMPLE 6.12      Let $\Sigma = \{x, y\}$.
                       a) If $A = \{xx, xy, yx, yy\} = \Sigma^2$, then $A^*$ is the language of all strings $w$ in $\Sigma^*$ where the length of $w$ is even.
                       b) With $A$ as in part (a) and $B = \{x, y\}$, the language $BA^*$ contains all the strings in $\Sigma^*$ of odd length. In this case we also find that $BA^* = A^*B$ and that $\Sigma^* = A^* \cup BA^*$.
                       c) The language $\{x\}\{x, y\}^*$ (the concatenation of the languages $\{x\}$ and $\{x, y\}^*$) contains every string in $\Sigma^*$ for which $x$ is a prefix. The language $\{x\}\{x, y\}^+$ (the concatenation of the languages $\{x\}$ and $\{x, y\}^+$) contains every string in $\Sigma^*$ for which $x$ is a proper prefix.
                          The language containing all strings in $\Sigma^*$ for which $yy$ is a suffix can be defined by $\{x, y\}^*\{yy\}$.
                          Every string in the language $\{x, y\}^*\{xxy\}\{x, y\}^*$ has $xxy$ as a substring.
                       d) Each string in the language $\{x\}^*\{y\}^*$ consists of a finite number (possibly zero) of $x$'s followed by a finite number (also possibly zero) of $y$'s. And although $\{x\}^*\{y\}^* \subseteq \{x, y\}^*$, the string $w = xyx$ is in $\{x, y\}^*$ but not in $\{x\}^*\{y\}^*$. Hence $\{x\}^*\{y\}^* \subset \{x, y\}^*$.

EXAMPLE 6.13   In the algebra of real numbers, if $a, b \in \mathbf{R}$ and $a, b \ge 0$, then $a^2 = b^2 \Rightarrow a = b$. However, in the case of languages, if $\Sigma = \{x, y\}$, $A = \{\lambda, x, x^3, x^4, \ldots\} = \{x^n \mid n \ge 0\} - \{x^2\}$ and $B = \{x^n \mid n \ge 0\}$, then $A^2 = B^2 (= B)$, but $A \ne B$. (Note: We never have $\lambda \in \Sigma$, but it is possible to have $\lambda \in A \subseteq \Sigma^*$.)
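Because $A$ and $B$ are infinite, a program can only compare them up to a chosen length. The sketch below makes that assumption explicit with a bound `N` (our choice). A string of length at most `N` in $A^2$ or $B^2$ can only come from two factors of length at most `N`, so the truncated comparison is faithful.

```python
N = 12                                     # length bound, an assumption of this sketch

B = {"x" * n for n in range(N + 1)}        # {x^n | 0 <= n <= N}
A = B - {"xx"}                             # the same language with x^2 removed

def square_upto(L, max_len):
    """Strings of L^2 whose length is at most max_len."""
    return {u + v for u in L for v in L if len(u + v) <= max_len}

print(square_upto(A, N) == square_upto(B, N))   # True: x^2 reappears as x followed by x
print(A == B)                                   # False
```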

We continue this section with a lemma and a second theorem that deal with the properties
                     of languages.

LEMMA 6.1   Let $\Sigma$ be an alphabet, with languages $A, B \subseteq \Sigma^*$. If $A \subseteq B$, then for all $n \in \mathbf{Z}^+$, $A^n \subseteq B^n$.
Proof: Since $A^1 = A \subseteq B = B^1$, it follows that the result is true in the case for $n = 1$. Assuming the truth for $n = k$, we have $A \subseteq B \Rightarrow A^k \subseteq B^k$. Now consider a string $x$ from $A^{k+1}$. From part (a) of Definition 6.12 we know that $x = x_1 x_k$, where $x_1 \in A$, $x_k \in A^k$. If $A \subseteq B$ then $A^k \subseteq B^k$ (by the induction hypothesis), and we have $x_1 \in B$, $x_k \in B^k$. Consequently, $x = x_1 x_k \in BB^k = B^{k+1}$ and $A^{k+1} \subseteq B^{k+1}$. By the Principle of Mathematical Induction, it now follows that if $A \subseteq B$, then for all $n \in \mathbf{Z}^+$, $A^n \subseteq B^n$.

Note: Lemma 6.1 does not establish that $A^+ \subseteq B^+$ or that $A^* \subseteq B^*$. These results are part of our next theorem.

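THEOREM 6.2   For an alphabet $\Sigma$ and languages $A, B \subseteq \Sigma^*$,
    a) $A \subseteq AB^*$
    b) $A \subseteq B^*A$
    c) $A \subseteq B \Rightarrow A^+ \subseteq B^+$
    d) $A \subseteq B \Rightarrow A^* \subseteq B^*$
    e) $AA^* = A^*A = A^+$
    f) $A^*A^* = A^* = (A^*)^* = (A^*)^+ = (A^+)^*$
    g) $(A \cup B)^* = (A^* \cup B^*)^* = (A^*B^*)^*$
Proof: We provide the proofs for parts (c) and (g).
    (c) Let $A \subseteq B$ and $x \in A^+$. Then $x \in A^+ \Rightarrow x \in A^n$, for some $n \in \mathbf{Z}^+$. From Lemma 6.1 it then follows that $x \in B^n \subseteq B^+$, and we have shown that $A^+ \subseteq B^+$.
    (g) [$(A \cup B)^* = (A^* \cup B^*)^*$]. We know that $A \subseteq A^*$, $B \subseteq B^* \Rightarrow (A \cup B) \subseteq (A^* \cup B^*) \Rightarrow (A \cup B)^* \subseteq (A^* \cup B^*)^*$ [by part (d)]. Conversely, we also see that $A, B \subseteq A \cup B \Rightarrow A^*, B^* \subseteq (A \cup B)^*$ [by part (d)] $\Rightarrow (A^* \cup B^*) \subseteq (A \cup B)^* \Rightarrow (A^* \cup B^*)^* \subseteq (A \cup B)^*$ [by parts (d) and (f)]. From both inclusions it follows that $(A \cup B)^* = (A^* \cup B^*)^*$.
    [$(A^* \cup B^*)^* = (A^*B^*)^*$]. First we find that $A^*, B^* \subseteq A^*B^*$ [by parts (a) and (b)] $\Rightarrow (A^* \cup B^*) \subseteq A^*B^* \Rightarrow (A^* \cup B^*)^* \subseteq (A^*B^*)^*$ [by part (d)]. Conversely, if $xy \in A^*B^*$ where $x \in A^*$ and $y \in B^*$, then $x, y \in A^* \cup B^*$, so $xy \in (A^* \cup B^*)^*$, and $A^*B^* \subseteq (A^* \cup B^*)^*$. Using parts (d) and (f) again, $(A^*B^*)^* \subseteq (A^* \cup B^*)^*$, and so the result follows.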
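Part (g) can be spot-checked on a small instance, although such a check is no substitute for the proof. The sketch below assumes two sample languages over $\{0, 1\}$ and a length bound `N`, both chosen by us, and compares the three closures restricted to strings of length at most `N`.

```python
def concat(L1, L2, max_len):
    """Concatenation of two finite languages, keeping strings of length <= max_len."""
    return {u + v for u in L1 for v in L2 if len(u + v) <= max_len}

def star(L, max_len):
    """The strings of L* whose length is at most max_len."""
    found = {""}
    frontier = {""}
    while frontier:
        frontier = concat(frontier, L, max_len) - found
        found |= frontier
    return found

N = 8                                   # length bound (assumption)
A, B = {"10"}, {"1"}                    # sample languages (assumption)

lhs = star(A | B, N)                                   # (A ∪ B)*
mid = star(star(A, N) | star(B, N), N)                 # (A* ∪ B*)*
rhs = star(concat(star(A, N), star(B, N), N), N)       # (A*B*)*

print(lhs == mid == rhs)
```

Truncating at length `N` loses nothing for this comparison: every string of length at most `N` in any of the three closures is a concatenation of factors that are themselves of length at most `N`.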

As we close this first section we further examine the idea of a recursively defined set
                             (given in Section 4.2), as demonstrated in the following three examples.

EXAMPLE 6.14   For the alphabet $\Sigma = \{0, 1\}$ consider the language $A \subseteq \Sigma^*$ where each word in $A$ contains exactly one occurrence of the symbol 0. Then $A$ is an infinite set, and among the words in $A$ one finds 0, 01, 10, 01111, 11110111, and 11111111110. There are also infinitely many words in $\Sigma^*$ that are not in $A$, such as 1, 11, 00, 000, 010, and 011111111110. We can define this language $A$ recursively as follows:

    1) Our base step tells us that $0 \in A$; and
    2) For the recursive process we want to include in $A$ the words $1x$ and $x1$, for each word $x \in A$.

Using this definition, the following discussion shows us that the word 1011 is in $A$.
    From part (1) of our definition, we know that $0 \in A$. Then by applying part (2) of our definition three times we find:
      i) $01 \in A$, because $0 \in A$;
     ii) $011 \in A$, because $01 \in A$; and
    iii) Since $011 \in A$, we have $1011 \in A$.
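The recursive definition can be run as a generating procedure and compared against the direct description "exactly one occurrence of 0". This is a minimal sketch under the assumption that we only look at words up to a fixed length; `generate` and `max_len` are names of ours.

```python
from itertools import product

def generate(max_len):
    """Words obtained from the base step 0 by repeatedly applying x -> 1x and x -> x1."""
    words = {"0"}                      # base step
    frontier = {"0"}
    while frontier:
        nxt = set()
        for x in frontier:
            for w in ("1" + x, x + "1"):          # recursive process
                if len(w) <= max_len and w not in words:
                    nxt.add(w)
        words |= nxt
        frontier = nxt
    return words

max_len = 6
by_recursion = generate(max_len)
by_counting = {
    "".join(p)
    for n in range(1, max_len + 1)
    for p in product("01", repeat=n)
    if p.count("0") == 1
}

print(by_recursion == by_counting)     # True
print("1011" in by_recursion)          # True
```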

EXAMPLE 6.15   For $\Sigma = \{(, )\}$, the alphabet containing the left and right parentheses, we want to consider the language $A \subseteq \Sigma^*$ consisting of those nonempty strings of parentheses that are grammatically correct for algebraic expressions. Hence we find, for example, the three strings (()), ((()())), and ()()() in this language, but we do not find strings such as (()(), ())((), or )((())). We see that if a string $x (\ne \lambda)$ is to be in $A$, then

     i) we must have the same number of left parentheses in $x$ as there are right parentheses; and
    ii) the number of left parentheses must (always) be greater than or equal to the number of right parentheses, as we examine each of the parentheses in $x$, reading them consecutively from left to right.

The language $A$ may be given recursively as follows:

    1) () is in $A$; and
    2) For all $x, y \in A$ we have (i) $xy \in A$ and (ii) $(x) \in A$.

[As we mentioned prior to Example 4.22, we also have an implicit restriction here: no string of parentheses is in $A$ unless it can be derived through steps (1) and (2) above.]
    Using this recursive definition, the following shows us how to establish that the string (()()) in $\Sigma^*$ is in the language $A$.

    Steps                       Reasons
    1) () is in A.              Part (1) of the recursive definition
    2) ()() is in A.            Step (1) and part (2i) of the definition
    3) (()()) is in A.          Step (2) and part (2ii) of the definition
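Conditions (i) and (ii) translate into a single left-to-right scan with a counter. The sketch below treats the two conditions as a membership test, which is the standard characterization of balanced strings of parentheses; the function name `is_well_formed` is ours, and the input is assumed to contain only the two symbols of $\Sigma$.

```python
def is_well_formed(s):
    """Test conditions (i) and (ii) for a string over the alphabet {(, )}."""
    depth = 0                          # (number of left) - (number of right) so far
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:                  # condition (ii) fails at this position
            return False
    return s != "" and depth == 0      # nonempty, and condition (i) holds

for s in ["(()())", "(())", "()()()", "(()()", "())(()"]:
    print(s, is_well_formed(s))
```

The string (()()) derived in the three steps above passes the test, while a string with an unmatched parenthesis, or one in which a right parenthesis appears too early, does not.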

EXAMPLE 6.16   Given an alphabet $\Sigma$, consider the string $x = x_1 x_2 x_3 \cdots x_{n-1} x_n$ in $\Sigma^*$ where $x_i \in \Sigma$ for each $1 \le i \le n$ and $n \in \mathbf{Z}^+$. The reversal of $x$, denoted $x^R$, is the string obtained from $x$ by reading the symbols (in $x$) from right to left, that is, $x^R = x_n x_{n-1} \cdots x_3 x_2 x_1$. For example, if $\Sigma = \{0, 1\}$ and $x = 01101$, then $x^R = 10110$ and for $w = 101101$ we find $w^R = 101101 = w$. In general, we can define the reversal of a string (from $\Sigma^*$) recursively as follows:
    1) $\lambda^R = \lambda$; and
    2) For each $n \in \mathbf{N}$, if $x \in \Sigma^{n+1}$, then we can write $x = zy$ where $z \in \Sigma$ and $y \in \Sigma^n$, and here we define $x^R = (zy)^R = (y^R)z$.
    Using this recursive definition we shall now prove that if $\Sigma$ is an alphabet and $x_1, x_2 \in \Sigma^*$, then $(x_1 x_2)^R = x_2^R x_1^R$.
Proof: Here the proof is by mathematical induction on the value of $\|x_1\|$. If $\|x_1\| = 0$, then $x_1 = \lambda$ and $(x_1 x_2)^R = (\lambda x_2)^R = x_2^R = x_2^R \lambda = x_2^R \lambda^R = x_2^R x_1^R$ because $\lambda^R = \lambda$ from part (1) of the recursive definition. Consequently, the result is true in this first case and this establishes the basis step. For the inductive step we shall assume the result is true for all $y, x_2 \in \Sigma^*$ where $\|y\| = k$ for some $k \in \mathbf{N}$. Now consider what happens for $x_1, x_2 \in \Sigma^*$ where $x_1 = zy_1$, with $\|z\| = 1$ and $\|y_1\| = k$. Here we find that $(x_1 x_2)^R = (zy_1 x_2)^R = (y_1 x_2)^R z$ [from part (2) of the recursive definition] $= x_2^R y_1^R z$ (from the induction hypothesis) $= x_2^R (zy_1)^R$ [again by part (2) of the recursive definition] $= x_2^R x_1^R$. Therefore the result is true for all $x_1, x_2 \in \Sigma^*$ by the Principle of Mathematical Induction.
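The recursive definition of the reversal carries over almost verbatim to code, and the identity just proved can be spot-checked on all short strings. The following is a sketch with names of our choosing (`reverse`, `words`); the exhaustive check covers only strings of length at most 3 over $\{0, 1\}$ and illustrates the result rather than proving it.

```python
from itertools import product

def reverse(x):
    """Reversal defined recursively: λ^R = λ and (zy)^R = (y^R)z for a symbol z."""
    if x == "":
        return ""                      # part (1)
    z, y = x[0], x[1:]
    return reverse(y) + z              # part (2)

print(reverse("01101"))                # 10110
print(reverse("101101") == "101101")   # True

# Spot-check (x1 x2)^R = x2^R x1^R for all strings of length <= 3 over {0, 1}.
words = ["".join(p) for n in range(4) for p in product("01", repeat=n)]
print(all(reverse(x1 + x2) == reverse(x2) + reverse(x1)
          for x1 in words for x2 in words))
```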

EXERCISES 6.1

1. Let $\Sigma = \{a, b, c, d, e\}$. (a) What is $|\Sigma^2|$? $|\Sigma^3|$? (b) How many strings in $\Sigma^*$ have length at most 5?

2. For $\Sigma = \{w, x, y, z\}$ determine the number of strings in $\Sigma^*$ of length 5 (a) that start with $w$; (b) with precisely two $w$'s; (c) with no $w$'s; (d) with an even number of $w$'s.

3. If $x \in \Sigma^*$ and $\|x^3\| = 36$, what is $\|x\|$?

4. Let $\Sigma = \{\beta, x, y, z\}$ where $\beta$ denotes a blank, so $x\beta \ne x$, $\beta\beta \ne \beta$, and $x\beta y \ne xy$ but $x\lambda y = xy$. Compute each of the following:
      a) $\|\lambda\|$        b) $\|\lambda\lambda\|$        c) $\|\beta\|$
      d) $\|\beta\beta\|$     e) $\|\beta^3\|$               f) $\|x\beta\beta y\|$
      g) $\|\beta\lambda\|$   h) $\|\lambda^{10}\|$

5. Let $\Sigma = \{v, w, x, y, z\}$ and $A = \bigcup_{n=1}^{6} \Sigma^n$. How many strings in $A$ have $xy$ as a proper prefix?

6. Let $\Sigma$ be an alphabet. Let $x_i \in \Sigma$ for $1 \le i \le 100$ (where $x_i \ne x_j$ for all $1 \le i < j \le 100$). How many nonempty substrings are there for the string $s = x_1 x_2 \cdots x_{100}$?

7. For the alphabet $\Sigma = \{0, 1\}$, let $A, B, C \subseteq \Sigma^*$ be the following languages:
      $A = \{0, 1, 00, 11, 000, 111, 0000, 1111\}$,
      $B = \{w \in \Sigma^* \mid 2 \le \|w\|\}$,
      $C = \{w \in \Sigma^* \mid 2 \ge \|w\|\}$.
   Determine the following subsets (languages) of $\Sigma^*$.
      a) $A \cap B$        b) $A - B$        c) $A \,\Delta\, B$
      d) $A \cap C$        e) $B \cup C$     f) $\overline{(A \cap C)}$

8. Let $A = \{10, 11\}$, $B = \{00, 1\}$ be languages for the alphabet $\Sigma = \{0, 1\}$. Determine each of the following: (a) $AB$; (b) $BA$; (c) $A^3$; (d) $B^2$.

9. If $A$, $B$, $C$, and $D$ are languages over $\Sigma$, prove that (a) $(A \subseteq B \wedge C \subseteq D) \Rightarrow AC \subseteq BD$; and (b) $A\emptyset = \emptyset A = \emptyset$.

10. For $\Sigma = \{x, y, z\}$, let $A, B \subseteq \Sigma^*$ be given by $A = \{xy\}$ and $B = \{\lambda, x\}$. Determine (a) $AB$; (b) $BA$; (c) $B^2$; (d) $B^+$; (e) $A^*$.

11. Given an alphabet $\Sigma$, is there a language $A \subseteq \Sigma^*$ where $A^* = A$?

12. For $\Sigma = \{0, 1\}$ determine whether the string 00010 is in each of the following languages (taken from $\Sigma^*$).
      a) $\{0, 1\}^*$                  b) $\{000, 101\}\{10, 11\}$
      c) $\{00\}\{0\}^*\{10\}$         d) $\{000\}^*\{1\}^*\{0\}$
      e) $\{00\}^*\{10\}^*$            f) $\{0\}^*\{1\}^*\{0\}^*$

13. For $\Sigma = \{0, 1\}$ describe the strings in $A^*$ for each of the following languages $A \subseteq \Sigma^*$.
      a) $\{01\}$        b) $\{000\}$
      c) $\{0, 010\}$    d) $\{1, 10\}$

14. For $\Sigma = \{0, 1\}$ determine all possible languages $A, B \subseteq \Sigma^*$ where $AB = \{01, 000, 0101, 0111, 01000, 010111\}$.

15. Given a nonempty language $A \subseteq \Sigma^*$, prove that if $A^2 = A$, then $\lambda \in A$.

16. For a given alphabet $\Sigma$, let $a \in \Sigma$, with $a$ fixed. Define the functions $p_a, s_a, r: \Sigma^* \to \Sigma^*$ and the function $d: \Sigma^+ \to \Sigma^*$ as follows:
        i) The prefix (by $a$) function: $p_a(x) = ax$, $x \in \Sigma^*$.
       ii) The suffix (by $a$) function: $s_a(x) = xa$, $x \in \Sigma^*$.
      iii) The reversal function: $r(\lambda) = \lambda$; for $x \in \Sigma^+$, if $x = x_1 x_2 \cdots x_{n-1} x_n$, where $x_i \in \Sigma$ for all $1 \le i \le n$, then $r(x) = x_n x_{n-1} \cdots x_2 x_1 = x^R$ (as defined in Example 6.16).
       iv) The front deletion function: for $x \in \Sigma^+$, if $x = x_1 x_2 x_3 \cdots x_n$, then $d(x) = x_2 x_3 \cdots x_n$.
      a) Which of these four functions is (or are) one-to-one?
      b) Determine which of these four functions is (or are) onto. If a function is not onto, determine its range.
      c) Are any of these four functions invertible? If so, determine their inverse functions.
      d) Suppose that $\Sigma = \{a, e, i, o, u\}$. How many words $x$ in $\Sigma^2$ satisfy $r(x) = x$? How many in $\Sigma^3$? How many in $\Sigma^n$, where $n \in \mathbf{N}$?
      e) For $x \in \Sigma^*$, determine $(d \circ p_a)(x)$ and $(r \circ d \circ r \circ s_a)(x)$.
      f) If $\Sigma = \{a, e, i, o, u\}$ and $B = \{ae, ai, ao, oo, eio, eiouu\} \subseteq \Sigma^*$, find $r^{-1}(B)$, $p_a^{-1}(B)$, $s_a^{-1}(B)$, and $|d^{-1}(B)|$.

17. If $A (\ne \emptyset)$ is a language and $A^2 = A$, prove that $A = A^*$.

18. Provide the proofs for the remaining parts of Theorems 6.1 and 6.2.

19. Prove that for all finite languages $A, B \subseteq \Sigma^*$, $|AB| \le |A||B|$.

20. For $\Sigma = \{x, y\}$, use finite languages from $\Sigma^*$ (as in Example 6.12), together with set operations, to describe the set of strings in $\Sigma^*$ that (a) contain exactly one occurrence of $x$; (b) contain exactly two occurrences of $x$; (c) begin with $x$; (d) end in $yxy$; (e) begin with $x$ or end in $yxy$ or both; (f) begin with $x$ or end in $yxy$ but not both.

21. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$ be the language defined recursively as follows:
      1) The symbols 0, 1 are both in $A$ (this is the base for our definition); and
      2) For each word $x$ in $A$, the word $0x1$ is also in $A$ (this constitutes the recursive process).
      a) Find four different words, two of length 3 and two of length 5, in $A$.
      b) Use the given recursive definition to show that 0001111 is in $A$.
      c) Explain why 00001111 is not in $A$.

22. Provide a recursive definition for each of the following languages $A \subseteq \Sigma^*$ where $\Sigma = \{0, 1\}$.
      a) $x \in A$ if (and only if) the number of 0's in $x$ is even.
      b) $x \in A$ if (and only if) all of the 1's in $x$ precede all of the 0's.

23. Use the recursive definition given in Example 6.15 to verify that each of the following strings is in the language $A$ of that example.
      a) (C(O)              b) (OO                 c) (0)

24. For an alphabet $\Sigma$ a string $x$ in $\Sigma^*$ is called a palindrome if $x = x^R$, that is, $x$ is equal to its reversal. If $A \subseteq \Sigma^*$ where $A = \{x \in \Sigma^* \mid x = x^R\}$, how can we define the language $A$ recursively?

25. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$, where $A = \{00, 1\}$. How many strings in $A^*$ have length 3? length 4? length 5? length 6?

26. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$, where $A = \{00, 111\}$. How many strings in $A^*$ have length 19?

27. For $\Sigma = \{0, 1\}$, let $A, B \subseteq \Sigma^*$, where $A$ is the language of all strings in $\Sigma^*$ of even length, while $B$ is the language of all strings in $\Sigma^*$ of odd length. Give a recursive definition for each of the languages $A$, $B$.

28. Let $\Sigma = \{a, b, c\}$. Determine the smallest number of words one must select from $\Sigma^4$ to guarantee that at least two of the words start and end with the same letter.

6.2 Finite State Machines: A First Encounter
                                We return now to the vending machine mentioned at the start of this chapter and analyze it
                                in the following circumstance.
                                    At a metropolitan office, a vending machine dispenses two flavors of chewing gum (each
                                flavor in a package of five pieces): peppermint (P) and spearmint (S). The cost of a package
                                of either flavor is 20¢. The machine accepts nickels, dimes, and quarters and returns the
                                necessary change. One day Mary Jo decides she’d like a package of peppermint-flavored
                                chewing gum. She goes to the vending machine, inserts two nickels and a dime, in that order,
                                and presses the white button, denoted W. Out comes her package of peppermint-flavored
                                chewing gum. (To get a package of spearmint-flavored chewing gum one presses the black
                                button, denoted B.)
    What Mary Jo has done, in making her purchase, can be represented as shown in Table 6.1, where $t_0$ is the initial time, when she inserts her first nickel, and $t_1$, $t_2$, $t_3$, $t_4$ are later moments in time, with $t_1 < t_2 < t_3 < t_4$.

Table 6.1

              $t_0$            $t_1$              $t_2$               $t_3$                $t_4$
    State     (1) $s_0$        (4) $s_1$ (5¢)     (7) $s_2$ (10¢)     (10) $s_3$ (20¢)     (13) $s_0$
    Input     (2) 5¢           (5) 5¢             (8) 10¢             (11) W
    Output    (3) Nothing      (6) Nothing        (9) Nothing         (12) P

    The numbers (1), (2), ..., (12), (13) in this table indicate the order of events in the purchase of Mary Jo's package of peppermint chewing gum. For each input at time $t_i$, $0 \le i \le 3$, there is at that time a corresponding output and then a change in state. The new state at time $t_{i+1}$ depends on both the input and the (present) state at time $t_i$.

The machine is in a state of readiness at state $s_0$. It waits for a customer to start inserting coins that will total 20¢ or more and then press a button to get a package of chewing gum. If at any time the total of the coins inserted exceeds 20¢, the machine provides the needed change (before the customer presses the button to get the package of chewing gum).
    At time $t_0$ Mary Jo provides the machine with her first input, 5¢. She receives nothing at this time, but at the later time $t_1$ the machine is in state $s_1$, where it remembers her total of 5¢ and waits for her second input (of 5¢ at time $t_1$). The machine again (at time $t_1$) provides no output, but at the next time, $t_2$, it is in state $s_2$, remembering a total of 10¢ = 5¢ (remembered at state $s_1$) + 5¢ (inserted at time $t_1$). Providing her dime (at time $t_2$) as the next input to the machine, Mary Jo does not receive a package of chewing gum at this time because the machine doesn't "know" which flavor Mary Jo prefers, but it does "know" now ($t_3$) that she has inserted the necessary total of 20¢ = 10¢ (remembered at state $s_2$) + 10¢ (inserted at time $t_2$). At last Mary Jo presses the white button, and at time $t_3$ the machine dispenses the output (her package of peppermint chewing gum) and then returns, at time $t_4$, to the starting state $s_0$, just in time for Mary Jo's friend Rizzo to deposit a quarter, receive her nickel change, press the black button, and obtain the package of spearmint chewing gum she desires. The purchase made by Rizzo is analyzed in Table 6.2.

Table 6.2

              $t_0$               $t_1$               $t_2$
    State     (1) $s_0$           (4) $s_3$ (20¢)     (7) $s_0$
    Input     (2) 25¢             (5) B
    Output    (3) 5¢ change       (6) S

What has happened in the case of this vending machine can be abstracted to help in the
                              analysis of certain aspects of digital computers and telephone communication systems.
                                 The major features of such a machine are as follows:
                                  1) The machine can be in only one of finitely many states at a given time. These states
                                     are called the internal states of the machine, and at a given time the total memory
                                     available to the machine is the knowledge of which internal state it is in at that
                                     moment.
                                  2) The machine will accept as input only a finite number of symbols, which collectively
                                     are referred to as the input alphabet $\mathcal{I}$. In the vending machine example, the input
                                     alphabet is {nickel, dime, quarter, W, B}, each item of which is recognized by each
                                     internal state.
                                  3) An output and a next state are determined by each combination of inputs and internal
                                     states. The finite set of all possible outputs constitutes the output alphabet $\mathcal{O}$ for the
                                     machine.
                                  4) We assume that the sequential processings of the machine are synchronized by sepa-
                                     rate and distinct clock pulses and that the machine operates in a deterministic manner,
                                     where the output is completely determined by the total input provided and the starting
                                     state of the machine.
                                  These observations lead us to the following definition.

Definition 6.13   A finite state machine is a five-tuple $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$, where $S$ = the set of internal states for $M$; $\mathcal{I}$ = the input alphabet for $M$; $\mathcal{O}$ = the output alphabet for $M$; $\nu: S \times \mathcal{I} \to S$ is the next state function; and $\omega: S \times \mathcal{I} \to \mathcal{O}$ is the output function.

Using the notation of this definition, if the machine is in state $s$ at time $t_i$ and we input $x$ at this time, then the output at time $t_i$ is $\omega(s, x)$. This output is followed by a transition of the machine at time $t_{i+1}$ to the next internal state given by $\nu(s, x)$.
    We assume that when a finite state machine receives its first input, we are at time $t_0 = 0$ and the machine is in a designated starting state denoted by $s_0$. Our development will concentrate primarily on the output and state transitions that take place sequentially, with little or no reference to the sequence of clock pulses at times $t_0, t_1, t_2, \ldots$.
    Since the sets $S$, $\mathcal{I}$, and $\mathcal{O}$ are finite, it is possible to represent $\nu$ and $\omega$, for a given finite state machine, by means of a table that lists $\nu(s, x)$ and $\omega(s, x)$ for all $s \in S$ and all $x \in \mathcal{I}$. Such a table is referred to as the state table or transition table for the given machine. A second representation of the machine is made by means of a state diagram.
    We demonstrate the state table and state diagram in the following examples.

EXAMPLE 6.17   Consider the finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$, where $S = \{s_0, s_1, s_2\}$, $\mathcal{I} = \mathcal{O} = \{0, 1\}$, and $\nu$, $\omega$ are given by the state table in Table 6.3. The first column of the table lists the (present) states for the machine. The entries in the second row are the elements of the input alphabet $\mathcal{I}$, listed once under $\nu$ and then again under $\omega$. The six numbers in the last two columns (and last three rows) are elements of the output alphabet $\mathcal{O}$.

Table 6.3

                    $\nu$                   $\omega$
               0         1             0         1
    $s_0$    $s_0$     $s_1$           0         0
    $s_1$    $s_2$     $s_1$           0         0
    $s_2$    $s_0$     $s_1$           0         1

To calculate $\nu(s_1, 1)$, for example, we find $s_1$ in the column of present states and proceed horizontally over from $s_1$ until we are below the entry 1 in the section of the table for $\nu$. This entry gives $\nu(s_1, 1) = s_1$. In the same way we find $\omega(s_1, 1) = 0$.
    With $s_0$ designated as the starting state, if the input provided to $M$ is the string 1010, then the output is 0010, as demonstrated in Table 6.4. Here the machine is left in state $s_2$, so that if we had another input string, we would provide the first character of that string, here 0, at state $s_2$ unless the machine is reset to start once again at $s_0$.

Table 6.4

    State     $s_0$                    $\nu(s_0, 1) = s_1$       $\nu(s_1, 0) = s_2$       $\nu(s_2, 1) = s_1$       $\nu(s_1, 0) = s_2$
    Input     1                        0                         1                         0                         0
    Output    $\omega(s_0, 1) = 0$     $\omega(s_1, 0) = 0$      $\omega(s_2, 1) = 1$      $\omega(s_1, 0) = 0$
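The computation recorded in Table 6.4 can be reproduced with a short simulation. The following is a minimal sketch in which the two dictionaries transcribe Table 6.3; the names `NEXT`, `OUT`, and `run` are ours and are not notation from the text.

```python
# Table 6.3: next state function ν and output function ω, keyed by (state, input).
NEXT = {("s0", "0"): "s0", ("s0", "1"): "s1",
        ("s1", "0"): "s2", ("s1", "1"): "s1",
        ("s2", "0"): "s0", ("s2", "1"): "s1"}
OUT = {("s0", "0"): "0", ("s0", "1"): "0",
       ("s1", "0"): "0", ("s1", "1"): "0",
       ("s2", "0"): "0", ("s2", "1"): "1"}

def run(inputs, state="s0"):
    """Feed the input string to the machine; return (output string, final state)."""
    outputs = []
    for x in inputs:
        outputs.append(OUT[(state, x)])   # the output is produced first,
        state = NEXT[(state, x)]          # then the machine changes state
    return "".join(outputs), state

print(run("1010"))                        # ('0010', 's2')
```

Passing a different `state` argument models continuing from wherever the machine was left, instead of resetting it to $s_0$.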

Since we are primarily interested in the output, not in the sequence of transition states, the same machine can be represented by means of a state diagram. Here we can obtain the output string without actually listing the transition states. In such a diagram each internal state $s$ is represented by a circle with $s$ inside of it. For states $s_i$ and $s_j$, if $\nu(s_i, x) = s_j$ for $x \in \mathcal{I}$, and $\omega(s_i, x) = y$ for $y \in \mathcal{O}$, we represent this in the state diagram by drawing a directed edge (or arc) from the circle for $s_i$ to the circle for $s_j$ and labeling the arc with the input $x$ and output $y$ as shown in Fig. 6.1.

Figure 6.1

    With these conventions, the state diagram for the machine $M$ of Table 6.3 is shown in Fig. 6.2. Although the table is more compact, the diagram enables us to follow an input string through each transition state it determines, picking up each of the corresponding output symbols before each transition. Here if the input string is 00110101, then starting at state $s_0$, the first input of 0 yields an output of 0 and returns us to $s_0$. The next input of 0 yields the same result, but for the third input, 1, the output is 0 and we are now in state $s_1$. Continuing in this manner, we arrive at the output string 00000101 and finish in state $s_1$. (We note that the input string 00110101 is an element of $\mathcal{I}^*$, the Kleene closure of $\mathcal{I}$, and that the output string is in $\mathcal{O}^*$, the Kleene closure of $\mathcal{O}$.)
    Starting at $s_0$, what is the output string for the input string 1100101101?

Figure 6.2

EXAMPLE 6.18   For the vending machine described earlier in this section, we have the state table, Table 6.5, with
    1) $S = \{s_0, s_1, s_2, s_3, s_4\}$, where at state $s_k$, for each $0 \le k \le 4$, the machine remembers retaining $5k$ cents.
    2) $\mathcal{I}$ = {5¢, 10¢, 25¢, B, W}, where B denotes the black button one presses for a package of spearmint-flavored chewing gum and W the white button for a package of peppermint-flavored chewing gum.
    3) $\mathcal{O}$ = {n (nothing), P (peppermint chewing gum), S (spearmint chewing gum), 5¢, 10¢, 15¢, 20¢, 25¢}.

Table 6.5

                              $\nu$                                            $\omega$
              5¢      10¢     25¢     B       W           5¢      10¢     25¢     B       W
    $s_0$    $s_1$   $s_2$   $s_4$   $s_0$   $s_0$        n       n       5¢      n       n
    $s_1$    $s_2$   $s_3$   $s_4$   $s_1$   $s_1$        n       n       10¢     n       n
    $s_2$    $s_3$   $s_4$   $s_4$   $s_2$   $s_2$        n       n       15¢     n       n
    $s_3$    $s_4$   $s_4$   $s_4$   $s_3$   $s_3$        n       5¢      20¢     n       n
    $s_4$    $s_4$   $s_4$   $s_4$   $s_0$   $s_0$        5¢      10¢     25¢     S       P
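As a check on Table 6.5, the two purchases described earlier in this section can be replayed against it. The sketch below transcribes the table row by row; the identifiers (`INPUTS`, `NEXT_ROWS`, `OUT_ROWS`, `run`) and the spelling of coins as strings such as `"5c"` are assumptions of ours, made only for the illustration.

```python
INPUTS = ["5c", "10c", "25c", "B", "W"]          # column order of Table 6.5

NEXT_ROWS = {                                     # next state function ν
    "s0": ["s1", "s2", "s4", "s0", "s0"],
    "s1": ["s2", "s3", "s4", "s1", "s1"],
    "s2": ["s3", "s4", "s4", "s2", "s2"],
    "s3": ["s4", "s4", "s4", "s3", "s3"],
    "s4": ["s4", "s4", "s4", "s0", "s0"],
}
OUT_ROWS = {                                      # output function ω ("n" = nothing)
    "s0": ["n", "n", "5c", "n", "n"],
    "s1": ["n", "n", "10c", "n", "n"],
    "s2": ["n", "n", "15c", "n", "n"],
    "s3": ["n", "5c", "20c", "n", "n"],
    "s4": ["5c", "10c", "25c", "S", "P"],
}

def run(inputs, state="s0"):
    """Process a list of inputs; return (list of outputs, final state)."""
    outputs = []
    for x in inputs:
        col = INPUTS.index(x)
        outputs.append(OUT_ROWS[state][col])
        state = NEXT_ROWS[state][col]
    return outputs, state

print(run(["5c", "5c", "10c", "W"]))   # Mary Jo: (['n', 'n', 'n', 'P'], 's0')
print(run(["25c", "B"]))               # Rizzo:   (['5c', 'S'], 's0')
```

In both runs the machine ends in $s_0$, ready for the next customer. Pressing a button before 20¢ has been deposited leaves the state unchanged and produces no output.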

As we observed in the discussion just prior to Example 6.18, for a general finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$, the input can be realized as an element of $\mathcal{I}^*$, with the output from $\mathcal{O}^*$. Consequently, it is to our advantage to extend the domains of $\nu$ and $\omega$ from $S \times \mathcal{I}$ to $S \times \mathcal{I}^*$. For $\omega$ we enlarge the codomain to $\mathcal{O}^*$, recalling, should the need arise, that both $\mathcal{I}^*$ and $\mathcal{O}^*$ contain the empty string, $\lambda$. With these extensions, if $x_1 x_2 \cdots x_k \in \mathcal{I}^*$, for $k \in \mathbf{Z}^+$, then starting at any state $s_1 \in S$, we have

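$$
\begin{aligned}
\nu(s_1, x_1) &= s_2\,^{\dagger} \\
\nu(s_1, x_1x_2) &= \nu(\nu(s_1, x_1), x_2) = \nu(s_2, x_2) = s_3 \\
\nu(s_1, x_1x_2x_3) &= \nu(\underbrace{\nu(\underbrace{\nu(s_1, x_1)}_{s_2}, x_2)}_{\nu(s_2, x_2) = s_3}, x_3) = \nu(s_3, x_3) = s_4 \\
&\;\;\vdots \\
\nu(s_1, x_1x_2 \cdots x_k) &= \nu(s_k, x_k) = s_{k+1},
\end{aligned}
$$

and

$$
\begin{aligned}
\omega(s_1, x_1) &= y_1 \\
\omega(s_1, x_1x_2) &= \omega(s_1, x_1)\,\omega(\nu(s_1, x_1), x_2) = \omega(s_1, x_1)\,\omega(s_2, x_2) = y_1y_2 \\
\omega(s_1, x_1x_2x_3) &= \omega(s_1, x_1)\,\omega(s_2, x_2)\,\omega(s_3, x_3) = y_1y_2y_3 \\
&\;\;\vdots \\
\omega(s_1, x_1x_2 \cdots x_k) &= \omega(s_1, x_1)\,\omega(s_2, x_2) \cdots \omega(s_k, x_k) = y_1y_2 \cdots y_k \in \mathcal{O}^*.
\end{aligned}
$$

Also, $\nu(s_1, \lambda) = s_1$ for all $s_1 \in S$.
(We shall use these extensions again in Chapter 7.)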
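The extended functions can be sketched in a few lines of Python. This is only an illustration, not part of the text: the function name `run`, the dictionary encoding, and the ASCII symbols "5c", "10c", "25c" (standing in for the coins) are assumptions, and the rows are the entries of Table 6.5 as read from the table.

```python
# Rows of Table 6.5 (vending machine). Column order: 5c, 10c, 25c, B, W.
# The ASCII names "5c", "10c", "25c" are stand-ins for the coin symbols.
COLUMNS = ["5c", "10c", "25c", "B", "W"]
NU_ROWS = {
    "s0": ["s1", "s2", "s4", "s0", "s0"],
    "s1": ["s2", "s3", "s4", "s1", "s1"],
    "s2": ["s3", "s4", "s4", "s2", "s2"],
    "s3": ["s4", "s4", "s4", "s3", "s3"],
    "s4": ["s4", "s4", "s4", "s0", "s0"],
}
OMEGA_ROWS = {
    "s0": ["n", "n", "5c", "n", "n"],
    "s1": ["n", "n", "10c", "n", "n"],
    "s2": ["n", "n", "15c", "n", "n"],
    "s3": ["n", "5c", "20c", "n", "n"],
    "s4": ["5c", "10c", "25c", "S", "P"],
}
NU = {(s, c): t for s, row in NU_ROWS.items() for c, t in zip(COLUMNS, row)}
OMEGA = {(s, c): y for s, row in OMEGA_ROWS.items() for c, y in zip(COLUMNS, row)}


def run(state, inputs):
    """Extended nu and omega: return (final state, list of outputs)."""
    outputs = []
    for symbol in inputs:
        outputs.append(OMEGA[(state, symbol)])  # output depends on the current state
        state = NU[(state, symbol)]             # then the machine changes state
    return state, outputs


# Two nickels and a dime, then the white button (peppermint).
print(run("s0", ["5c", "5c", "10c", "W"]))
# The empty input string leaves the state unchanged and produces no output.
print(run("s3", []))
```

Representing the output string as a list keeps multi-character outputs such as "10c" unambiguous; for one-character alphabets a plain string would do as well.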
We close this section with an example that is relevant in computer science.

EXAMPLE 6.19   Let $x = x_5x_4x_3x_2x_1 = 00111$ and $y = y_5y_4y_3y_2y_1 = 01101$ be binary numbers where $x_1$ and $y_1$ are the least significant bits. The leading 0's in $x$ and $y$ are there to make the strings for $x$ and $y$ of equal length and to guarantee enough places to complete the sum. A serial binary adder is a finite state machine that we can use to obtain $x + y$. The diagram in Fig. 6.3 illustrates this, where $z = z_5z_4z_3z_2z_1$ has the least significant bit $z_1$.

[The inputs $x = x_5x_4x_3x_2x_1$ and $y = y_5y_4y_3y_2y_1$ enter the serial binary adder, which produces the output $z = z_5z_4z_3z_2z_1$.]
Figure 6.3

In the addition $z = x + y$, we have

       x  =    0  0  1  1  1
     + y  =  + 0  1  1  0  1
       z  =    1  0  1  0  0
                     ^     ^
                 third     first
              addition     addition

We note that for the first addition $x_1 = y_1 = 1$ and $z_1 = 0$, whereas for the third addition we have $x_3 = y_3 = 1$ and $z_3 = 1$ because of a carry from the addition of $x_2$ and $y_2$ (and the carry from $x_1 + y_1$). Consequently, each output depends on the sum of two inputs and the ability to remember a carry of 0 or 1, which is crucial when the carry is 1.

† The state $s_2$ is determined by $s_1$ and $x_1$. It is not simply the second in a predetermined list of states.

The serial binary adder is modeled by a finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ as follows. The set $S = \{s_0, s_1\}$, where $s_i$ indicates a carry of $i$; $\mathcal{I} = \{00, 01, 10, 11\}$, so there is a pair of inputs, depending on whether we are seeking $0 + 0$, $0 + 1$, $1 + 0$, or $1 + 1$, respectively; and $\mathcal{O} = \{0, 1\}$. The functions $\nu$, $\omega$ are given in the state table (Table 6.6) and the state diagram (Fig. 6.4).

Table 6.6

                 ν                        ω
         00    01    10    11       00   01   10   11
  s_0    s_0   s_0   s_0   s_1      0    1    1    0
  s_1    s_0   s_1   s_1   s_1      1    0    0    1
In Table 6.6 we find, for example, that $\nu(s_1, 01) = s_1$ and $\omega(s_1, 01) = 0$, because $s_1$ indicates a carry of 1 from the addition of the previous bits. The 01 input indicates that we are adding 0 and 1 (and carrying a 1). Hence the sum is 10 and $\omega(s_1, 01) = 0$ for the 0 in 10. The carry is again remembered in $s_1 = \nu(s_1, 01)$.
From the state diagram (Fig. 6.4) we see that the starting state must be $s_0$ because there is no carry prior to the addition of the least significant bits.
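As a check on Table 6.6, the adder can be simulated directly. The following is a minimal sketch, assuming the table entries as read above; the name `serial_add` and the string encoding of bit pairs are illustrative choices, not notation from the text.

```python
# State table of the serial binary adder (Table 6.6).
# A state records the carry; an input is the pair of bits being added.
NU = {
    ("s0", "00"): "s0", ("s0", "01"): "s0", ("s0", "10"): "s0", ("s0", "11"): "s1",
    ("s1", "00"): "s0", ("s1", "01"): "s1", ("s1", "10"): "s1", ("s1", "11"): "s1",
}
OMEGA = {
    ("s0", "00"): "0", ("s0", "01"): "1", ("s0", "10"): "1", ("s0", "11"): "0",
    ("s1", "00"): "1", ("s1", "01"): "0", ("s1", "10"): "0", ("s1", "11"): "1",
}


def serial_add(x: str, y: str) -> str:
    """Add two equal-length binary strings, feeding bit pairs least significant first."""
    if len(x) != len(y):
        raise ValueError("pad x and y with leading 0's to the same length")
    state = "s0"                      # no carry before the first addition
    bits = []
    for xb, yb in zip(reversed(x), reversed(y)):
        pair = xb + yb
        bits.append(OMEGA[(state, pair)])
        state = NU[(state, pair)]
    return "".join(reversed(bits))    # most significant bit first


print(serial_add("00111", "01101"))   # the sum from Example 6.19
```

A carry left over after the last pair is dropped, which is why the example pads $x$ and $y$ with leading 0's.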

Figure 6.4

The state diagrams in Figs. 6.2 and 6.4 are examples of labeled directed graphs. We
                                shall see more about graph theory throughout the text, for it has applications not only in
                                computer science and electrical engineering but also in coding theory (prefix codes) and
                                optimization (transport networks).

EXERCISES 6.2

1. Using the finite state machine of Example 6.17, find the output for each of the following input strings $x \in \mathcal{I}^*$, and determine the last internal state in the transition process. (Assume that we always start at $s_0$.)
   a) $x = 1010101$
   b) $x = 1001001$
   c) $x = 101001000$
2. For the finite state machine of Example 6.17, an input string $x$, starting at state $s_0$, produces the output string 00101. Determine $x$.
3. Let $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ be a finite state machine where $S = \{s_0, s_1, s_2, s_3\}$, $\mathcal{I} = \{a, b, c\}$, $\mathcal{O} = \{0, 1\}$, and $\nu$, $\omega$ are determined by Table 6.7.
   a) Starting at $s_0$, what is the output for the input string $abbccc$?
   b) Draw the state diagram for this finite state machine.

Table 6.7

              ν                  ω
         a     b     c       a    b    c
  s_0    s_0   s_3   s_2     0    1    1
  s_1    s_1   s_1   s_3     0    0    1
  s_2    s_1   s_1   s_3     1    1    0
  s_3    s_2   s_3   s_0     1    0    1

4. Give the state table and the state diagram for the vending machine of Example 6.18 if the cost of a package of chewing gum (peppermint or spearmint) is increased to 25¢.
5. A finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ has $\mathcal{I} = \mathcal{O} = \{0, 1\}$ and is determined by the state diagram shown in Fig. 6.5.

   Figure 6.5

   a) Determine the output string for the input string 110111, starting at $s_0$. What is the last transition state?
   b) Answer part (a) for the same string but with $s_1$ as the starting state. What about $s_2$ and $s_3$ as starting states?
   c) Find the state table for this machine.
   d) In which state should we start so that the input string 10010 produces the output 10000?
   e) Determine an input string $x \in \mathcal{I}^*$ of minimal length, such that $\nu(s_4, x) = s_1$. Is $x$ unique?
6. Machine $M$ has $\mathcal{I} = \{0, 1\} = \mathcal{O}$ and is determined by the state diagram shown in Fig. 6.6.

   Figure 6.6

   a) Describe in words what this finite state machine does.
   b) What must state $s_1$ remember?
   c) Find two languages $A, B \subseteq \mathcal{I}^*$ such that for every $x \in AB$, $\omega(s_0, x)$ has 1 as a suffix.
7. a) If $S$, $\mathcal{I}$, and $\mathcal{O}$ are finite sets, with $|S| = 3$, $|\mathcal{I}| = 5$, and $|\mathcal{O}| = 2$, determine (i) $|S \times \mathcal{I}|$; (ii) the number of functions $\nu: S \times \mathcal{I} \to S$; and (iii) the number of functions $\omega: S \times \mathcal{I} \to \mathcal{O}$.
   b) For $S$, $\mathcal{I}$, and $\mathcal{O}$ in part (a), how many finite state machines do they determine?
8. Let $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ be a finite state machine with $\mathcal{I} = \mathcal{O} = \{0, 1\}$ and $S$, $\nu$, and $\omega$ determined by the state diagram shown in Fig. 6.7.

   Figure 6.7

   a) Find the output for the input string $x = 0110111011$.
   b) Give the transition table for this finite state machine.
   c) Starting in state $s_0$, if the output for an input string $x$ is 0000001, determine all possibilities for $x$.
   d) Describe in words what this finite state machine does.
9. a) Find the state table for the finite state machine in Fig. 6.8, where $\mathcal{I} = \mathcal{O} = \{0, 1\}$.

   Figure 6.8

   b) Let $x \in \mathcal{I}^*$ with $\|x\| = 4$. If 1 is a suffix of $\omega(s_0, x)$, what are the possibilities for the string $x$?
   c) Let $A \subseteq \{0, 1\}^*$ be the language where $\omega(s_0, x)$ has 1 as a suffix for all $x$ in $A$. Determine $A$.
   d) Find the language $A \subseteq \{0, 1\}^*$ where $\omega(s_0, x)$ has 111 as a suffix for all $x$ in $A$.

6.3
Finite State Machines: A Second Encounter
                             Having seen some examples of finite state machines, we turn to the study of some additional
                             machines that are relevant to the design of computer hardware. One important type of
                             machine is the sequence recognizer.

EXAMPLE 6.20   Here, $\mathcal{I} = \mathcal{O} = \{0, 1\}$, and we want to construct a machine that recognizes each occurrence of the sequence 111 as it is encountered in an input string $x \in \mathcal{I}^*$. For example, if $x = 1110101111$, then the corresponding output should be 0010000011, where a 1 in the $i$th position of the output indicates that a 1 can be found in positions $i$, $i - 1$, and $i - 2$ of $x$. Here overlapping of sequences of 111 can occur, so some characters in the input string can be thought of as characters in more than one triple of 1's.

Letting $s_0$ denote the starting state, we realize that we must have a state to remember 1 (the possible start of 111) and a state to remember 11. In addition, any time our input symbol is 0, we go back to $s_0$ and start the search for three successive 1's over again.

In Fig. 6.9, $s_1$ remembers a single 1, and $s_2$ remembers the string 11. If $s_2$ is reached, then a third “1” indicates the occurrence of the triple in the input string, and the output 1 recognizes this occurrence. But this third “1” also means that we have the first two 1's of another possible triple coming up in the string (as happens in 11101011 “1” 1). So after recognizing the occurrence of 111 with an output of 1, we return to state $s_2$ to remember the two inputs of 1 “1”.

Figure 6.9

If we are concerned with recognizing all strings that end in 111, then for each $x \in \mathcal{I}^*$, the machine will recognize such a sequence with final output 1. This machine is then a recognizer of the language $A = \{0, 1\}^*\{111\}$.
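The three-state recognizer just described can be written out and run on the sample input. This sketch encodes the transitions as they are described in the text for Fig. 6.9 (the figure itself is not reproduced here); the name `recognize_111` and the dictionary layout are illustrative assumptions.

```python
# Each entry maps (state, input) to (next state, output), following the
# description of Fig. 6.9: s1 remembers "1", s2 remembers "11", and any 0
# sends the machine back to s0.
TABLE = {
    ("s0", "0"): ("s0", "0"), ("s0", "1"): ("s1", "0"),
    ("s1", "0"): ("s0", "0"), ("s1", "1"): ("s2", "0"),
    ("s2", "0"): ("s0", "0"), ("s2", "1"): ("s2", "1"),  # third 1: output 1, stay in s2
}


def recognize_111(x: str, start: str = "s0") -> str:
    """Return the output string omega(start, x)."""
    state, out = start, []
    for symbol in x:
        state, y = TABLE[(state, symbol)]
        out.append(y)
    return "".join(out)


x = "1110101111"
print(recognize_111(x))
# x belongs to {0,1}*{111} exactly when the final output symbol is 1.
print(recognize_111(x).endswith("1") == x.endswith("111"))
```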

Another finite state machine that recognizes the same triple 111 is shown in Fig. 6.10. The finite state machines represented by the state diagrams in Figs. 6.9 and 6.10 perform the same task and are said to be equivalent. The state diagram in Fig. 6.10 has one more state than that in Fig. 6.9, but at this stage we are not overly concerned with getting a finite state machine with a minimal number of states. In Chapter 7 we shall develop a technique to take a given finite state machine $M$ and find one that is equivalent to it and has the smallest number of internal states needed.

Figure 6.10

The next example is a bit more selective.

EXAMPLE 6.21   Now we want not only to recognize the occurrence of 111 but to recognize only those occurrences that end in a position that is a multiple of three. Consequently, with $\mathcal{I} = \mathcal{O} = \{0, 1\}$, if $x \in \mathcal{I}^*$, where $x = 1110111$, then we want $\omega(s_0, x) = 0010000$, not 0010001. In addition, for $x \in \mathcal{I}^*$, where $x = 111100111$, the output $\omega(s_0, x)$ is to be 001000001, not 001100001, for here, because of length considerations, overlapping of sequences of 111 is not allowed.

Again we start at $s_0$ (Fig. 6.11), but now $s_1$ must remember a first 1 only if it occurs in $x$ in position 1, 4, 7, .... If the input at $s_0$ is 0, we cannot simply return to $s_0$ as in Example 6.20. We must remember that this 0 is the first of three symbols of no interest. Hence from $s_0$ we go to $s_3$ and then to $s_4$, processing any triple of the form $0yz$ where 0 occurs in $x$ in position $3k + 1$, $k \ge 0$. The same type of situation happens at $s_1$ if the input is 0. Finally, at $s_2$ the sequence 111 is recognized with an output of 1, if it occurs. The machine then returns to $s_0$ to input the next symbol of the input string.

Figure 6.11
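The five-state machine of this example can be sketched as follows. The transitions are inferred from the prose description of Fig. 6.11, so the table is an assumption about the figure rather than a copy of it; the name `recognize_111_blocks` is hypothetical.

```python
# (state, input) -> (next state, output), inferred from the text:
#   s0: start of a block of three symbols
#   s1: block began with 1        s2: block began with 11
#   s3: block began with 0        s4: two symbols read, block cannot be 111
TABLE = {
    ("s0", "1"): ("s1", "0"), ("s0", "0"): ("s3", "0"),
    ("s1", "1"): ("s2", "0"), ("s1", "0"): ("s4", "0"),
    ("s2", "1"): ("s0", "1"), ("s2", "0"): ("s0", "0"),
    ("s3", "0"): ("s4", "0"), ("s3", "1"): ("s4", "0"),
    ("s4", "0"): ("s0", "0"), ("s4", "1"): ("s0", "0"),
}


def recognize_111_blocks(x: str) -> str:
    """Output 1 only where 111 ends in a position that is a multiple of three."""
    state, out = "s0", []
    for symbol in x:
        state, y = TABLE[(state, symbol)]
        out.append(y)
    return "".join(out)


print(recognize_111_blocks("1110111"))
print(recognize_111_blocks("111100111"))
```

Both printed strings can be compared with the outputs required in the example.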

EXAMPLE 6.22   Figure 6.12 shows the state diagrams for finite state machines that will recognize the occurrence of the sequence 0101 in an input string $x \in \mathcal{I}^*$, where $\mathcal{I} = \mathcal{O} = \{0, 1\}$. The machine in Fig. 6.12(a) recognizes with an output of 1 each occurrence of 0101 in an input string, regardless of where it occurs. In Fig. 6.12(b) the machine recognizes with an output of 1 only those prefixes of $x$ whose length is a multiple of four and that end in 0101. (Hence no overlapping is allowed here.) Consequently, for $x = 01010100101$, $\omega(s_0, x) = 00010100001$ for (a), whereas for (b), $\omega(s_0, x) = 00010000000$.
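The two behaviors can be cross-checked against a direct specification. The sketch below is not the state machines of Fig. 6.12; it simply inspects each prefix of $x$, which is an assumption-free way to state what the outputs should be. The function names are hypothetical.

```python
def ends_with_pattern(x: str, pattern: str) -> str:
    """Output 1 at position i whenever the prefix of length i ends in `pattern`."""
    return "".join(
        "1" if x[:i].endswith(pattern) else "0"
        for i in range(1, len(x) + 1)
    )


def ends_with_pattern_in_blocks(x: str, pattern: str) -> str:
    """As above, but only for prefixes whose length is a multiple of len(pattern)."""
    k = len(pattern)
    return "".join(
        "1" if i % k == 0 and x[:i].endswith(pattern) else "0"
        for i in range(1, len(x) + 1)
    )


x = "01010100101"
print(ends_with_pattern(x, "0101"))            # behavior of machine (a)
print(ends_with_pattern_in_blocks(x, "0101"))  # behavior of machine (b)
```

Unlike a finite state machine, these functions re-read the prefix at every step; they are useful only as a reference against which a state table can be tested.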

Now that we have examined some finite state machines that serve as sequence recognizers, it is only fair to consider a set of sequences that cannot be recognized by a finite state machine. This example gives us another opportunity to apply the pigeonhole principle.

Figure 6.12 (a) (b)

EXAMPLE 6.23   Let $\mathcal{I} = \mathcal{O} = \{0, 1\}$. Can we construct a finite state machine that recognizes precisely those strings in the language $A = \{01, 0011, 000111, \ldots\} = \{0^i1^i \mid i \in \mathbf{Z}^+\}$? If we can, then if $s_0$ denotes the starting state, we shall expect $\omega(s_0, 01) = 01$, $\omega(s_0, 0011) = 0011$, and, in general, $\omega(s_0, 0^i1^i) = 0^i1^i$, for all $i \in \mathbf{Z}^+$. [Note: Here, for example, we want $\omega(s_0, 0011) = 0011$, where the first 1 in the output is for recognition of the substring 01 and the second 1 is for recognition of the string 0011.]

Suppose that there is a finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ that can recognize precisely those strings in $A$. Let $s_0 \in S$, where $s_0$ is the starting state, and let $|S| = n \ge 1$. Now consider the string $0^{n+1}1^{n+1}$ in the language $A$. If our machine $M$ is to operate correctly, then we want $\omega(s_0, 0^{n+1}1^{n+1}) = 0^{n+1}1^{n+1}$. Therefore, we see in Table 6.8 how this finite state machine will process the $n + 1$ 0's, starting at the state $s_0$, then continuing at the $n$ states $s_1 = \nu(s_0, 0)$, $s_2 = \nu(s_1, 0)$, ..., and $s_n = \nu(s_{n-1}, 0)$. Since $|S| = n$, by applying the pigeonhole principle to the $n + 1$ states $s_0, s_1, s_2, \ldots, s_{n-1}, s_n$, we realize that there are two states $s_i$ and $s_j$ where $i < j$ but $s_i = s_j$.
Table 6.8

  State    s_0   s_1   s_2   ...   s_{n-1}   s_n   s_{n+1}   ...   s_{2n}   s_{2n+1}
  Input    0     0     0     ...   0         0     1         ...   1        1
  Output   0     0     0     ...   0         0     1         ...   1        1

Now in Table 6.9 we see how the removal of the $j - i$ columns (for states $s_{i+1}, \ldots, s_j$) results in Table 6.10. This table shows us that the finite state machine $M$ recognizes the string $x = 0^{(n+1)-(j-i)}1^{n+1}$, where $n + 1 - (j - i) < n + 1$. Unfortunately $x \notin A$, so $M$ recognizes a string that it is not supposed to recognize. This demonstrates that we cannot construct a finite state machine that recognizes precisely those strings in the language $A = \{0^i1^i \mid i \in \mathbf{Z}^+\}$.

Table 6.9

  State    s_0   s_1   s_2   ...   s_i   s_{i+1}   ...   s_j   s_{j+1}   ...   s_n   s_{n+1}   ...   s_{2n}   s_{2n+1}
  Input    0     0     0     ...   0     0         ...   0     0         ...   0     1         ...   1        1
  Output   0     0     0     ...   0     0         ...   0     0         ...   0     1         ...   1        1

Table 6.10

  State    s_0   s_1   s_2   ...   s_i   s_{j+1}   ...   s_n   s_{n+1}   ...   s_{2n}   s_{2n+1}
  Input    0     0     0     ...   0     0         ...   0     1         ...   1        1
  Output   0     0     0     ...   0     0         ...   0     1         ...   1        1
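The pigeonhole step can be made concrete on any particular machine. The sketch below uses a hypothetical three-state machine (the states "a", "b", "c" and its transition table are invented for illustration) and finds indices $i < j$ with $s_i = s_j$ among the states visited while reading the 0's. It then confirms that deleting $j - i$ of the 0's leaves the machine in the same state when the 1's begin, so the outputs on the 1's cannot differ.

```python
def states_on_zeros(nu, start, count):
    """Return the states s_0, s_1, ..., s_count visited while reading `count` 0's."""
    states = [start]
    for _ in range(count):
        states.append(nu[(states[-1], "0")])
    return states


def find_repeat(nu, start, n):
    """Return (i, j) with i < j <= n and s_i == s_j; exists if the machine has n states."""
    first_seen = {}
    for index, state in enumerate(states_on_zeros(nu, start, n)):
        if state in first_seen:
            return first_seen[state], index
        first_seen[state] = index
    raise ValueError("no repeat found: the machine has more than n states")


def final_state(nu, start, word):
    """Extended transition function nu(start, word)."""
    state = start
    for symbol in word:
        state = nu[(state, symbol)]
    return state


# Hypothetical machine with n = 3 states, used only for illustration.
NU = {
    ("a", "0"): "b", ("b", "0"): "c", ("c", "0"): "b",
    ("a", "1"): "a", ("b", "1"): "a", ("c", "1"): "c",
}
n = 3
i, j = find_repeat(NU, "a", n)
all_zeros = "0" * (n + 1)
fewer_zeros = "0" * (n + 1 - (j - i))
print(i, j)
print(final_state(NU, "a", all_zeros) == final_state(NU, "a", fewer_zeros))
```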

A class of finite state machines that is important in the design of digital devices consists of the $k$-unit delay machines, where $k \in \mathbf{Z}^+$. For $k = 1$, we want to construct a machine $M$ such that if $x = x_1x_2 \cdots x_{m-1}x_m$, then for starting state $s_0$, $\omega(s_0, x) = 0x_1x_2 \cdots x_{m-1}$, so that the output is the input delayed one time unit (clock pulse). [The use of 0 as the first symbol in $\omega(s_0, x)$ is conventional.]

EXAMPLE 6.24   Let $\mathcal{I} = \mathcal{O} = \{0, 1\}$. With starting state $s_0$, $\omega(s_0, x) = 0$ for $x = 0$ or 1 because the first output is 0; the states $s_1$ and $s_2$ (in Fig. 6.13) remember a prior input of 0 and 1, respectively. In the figure, we label, for example, the arc from $s_1$ to $s_2$ with 1, 0 because with an input of 1 we need to go to $s_2$ where inputs of 1 at time $t_i$ are remembered so that they can become outputs of 1 at time $t_{i+1}$. The 0 in the label 1, 0 is the output because starting in $s_1$ indicates that the prior input was 0, which becomes the present output. The labels on the other arcs are obtained by the same type of reasoning.

Figure 6.13
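The one-unit delay is small enough to tabulate in full. The following sketch encodes the machine as described for Fig. 6.13 ($s_1$ remembers a prior 0, $s_2$ a prior 1); the function name `delay_one` is an illustrative assumption.

```python
# Next state depends only on the input: 0 leads to s1, 1 leads to s2.
NEXT_STATE = {"0": "s1", "1": "s2"}
# Output depends only on the state: s0 (start) and s1 emit 0, s2 emits 1.
OUTPUT = {"s0": "0", "s1": "0", "s2": "1"}


def delay_one(x: str) -> str:
    """Return omega(s0, x) for the one-unit delay machine."""
    state, out = "s0", []
    for symbol in x:
        out.append(OUTPUT[state])   # emit the remembered prior input
        state = NEXT_STATE[symbol]  # remember the current input
    return "".join(out)


x = "1011"
print(delay_one(x))  # 0 followed by x without its last symbol
```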

EXAMPLE 6.25   Observing the structure of a one-unit delay, we extend our ideas to the two-unit delay machine shown in Fig. 6.14. If $x \in \mathcal{I}^*$, let $x = x_1x_2 \cdots x_m$ where $m > 2$; if $s_0$ is the starting state, then $\omega(s_0, x) = 00x_1 \cdots x_{m-2}$. For states $s_0$, $s_1$, $s_2$ the output is 0 for all possible inputs. States $s_3$, $s_4$, $s_5$, and $s_6$ must remember the two prior inputs 00, 01, 10, and 11, respectively. To get the other arcs in the diagram, we shall consider one such arc and then use similar reasoning for the others. For the arc from $s_5$ to $s_3$ in Fig. 6.14(a), let the input be 0. Since the prior input to $s_5$ from $s_2$ is 0, we must go to the state that remembers the two prior inputs 00. This is state $s_3$. Going back two states from $s_5$ to $s_2$ to $s_0$, we see that the input is 1 (from $s_0$ to $s_2$). This then becomes the output (delayed two units) for the arc from $s_5$ to $s_3$. The complete machine is shown in part (b) of Fig. 6.14.

Figure 6.14 (a) (b)
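The same idea extends to any $k \in \mathbf{Z}^+$ if the state is taken to be the last $k$ inputs. The sketch below is a simplification, not the seven-state machine of Fig. 6.14: it pads the memory with 0's at the start, so the start-up states are merged with the state remembering $00 \cdots 0$. The outputs agree with those described in the example. The function name `delay` is hypothetical.

```python
from collections import deque


def delay(x: str, k: int) -> str:
    """Return the input delayed k time units, with 0's as the first k outputs."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    memory = deque("0" * k, maxlen=k)  # the state: the k most recent inputs
    out = []
    for symbol in x:
        out.append(memory[0])          # the oldest remembered input becomes the output
        memory.append(symbol)          # maxlen discards the oldest entry automatically
    return "".join(out)


print(delay("10110", 2))  # two-unit delay: 00 followed by the first three inputs
print(delay("10110", 1))  # one-unit delay for comparison
```

With this encoding a $k$-unit delay over $\{0, 1\}$ needs at most $2^k$ memory states.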

We turn now to some additional properties that arise in the study of finite state machines. The machine in Fig. 6.15 will be used for examples of the terms defined.

Definition 6.14   Let $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ be a finite state machine.

a) For $s_i, s_j \in S$, $s_j$ is said to be reachable from $s_i$ if $s_i = s_j$ or if there is an input string $x \in \mathcal{I}^+$ such that $\nu(s_i, x) = s_j$. (In Fig. 6.15, state $s_3$ is reachable from $s_0$, $s_1$, $s_2$, and $s_3$ but not from $s_4$, $s_5$, $s_6$, or $s_7$. No state is reachable from $s_3$ except $s_3$ itself.)
b) A state $s \in S$ is said to be transient if $\nu(s, x) = s$ for $x \in \mathcal{I}^*$ implies $x = \lambda$; that is, there is no $x \in \mathcal{I}^+$ with $\nu(s, x) = s$. (For the machine in Fig. 6.15, $s_0$ is the only transient state.)
c) A state $s \in S$ is called a sink, or sink state, if $\nu(s, x) = s$, for all $x \in \mathcal{I}^*$. ($s_3$ is the only sink in Fig. 6.15.)
d) Let $S_1 \subseteq S$, $\mathcal{I}_1 \subseteq \mathcal{I}$. If $\nu_1 = \nu|_{S_1 \times \mathcal{I}_1}: S_1 \times \mathcal{I}_1 \to S$ (that is, the restriction of $\nu$ to $S_1 \times \mathcal{I}_1 \subseteq S \times \mathcal{I}$) has its range within $S_1$, then with $\omega_1 = \omega|_{S_1 \times \mathcal{I}_1}$, $M_1 = (S_1, \mathcal{I}_1, \mathcal{O}, \nu_1, \omega_1)$ is called a submachine of $M$. (With $S_1 = \{s_4, s_5, s_6, s_7\}$, and $\mathcal{I}_1 = \{0, 1\}$, we get a submachine $M_1$ of the machine $M$ in Fig. 6.15.)
e) A machine is said to be strongly connected if for any states $s_i, s_j \in S$, $s_j$ is reachable from $s_i$. (The machine in Fig. 6.15 is not strongly connected, but the submachine $M_1$ in part (d) has this property.)

Figure 6.15
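The terms of Definition 6.14 translate into short graph searches. Since Fig. 6.15 is not reproduced here, the sketch below uses a small hypothetical machine (states "a" through "d", invented for illustration); all function names are likewise assumptions.

```python
from collections import deque


def reachable_by_nonempty(nu, alphabet, start):
    """States nu(start, x) for input strings x of length at least 1."""
    frontier = deque(nu[(start, a)] for a in alphabet)
    seen = set()
    while frontier:
        state = frontier.popleft()
        if state not in seen:
            seen.add(state)
            frontier.extend(nu[(state, a)] for a in alphabet)
    return seen


def is_transient(nu, alphabet, state):
    """No nonempty input string returns the machine to `state`."""
    return state not in reachable_by_nonempty(nu, alphabet, state)


def is_sink(nu, alphabet, state):
    """Every input symbol (hence every input string) leaves `state` fixed."""
    return all(nu[(state, a)] == state for a in alphabet)


def is_submachine(nu, alphabet, states):
    """The restriction of nu to `states` has its range within `states`."""
    return all(nu[(s, a)] in states for s in states for a in alphabet)


def is_strongly_connected(nu, alphabet, states):
    """Every state in `states` is reachable from every other one."""
    return all(
        set(states) <= reachable_by_nonempty(nu, alphabet, s) | {s}
        for s in states
    )


# Hypothetical machine: "a" is transient, "d" is a sink, {"b", "c"} is closed.
ALPHABET = "01"
NU = {
    ("a", "0"): "b", ("a", "1"): "d",
    ("b", "0"): "c", ("b", "1"): "b",
    ("c", "0"): "b", ("c", "1"): "c",
    ("d", "0"): "d", ("d", "1"): "d",
}
STATES = ["a", "b", "c", "d"]
print([s for s in STATES if is_transient(NU, ALPHABET, s)])
print([s for s in STATES if is_sink(NU, ALPHABET, s)])
print(is_strongly_connected(NU, ALPHABET, STATES))
print(is_submachine(NU, ALPHABET, {"b", "c"}), is_strongly_connected(NU, ALPHABET, ["b", "c"]))
```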

We close this section with a concept that uses a tree diagram.

Definition 6.15   For a finite state machine $M$, let $s_i$, $s_j$ be two distinct states in $S$. An input string $x \in \mathcal{I}^+$ is called a transfer (or transition) sequence from $s_i$ to $s_j$ if

a) $\nu(s_i, x) = s_j$, and

b) $y \in \mathcal{I}^+$ with $\nu(s_i, y) = s_j \Rightarrow \|y\| \ge \|x\|$.

There can be more than one such (shortest) sequence for two states $s_i$, $s_j$.

EXAMPLE 6.26   Find a transfer sequence from state $s_0$ to state $s_2$ for the finite state machine $M$ given by the state table in Table 6.11, where $\mathcal{I} = \mathcal{O} = \{0, 1\}$.

Table 6.11

            ν            ω
         0     1       0    1
  s_0    s_6   s_1     0    1
  s_1    s_5   s_0     0    1
  s_2    s_1   s_2     0    1
  s_3    s_4   s_0     0    1
  s_4    s_2   s_4     0    1
  s_5    s_3   s_5     1    1
  s_6    s_3   s_6     1    1

Figure 6.16

In constructing the tree diagram of Fig. 6.16, we start at state $s_0$ and find those states that can be reached from $s_0$ by using strings of length 1. Here we find $s_1$ and $s_6$. Then we do the same thing with $s_1$ and $s_6$, finding, as a result, those states reachable from $s_0$ with input strings of length 2. Continuing to expand the tree from left to right, we get to a vertex labeled with the desired state, $s_2$. Each time we reach a vertex labeled with a state used previously, we terminate that part of the expansion because we cannot reach any new states. After we arrive at the state we want, we backtrack to $s_0$ and use the state table to label the branches, as shown in Fig. 6.16. Hence, for $x = 0000$, $\nu(s_0, x) = s_2$ with $\omega(s_0, x) = 0100$. (Here $x$ is unique.)
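The tree expansion described above is a breadth-first search, which can be sketched as follows. The table entries are those of Table 6.11 as read from the table; the function names `transfer_sequence` and `output` are illustrative assumptions.

```python
from collections import deque

# Table 6.11: for each state, (value on input 0, value on input 1).
NU = {
    "s0": ("s6", "s1"), "s1": ("s5", "s0"), "s2": ("s1", "s2"),
    "s3": ("s4", "s0"), "s4": ("s2", "s4"), "s5": ("s3", "s5"),
    "s6": ("s3", "s6"),
}
OMEGA = {
    "s0": ("0", "1"), "s1": ("0", "1"), "s2": ("0", "1"),
    "s3": ("0", "1"), "s4": ("0", "1"), "s5": ("1", "1"),
    "s6": ("1", "1"),
}


def transfer_sequence(start, goal):
    """Return a shortest input string x with nu(start, x) == goal, or None."""
    if start == goal:
        raise ValueError("a transfer sequence is defined for distinct states")
    queue = deque([(start, "")])
    seen = {start}                       # states used previously are not expanded again
    while queue:
        state, word = queue.popleft()
        for symbol in "01":
            nxt = NU[state][int(symbol)]
            if nxt == goal:
                return word + symbol
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None                          # goal is not reachable from start


def output(start, word):
    """Return omega(start, word)."""
    state, out = start, []
    for symbol in word:
        out.append(OMEGA[state][int(symbol)])
        state = NU[state][int(symbol)]
    return "".join(out)


x = transfer_sequence("s0", "s2")
print(x, output("s0", x))
```

Breadth-first order guarantees that the first string reaching the goal is a shortest one; it returns one such string and does not by itself decide uniqueness.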
EXERCISES 6.3

1. Let $\mathcal{I} = \mathcal{O} = \{0, 1\}$. (a) Construct a state diagram for a finite state machine that recognizes each occurrence of 0000 in a string $x \in \mathcal{I}^*$. (Here overlapping is allowed.) (b) Construct a state diagram for a finite state machine that recognizes each string $x \in \mathcal{I}^*$ that ends in 0000 and has length $4k$, $k \in \mathbf{Z}^+$. (Here overlapping is not permitted.)
2. Answer Exercise 1 for each of the sequences 0110 and 1010.
3. Construct a state diagram for a finite state machine with $\mathcal{I} = \mathcal{O} = \{0, 1\}$ that recognizes all strings in the language $\{0, 1\}^*\{00\} \cup \{0, 1\}^*\{11\}$.
4. For $\mathcal{I} = \mathcal{O} = \{0, 1\}$ a string $x \in \mathcal{I}^*$ is said to have even parity if it contains an even number of 1's. Construct a state diagram for a finite state machine that recognizes all nonempty strings of even parity.
5. Table 6.12 defines $\nu$ and $\omega$ for a finite state machine $M$ where $\mathcal{I} = \mathcal{O} = \{0, 1\}$.
   a) Draw the state diagram for $M$.
   b) Determine the output for the following input sequences, starting at $s_0$ in each case: (i) $x = 111$; (ii) $x = 1010$; (iii) $x = 00011$.
   c) Describe in words what machine $M$ does.
   d) How is this machine related to that shown in Fig. 6.13?

Table 6.12

            ν            ω
         0     1
  s_0    s_0   s_1
  s_1    s_0   s_1

6. Show that it is not possible to construct a finite state machine that recognizes precisely those sequences in the language $A = \{0^i1^j \mid i, j \in \mathbf{Z}^+, i > j\}$. (Here the alphabet for $A$ is $\Sigma = \{0, 1\}$.)
7. For each of the machines in Table 6.13, determine the transient states, sink states, submachines (where $\mathcal{I}_1 = \{0, 1\}$), and strongly connected submachines (where $\mathcal{I}_1 = \{0, 1\}$).
8. Determine a transfer sequence from state $s_1$ to state $s_5$ in finite state machine (c) of Exercise 7. Is your sequence unique?

Table 6.13

  (a)       ν            ω
         0     1       0    1
  s_0    s_4   s_1     0    0
  s_1    s_4   s_?     0    1
  s_2    s_3   s_5     0    0
  s_3    s_2   s_5     1    0
  s_4    s_4   s_4     1    1
  s_5    s_2   s_3     0    1

  (b)       ν            ω
         0     1       0    1
  s_0    s_0   s_1     1    0
  s_1    s_0   s_1     0    1
  s_2    s_1   s_3     0    0
  s_3    s_0   s_?     0    0
  s_4    s_4   s_4     1    1

  (c)       ν            ω
         0     1       0    1
  s_0    s_1   s_2     0    1
  s_1    s_0   s_2     1    1
  s_2    s_2   s_3     1    1
  s_3    s_0   s_4     0    0
  s_4    s_5   s_5     1    0
  s_5    s_3   s_4     1    0
  s_6    s_6   s_6     0    0

6.4
       Summary and Historical Review
In this chapter we have been introduced to the theory of languages and to a discrete structure called a finite state machine. Using our prior development of elementary set theory and finite functions, we were able to combine some abstract notions and to model digital devices such as sequence recognizers and delays. Comparable coverage of this material appears in Chapter 1 of L. L. Dornhoff and F. E. Hohn [3] and in Chapter 2 of D. F. Stanat and D. F. McAllister [15].

The finite state machine we developed is based on the model put forth in 1955 by G. H. Mealy in [11] and is consequently referred to as the “Mealy machine.” The model is based on earlier concepts found in the work of D. A. Huffman [8] and E. F. Moore [13]. For further reading on the pioneering work dealing with various aspects and applications of the finite state machine, consult the material edited by E. F. Moore [14]. Additional information on the actual synthesis of such machines and on related hardware considerations, along with an extensive coverage of many related ideas, can be found in Chapters 9-15 of Z. Kohavi [9].

For more on languages and their relation to finite state machines, one should look into the UMAP module by W. J. Barnier [1], Chapter 8 of J. L. Gersting [4], and Chapters 7 and 8 of A. Gill [5]. A comprehensive coverage of these (and related) topics is given in the texts by J. G. Brookshear [2], J. E. Hopcroft and J. D. Ullman [7], H. R. Lewis and C. H. Papadimitriou [10], M. Minsky [12], and D. Wood [16].

David Hilbert (1862-1943)
Alan Mathison Turing (1912-1954). Reproduced courtesy of The Granger Collection, New York

One may be surprised to learn that the basic ideas of automata theory were developed to
                solve rather theoretical questions in the foundations of mathematics — as posed in 1900 by
                the German mathematician David Hilbert (1862-1943). In 1935 the English mathematician
                and logician Alan Mathison Turing (1912-1954) became interested in Hilbert’s decision
                problem, which asked if there could be a general method one could apply to a given state-
                ment in order to determine if that statement were true. Turing’s approach to the solution of
                this problem led him to develop what is now known as a Turing machine, the most general
                model for a computing machine. By using this model, he was able to establish very profound
                theoretical results about how computers would have to operate, before any such
                machines were actually built. During World War II Turing worked for the Foreign Office at
                Bletchley Park, where he did extensive work on the cryptanalysis of Nazi ciphers. His ef-
                forts contributed to the breaking of the mechanical cipher machine Enigma, a breakthrough
                that helped to bring about the defeat of the Third Reich. Following the war (and up to the
                time of his death), Turing’s interest in the ability of machines to think led him to play a
                major role in the development of actual (not just theoretical) computers. For more on the
                life of this interesting scholar one should look into the biography by A. Hodges [6].
334           Chapter 6 Languages: Finite State Machines

REFERENCES
                                     1. Barnier, William J. “Finite-State Machines as Recognizers” (UMAP Module 671). The UMAP
                                         Journal 7, no. 3 (1986): pp. 209-232.
                                     2. Brookshear, J. Glenn. Theory of Computation: Formal Languages, Automata, and Complexity.
                                         Reading, Mass.: Benjamin/Cummings, 1989.
                                     3. Dornhoff, Larry L., and Hohn, Franz E. Applied Modern Algebra. New York: Macmillan, 1978.
                                     4. Gersting, Judith L. Mathematical Structures for Computer Science, 5th ed. New York: Freeman,
                                         2003.
                                     5. Gill, Arthur. Applied Algebra for the Computer Sciences, Prentice-Hall Series in Automatic
                                         Computation. Englewood Cliffs, N.J.: Prentice-Hall, 1976.
                                     6. Hodges, Andrew. Alan Turing: The Enigma. New York: Simon and Schuster, 1983.
                                     7. Hopcroft, John E., and Ullman, Jeffrey D. Introduction to Automata Theory, Languages, and
                                         Computation. Reading, Mass.: Addison-Wesley, 1979.
                                     8. Huffman, D. A. “The Synthesis of Sequential Switching Circuits.” Journal of the Franklin
                                         Institute 257 (March 1954): pp. 161-190, (April 1954): pp. 275-303. Reprinted in Moore
                                         [14].
                                     9. Kohavi, Zvi. Switching and Finite Automata Theory, 2nd ed. New York: McGraw-Hill, 1978.
                                     10. Lewis, Harry R., and Papadimitriou, Christos H. Elements of the Theory of Computation, 2nd
                                         ed. Englewood Cliffs, N.J.: Prentice-Hall, 1997.
                                     11. Mealy, G. H. “A Method for Synthesizing Sequential Circuits.” Bell System Technical Journal
                                         34 (September 1955): pp. 1045-1079.
                                     12. Minsky, Marvin. Computation: Finite and Infinite Machines. Englewood Cliffs, N.J.: Prentice-
                                         Hall, 1967.
                                     13. Moore, E. F. “Gedanken-experiments on Sequential Machines.” Automata Studies, Annals of
                                        Mathematical Studies, no. 34: pp. 129-153. Princeton, N.J.: Princeton University Press, 1956.
                                     14. Moore, E. F., ed., Sequential Machines: Selected Papers. Reading, Mass.: Addison-Wesley,
                                         1964.
                                     15. Stanat, Donald F., and McAllister, David F. Discrete Mathematics in Computer Science.
                                         Englewood Cliffs, N.J.: Prentice-Hall, 1977.
                                     16. Wood, Derick. Theory of Computation. New York: Wiley, 1987.

SUPPLEMENTARY EXERCISES
1. Let Σ_1 = {w, x, y} and Σ_2 = {x, y, z} be alphabets. If
   A_1 = {x^i y^j | i, j ∈ Z^+, j > i ≥ 1},
   A_2 = {w^i y^j | i, j ∈ Z^+, i > j ≥ 1},
   A_3 = {w^i x^j y^i z^j | i, j ∈ Z^+, j > i ≥ 1}, and
   A_4 = {z^j (wz)^i w^j | i, j ∈ Z^+, i ≥ 1, j ≥ 2},
   determine whether each of the following statements is true or false.
      a) A_1 is a language over Σ_1.
      b) A_2 is a language over Σ_2.
      c) A_3 is a language over Σ_1 ∪ Σ_2.
      d) A_1 is a language over Σ_1 ∩ Σ_2.
      e) A_4 is a language over Σ_1 △ Σ_2.
      f) A_1 ∪ A_2 is a language over Σ_1.

2. For languages A, B ⊆ Σ*, does A* ⊆ B* ⇒ A ⊆ B?

3. Give an example of a language A over an alphabet Σ, where (A^2)* ≠ (A*)^2.

4. For Σ = {0, 1} consider the languages A, B, C ⊆ Σ* where A = {01, 11}, B = {01, 11, 111},
   and C = {01, 11, 1111}. (a) How are A* and B* related? (b) How about A* and C*?

5. Let M be the finite state machine shown in Fig. 6.17. For states s_i, s_j, where 0 ≤ i, j ≤ 2,
   let 𝒪_ij denote the set of all nonempty output strings that M can produce as it goes from state
   s_i to state s_j. If i = 2, j = 0, for example, 𝒪_20 = {0}{1, 00}*.
   Find 𝒪_02, 𝒪_22, 𝒪_11, 𝒪_00, and 𝒪_10.

   [Figure 6.17]

6. Let M be the finite state machine in Fig. 6.18.

   [Figure 6.18]

    a) Find the state table for this machine.
    b) Explain what this machine does.
    c) How many distinct input strings x are there such that ||x|| = 8 and ν(s_0, x) = s_0?
       How many are there with ||x|| = 12?

7. Let M = (S, ℐ, 𝒪, ν, ω) be a finite state machine with |S| = n, and let 0 ∈ ℐ.
    a) Show that for the input string 0000..., the output is eventually periodic.
    b) What is the maximum number of 0's we can input before the periodic output starts?
    c) What is the length of the maximum period that can occur?

8. For ℐ = 𝒪 = {0, 1}, let M be the finite state machine given in Table 6.14. If the starting
   state for M is not s_1, find an input string x (of smallest length) such that ν(s_i, x) = s_1,
   for all i = 2, 3, 4. (Hence x gets the machine M to state s_1 regardless of the starting state.)

   Table 6.14
                  ν             ω
               0     1       0     1
      s_1     s_4   s_3      0     0
      s_2     s_2   s_4      0     1
      s_3     s_1   s_2      1     0
      s_4     s_1   s_4      1     1

9. Let ℐ = 𝒪 = {0, 1}. Construct a state diagram for a finite state machine that reverses
   (from 0 to 1 or from 1 to 0) the symbols appearing in the 4th, in the 8th, in the 12th, ...,
   positions of an input string x ∈ ℐ^+. For example, if s_0 is the starting state, then
   ω(s_0, 0000) = 0001, ω(s_0, 000111) = 000011, and ω(s_0, 000000111) = 000100101.

10. With ℐ = 𝒪 = {0, 1}, let M be the finite state machine given in Table 6.15. Here s_0 is the
    starting state. Let A ⊆ ℐ* where x ∈ A if and only if the last symbol in ω(s_0, x) is 1.
    [There may be more than one 1 in the output string ω(s_0, x).] Construct a finite state
    machine wherein the last symbol of the output string is 1 for all y ∈ ℐ* − A.

    Table 6.15
                  ν             ω
               0     1       0     1
      s_0     s_1   s_2      1     0
      s_1     s_2   s_1      0     1
      s_2     s_2   s_3      0     1
      s_3     s_1   s_0      1     0

11. Let ℐ = 𝒪 = {0, 1} for the two finite state machines M_1 and M_2, given in Tables 6.16 and
    6.17, respectively. The starting state for M_1 is s_0, whereas s_3 is the starting state for M_2.

    Table 6.16
                 ν_1           ω_1
               0     1       0     1
      s_0     s_0   s_1      1     ?
      s_1     s_1   s_0      0     ?
      s_2     s_2   s_0      0     ?

    Table 6.17
                 ν_2           ω_2
               0     1       0     1
      s_3     s_3   s_4      1     1
      s_4      ?    s_3      1     0

    [Entries shown as ? are illegible in the source tables.]

    We connect these machines as shown in Fig. 6.19. Here each output symbol from M_1 becomes
    an input symbol for M_2. For example, if we input 0 to M_1, then ω_1(s_0, 0) = 1 and
    ν_1(s_0, 0) = s_0. As a result, we then input 1 (= ω_1(s_0, 0)) to M_2 to get
    ω_2(s_3, 1) = 1 and ν_2(s_3, 1) = s_4.

    [Figure 6.19: input → M_1 → M_2 → output]

    We construct a machine M = (S, ℐ, 𝒪, ν, ω) that represents this connection of M_1 and M_2
    as follows:

      ℐ = 𝒪 = {0, 1}.
      S = S_1 × S_2, where S_i is the set of internal states for M_i, for i = 1, 2.
      ν: S × ℐ → S, where
         ν((s, t), x) = (ν_1(s, x), ν_2(t, ω_1(s, x))), for s ∈ S_1, t ∈ S_2, and x ∈ ℐ.
      ω: S × ℐ → 𝒪, where
         ω((s, t), x) = ω_2(t, ω_1(s, x)), for s ∈ S_1, t ∈ S_2, and x ∈ ℐ.

      a) Find a state table for machine M.
      b) Determine the output string for the input string 1101. After this string is processed,
         in which state do we find (i) machine M_1? (ii) machine M_2?

12. Although the state diagram seems more convenient than the state table when we are dealing
    with a finite state machine M = (S, ℐ, 𝒪, ν, ω), as the input strings get longer and the
    sizes of S, ℐ, and 𝒪 increase, the state table proves useful when simulating the machine on
    a computer. The block form of the table suggests the use of a matrix or two-dimensional
    array for storing ν, ω. Use this observation to write a program (or develop an algorithm)
    that will simulate the machine in Table 6.18.

    Table 6.18
                  ν             ω
               0     1       0     1
      s_1     s_2   s_1      0     0
      s_2     s_3   s_1      0     0
      s_3     s_3   s_1      1     1
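
The array idea in Exercise 12 can be sketched as follows. This is only one possible layout, not the text's intended solution: the entries are those of Table 6.18 as read from the table, and the names NEXT_STATE, OUTPUT, and simulate are illustrative assumptions. Rows are indexed by state (0 for s_1, 1 for s_2, 2 for s_3) and columns by input symbol.

```python
# Two 2-D arrays holding the state table (entries as read from Table 6.18).
# Row index = current state (0 -> s_1, 1 -> s_2, 2 -> s_3); column = input symbol.
NEXT_STATE = [
    [1, 0],  # s_1: input 0 -> s_2, input 1 -> s_1
    [2, 0],  # s_2: input 0 -> s_3, input 1 -> s_1
    [2, 0],  # s_3: input 0 -> s_3, input 1 -> s_1
]
OUTPUT = [
    [0, 0],  # outputs produced while in s_1
    [0, 0],  # outputs produced while in s_2
    [1, 1],  # outputs produced while in s_3
]


def simulate(input_string, start=0):
    """Return (output string, final state index) for a string of 0s and 1s."""
    state = start
    out = []
    for ch in input_string:
        x = int(ch)                      # column index for this input symbol
        out.append(str(OUTPUT[state][x]))
        state = NEXT_STATE[state][x]
    return "".join(out), state
```

Storing ν and ω as arrays makes each step a pair of constant-time lookups, so the running time is proportional to the length of the input string.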
   Relations: The
Second Time Around

In Chapter 5 we introduced the concept of a (binary) relation. Returning to relations in this
                             chapter, we shall emphasize the study of relations on a set A (that is, subsets of A × A).
                          Within the theory of languages and finite state machines from Chapter 6, we find many
                          examples of relations on a set A, where A represents a set of strings from a given alphabet
                          or a set of internal states from a finite state machine. Various properties of relations are
                          developed, along with ways to represent finite relations for computer manipulation. Directed
                          graphs reappear as a way to represent such relations. Finally, two types of relations on a set
                          A are especially important: equivalence relations and partial orders. Equivalence relations,
                          in particular, arise in many areas of mathematics. For the present we shall use an equivalence
                          relation on the set of internal states in a finite state machine M in order to find a machine
                          M_1, with as few internal states as possible, that performs whatever tasks M is capable of
                          performing. The procedure is known as the minimization process.

7.1 Relations Revisited: Properties of Relations
                          We start by recalling some fundamental ideas considered earlier.

Definition 7.1        For sets A, B, any subset of A × B is called a (binary) relation from A to B. Any subset
                          of A × A is called a (binary) relation on A.

As mentioned in the sentence following Definition 5.2, our primary concern is with
                          binary relations. Consequently, for us the word “relation” will once again mean binary
                          relation, unless something otherwise is specified.

EXAMPLE 7.1
   a) Define the relation ℛ on the set Z by a ℛ b, or (a, b) ∈ ℛ, if a ≤ b. This subset of
      Z × Z is the ordinary “less than or equal to” relation on the set Z, and it can also be
      defined on Q or R, but not on C.
   b) Let n ∈ Z^+. For x, y ∈ Z, the modulo n relation ℛ is defined by x ℛ y if x − y is a
      multiple of n. With n = 7, we find, for instance, that 9 ℛ 2, −3 ℛ 11, (14, 0) ∈ ℛ, but
      (3, 7) ∉ ℛ (that is, 3 is not related to 7).
   c) For the universe 𝒰 = {1, 2, 3, 4, 5, 6, 7} consider the (fixed) set C ⊆ 𝒰 where
      C = {1, 2, 3, 6}. Define the relation ℛ on 𝒫(𝒰) by A ℛ B when A ∩ C = B ∩ C.
      Then the sets {1, 2, 4, 5} and {1, 2, 5, 7} are related since {1, 2, 4, 5} ∩ C = {1, 2} =
      {1, 2, 5, 7} ∩ C. Likewise we find that X = {4, 5} and Y = {7} are so related because
      X ∩ C = ∅ = Y ∩ C. However, the sets S = {1, 2, 3, 4, 5} and T = {1, 2, 3, 6, 7} are
      not related (that is, (S, T) ∉ ℛ) since S ∩ C = {1, 2, 3} ≠ {1, 2, 3, 6} = T ∩ C.
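
The membership tests in parts (b) and (c) of Example 7.1 can be checked mechanically. The sketch below is illustrative only; the function names related_mod and related_by_C are assumptions introduced here, not notation from the text.

```python
# Part (b): x is related to y when x - y is a multiple of n.
def related_mod(x, y, n=7):
    return (x - y) % n == 0


# Part (c): A is related to B when they meet the fixed set C in the same way.
C = frozenset({1, 2, 3, 6})


def related_by_C(A, B):
    return set(A) & C == set(B) & C


# The instances quoted in the example.
checks_b = [related_mod(9, 2), related_mod(-3, 11), related_mod(14, 0)]
not_related_b = related_mod(3, 7)
checks_c = [related_by_C({1, 2, 4, 5}, {1, 2, 5, 7}), related_by_C({4, 5}, {7})]
not_related_c = related_by_C({1, 2, 3, 4, 5}, {1, 2, 3, 6, 7})
```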

EXAMPLE 7.2
   Let Σ be an alphabet, with language A ⊆ Σ*. For x, y ∈ A, define x ℛ y if x is a prefix
   of y. Other relations can be defined on A by replacing “prefix” with either “suffix” or
   “substring.”

EXAMPLE 7.3
   Consider a finite state machine M = (S, ℐ, 𝒪, ν, ω).
   a) For s_1, s_2 ∈ S, define s_1 ℛ s_2 if ν(s_1, x) = s_2, for some x ∈ ℐ. Relation ℛ establishes
      the first level of reachability.
   b) The relation for the second level of reachability can also be given for S. Here s_1 ℛ s_2 if
      ν(s_1, x_1x_2) = s_2, for some x_1x_2 ∈ ℐ^2. This can be extended to higher levels if the need
      arises. For the general reachability relation we have ν(s_1, y) = s_2, for some y ∈ ℐ*.
   c) Given s_1, s_2 ∈ S, the relation of 1-equivalence, which is denoted by s_1 E_1 s_2 and is
      read “s_1 is 1-equivalent to s_2”, is defined when ω(s_1, x) = ω(s_2, x) for all x ∈ ℐ.
      Consequently, s_1 E_1 s_2 indicates that if machine M starts in either state s_1 or s_2, the
      output is the same for each element of ℐ. This idea can be extended to states being
      k-equivalent, where we write s_1 E_k s_2 if ω(s_1, y) = ω(s_2, y), for all y ∈ ℐ^k. Here the
      same output string is obtained for each input string in ℐ^k if we start at either s_1 or s_2.
         If two states are k-equivalent for all k ∈ Z^+, then they are called equivalent. We
      shall look further into this idea later in the chapter.
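
As a rough illustration of Example 7.3, the sketch below tests first-level reachability and k-equivalence on a small machine. The machine used is the one from Table 6.18 (as read from that table), chosen here only as a convenient test case; the dictionary layout and the function names are assumptions, not the text's notation.

```python
from itertools import product

# Machine of Table 6.18 (entries as read from the table), input alphabet {0, 1}.
NEXT_STATE = {
    "s1": {"0": "s2", "1": "s1"},
    "s2": {"0": "s3", "1": "s1"},
    "s3": {"0": "s3", "1": "s1"},
}
OUTPUT = {
    "s1": {"0": "0", "1": "0"},
    "s2": {"0": "0", "1": "0"},
    "s3": {"0": "1", "1": "1"},
}


def first_level_reachable(s, t):
    """True when nu(s, x) = t for some single input symbol x."""
    return any(NEXT_STATE[s][x] == t for x in "01")


def output_string(state, word):
    """The output string omega(state, word)."""
    out = []
    for x in word:
        out.append(OUTPUT[state][x])
        state = NEXT_STATE[state][x]
    return "".join(out)


def k_equivalent(s, t, k):
    """True when s and t give the same output for every input string of length k."""
    words = map("".join, product("01", repeat=k))
    return all(output_string(s, w) == output_string(t, w) for w in words)
```

On this machine s1 and s2 are 1-equivalent but not 2-equivalent (the input 00 separates them), and s2 is not first-level reachable from itself, which shows that first-level reachability need not be reflexive.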

We now start to examine some of the properties a relation can satisfy.

Definition 7.2          A relation ℛ on a set A is called reflexive if for all x ∈ A, (x, x) ∈ ℛ.

To say that a relation ℛ is reflexive simply means that each element x of A is related
                               to itself. All the relations in Examples 7.1 and 7.2 are reflexive. The general reachability
                               relation in Example 7.3(b) and all of the relations mentioned in part (c) of that example
                               are also reflexive. [What goes wrong with the relations for the first and second levels of
                               reachability given in parts (a) and (b) of Example 7.3?]

EXAMPLE 7.4
   For A = {1, 2, 3, 4}, a relation ℛ ⊆ A × A will be reflexive if and only if ℛ ⊇ {(1, 1),
   (2, 2), (3, 3), (4, 4)}. Consequently, ℛ_1 = {(1, 1), (2, 2), (3, 3)} is not a reflexive relation
   on A, whereas ℛ_2 = {(x, y) | x, y ∈ A, x ≤ y} is reflexive on A.

EXAMPLE 7.5
   Given a finite set A with |A| = n, we have |A × A| = n^2, so there are 2^(n^2) relations on A.
   How many of these are reflexive?
      If A = {a_1, a_2, ..., a_n}, a relation ℛ on A is reflexive if and only if
   {(a_i, a_i) | 1 ≤ i ≤ n} ⊆ ℛ. Considering the other n^2 − n ordered pairs in A × A [those of
   the form (a_i, a_j), where i ≠ j for 1 ≤ i, j ≤ n] as we construct a reflexive relation ℛ on A,
   we either include or exclude each of these ordered pairs, so by the rule of product there are
   2^(n^2 − n) reflexive relations on A.
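
For small n the count in Example 7.5 can be confirmed by brute force. The sketch below is illustrative only (the helper names are assumptions): it enumerates every subset of A × A and keeps those that contain all pairs (a, a).

```python
from itertools import combinations


def all_relations(A):
    """Yield every relation on A, that is, every subset of A x A."""
    pairs = [(a, b) for a in A for b in A]
    for size in range(len(pairs) + 1):
        for chosen in combinations(pairs, size):
            yield set(chosen)


def is_reflexive(R, A):
    return all((a, a) in R for a in A)


def count_reflexive(n):
    A = range(1, n + 1)
    return sum(1 for R in all_relations(A) if is_reflexive(R, A))
```

Since there are 2^(n^2) relations to inspect, this check is practical only for very small n (say n ≤ 4); the closed form is what scales.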

Definition 7.3   Relation ℛ on set A is called symmetric if (x, y) ∈ ℛ ⇒ (y, x) ∈ ℛ, for all x, y ∈ A.

EXAMPLE 7.6
   With A = {1, 2, 3}, we have:
   a) ℛ_1 = {(1, 2), (2, 1), (1, 3), (3, 1)}, a symmetric, but not reflexive, relation on A;
   b) ℛ_2 = {(1, 1), (2, 2), (3, 3), (2, 3)}, a reflexive, but not symmetric, relation on A;
   c) ℛ_3 = {(1, 1), (2, 2), (3, 3)} and ℛ_4 = {(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)}, two
      relations on A that are both reflexive and symmetric; and
   d) ℛ_5 = {(1, 1), (2, 3), (3, 3)}, a relation on A that is neither reflexive nor symmetric.

To count the symmetric relations on A = {a_1, a_2, ..., a_n}, we write A × A as
                 A_1 ∪ A_2, where A_1 = {(a_i, a_i) | 1 ≤ i ≤ n} and A_2 = {(a_i, a_j) | 1 ≤ i, j ≤ n, i ≠ j}, so that
                 every ordered pair in A × A is in exactly one of A_1, A_2. For A_2, |A_2| = |A × A| − |A_1| =
                 n^2 − n = n(n − 1), an even integer. The set A_2 contains (1/2)(n^2 − n) subsets S_ij of the
                 form {(a_i, a_j), (a_j, a_i)} where 1 ≤ i < j ≤ n. In constructing a symmetric relation ℛ on A,
                 for each ordered pair in A_1 we have our usual choice of exclusion or inclusion. For each of
                 the (1/2)(n^2 − n) subsets S_ij (1 ≤ i < j ≤ n) taken from A_2 we have the same two choices.
                 So by the rule of product there are 2^n · 2^((1/2)(n^2 − n)) = 2^((1/2)(n^2 + n)) symmetric
                 relations on A.
                    In counting those relations on A that are both reflexive and symmetric, we have only
                 one choice for each ordered pair in A_1. So we have 2^((1/2)(n^2 − n)) relations on A that are
                 both reflexive and symmetric.
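
Both counts above can be checked by exhaustive enumeration for small n. The following sketch is an illustration under the same brute-force assumption as before (all names are introduced here, not taken from the text).

```python
from itertools import combinations


def all_relations(A):
    """Yield every subset of A x A."""
    pairs = [(a, b) for a in A for b in A]
    for size in range(len(pairs) + 1):
        for chosen in combinations(pairs, size):
            yield set(chosen)


def is_reflexive(R, A):
    return all((a, a) in R for a in A)


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def count_symmetric(n):
    A = range(1, n + 1)
    return sum(1 for R in all_relations(A) if is_symmetric(R))


def count_reflexive_symmetric(n):
    A = range(1, n + 1)
    return sum(1 for R in all_relations(A)
               if is_reflexive(R, A) and is_symmetric(R))
```

For n = 3 the formulas give 2^6 = 64 symmetric relations and 2^3 = 8 relations that are both reflexive and symmetric.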

Definition 7.4   For a set A, a relation ℛ on A is called transitive if, for all x, y, z ∈ A, (x, y), (y, z) ∈ ℛ
                 ⇒ (x, z) ∈ ℛ. (So if x “is related to” y, and y “is related to” z, we want x “related to” z,
                 with y playing the role of “intermediary.”)

EXAMPLE 7.7
   All the relations in Examples 7.1 and 7.2 are transitive, as are the relations in
   Example 7.3(c).

EXAMPLE 7.8
   Define the relation ℛ on the set Z^+ by a ℛ b if a (exactly) divides b; that is, b = ca for
   some c ∈ Z^+. Now if x ℛ y and y ℛ z, do we have x ℛ z? We know that x ℛ y ⇒ y = sx
   for some s ∈ Z^+ and y ℛ z ⇒ z = ty where t ∈ Z^+. Consequently, z = ty = t(sx) = (ts)x
   for ts ∈ Z^+, so x ℛ z and ℛ is transitive. In addition, ℛ is reflexive, but not symmetric,
   because, for example, 2 ℛ 6 but (6, 2) ∉ ℛ.

EXAMPLE 7.9
   Consider the relation ℛ on the set Z where we define a ℛ b when ab ≥ 0. For all integers
   x we have xx = x^2 ≥ 0, so x ℛ x and ℛ is reflexive. Also, if x, y ∈ Z and x ℛ y, then

      x ℛ y ⇒ xy ≥ 0 ⇒ yx ≥ 0 ⇒ y ℛ x,

   so the relation ℛ is symmetric as well. However, here we find that (3, 0), (0, −7) ∈ ℛ,
   since (3)(0) ≥ 0 and (0)(−7) ≥ 0, but (3, −7) ∉ ℛ because (3)(−7) < 0. Consequently,
   this relation is not transitive.
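
The three claims in Example 7.9 can be spot-checked on a finite window of integers. This is only a sanity check, not a proof, and the window −7, ..., 7 and the names below are arbitrary choices made for illustration.

```python
def related(a, b):
    """a is related to b when ab >= 0."""
    return a * b >= 0


window = range(-7, 8)

reflexive = all(related(x, x) for x in window)
symmetric = all(related(y, x) for x in window for y in window if related(x, y))

# Triples (x, y, z) with x R y and y R z but not x R z: witnesses against transitivity.
counterexamples = [
    (x, y, z)
    for x in window for y in window for z in window
    if related(x, y) and related(y, z) and not related(x, z)
]
```

Every counterexample found this way has y = 0 in the middle, which matches the role played by 0 in the triple (3, 0, −7) of the example.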

EXAMPLE 7.10
   If A = {1, 2, 3, 4}, then ℛ_1 = {(1, 1), (2, 3), (3, 4), (2, 4)} is a transitive relation on A,
   whereas ℛ_2 = {(1, 3), (3, 2)} is not transitive because (1, 3), (3, 2) ∈ ℛ_2 but (1, 2) ∉ ℛ_2.

At this point the reader is probably ready to start counting the number of transitive
                             relations on a finite set. But this is not possible here. For unlike the cases dealing with the
                             reflexive and symmetric properties, there is no known general formula for the total number
                             of transitive relations on a finite set. However, at a later point in this chapter we shall have
                              the necessary ideas to count the relations ℛ on a finite set, where ℛ is (simultaneously)
                             reflexive, symmetric, and transitive.
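
Although no closed formula is available, the transitive relations on a very small set can still be counted by exhaustive search. The sketch below (names are illustrative assumptions) does this for n ≤ 3; the cost grows as 2^(n^2), so it does not extend far.

```python
from itertools import combinations


def all_relations(A):
    """Yield every subset of A x A."""
    pairs = [(a, b) for a in A for b in A]
    for size in range(len(pairs) + 1):
        for chosen in combinations(pairs, size):
            yield set(chosen)


def is_transitive(R):
    """(a, b) and (b, d) in R must force (a, d) in R."""
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


def count_transitive(n):
    A = range(1, n + 1)
    return sum(1 for R in all_relations(A) if is_transitive(R))


transitive_counts = [count_transitive(n) for n in (1, 2, 3)]
```

The two relations of Example 7.10 give a quick check of is_transitive: it accepts {(1, 1), (2, 3), (3, 4), (2, 4)} and rejects {(1, 3), (3, 2)}.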

For now we consider one last property for relations.

Definition 7.5         Given a relation ℛ on a set A, ℛ is called antisymmetric if for all a, b ∈ A, (a ℛ b and
                             b ℛ a) ⇒ a = b. (Here the only way we can have both a “related to” b and b “related to”
                             a is if a and b are one and the same element from A.)

EXAMPLE 7.11
   For a given universe 𝒰, define the relation ℛ on 𝒫(𝒰) by (A, B) ∈ ℛ if A ⊆ B, for
   A, B ⊆ 𝒰. So ℛ is the subset relation of Chapter 3 and if A ℛ B and B ℛ A, then we have
   A ⊆ B and B ⊆ A, which gives us A = B. Consequently, this relation is antisymmetric, as
   well as reflexive and transitive, but it is not symmetric.

Before we are led astray into thinking that “not symmetric” is synonymous with
                              “antisymmetric”, let us consider the following.

EXAMPLE 7.12
   For A = {1, 2, 3}, the relation ℛ on A given by ℛ = {(1, 2), (2, 1), (2, 3)} is not symmetric
   because (3, 2) ∉ ℛ, and it is not antisymmetric because (1, 2), (2, 1) ∈ ℛ but 1 ≠ 2. The
   relation ℛ_1 = {(1, 1), (2, 2)} is both symmetric and antisymmetric.
      How many relations on A are antisymmetric? Writing

      A × A = {(1, 1), (2, 2), (3, 3)} ∪ {(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)},

   we make two observations as we try to construct an antisymmetric relation ℛ on A.
      1) Each element (x, x) ∈ A × A can be either included or excluded with no concern
         about whether or not ℛ is antisymmetric.
      2) For an element of the form (x, y), x ≠ y, we must consider both (x, y) and (y, x)
         and we note that for ℛ to remain antisymmetric we have three alternatives: (a) place
         (x, y) in ℛ; (b) place (y, x) in ℛ; or (c) place neither (x, y) nor (y, x) in ℛ. [What
         happens if we place both (x, y) and (y, x) in ℛ?]
   So by the rule of product, the number of antisymmetric relations on A is (2^3)(3^3) =
   (2^3)(3^((3^2 − 3)/2)). If |A| = n > 0, then there are (2^n)(3^((n^2 − n)/2)) antisymmetric
   relations on A.
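
The count of antisymmetric relations can likewise be confirmed by enumeration for small n. The sketch below is an illustration only; the helper names are assumptions introduced here.

```python
from itertools import combinations


def all_relations(A):
    """Yield every subset of A x A."""
    pairs = [(a, b) for a in A for b in A]
    for size in range(len(pairs) + 1):
        for chosen in combinations(pairs, size):
            yield set(chosen)


def is_antisymmetric(R):
    """Whenever (a, b) and (b, a) are both present, a must equal b."""
    return all(a == b for (a, b) in R if (b, a) in R)


def count_antisymmetric(n):
    A = range(1, n + 1)
    return sum(1 for R in all_relations(A) if is_antisymmetric(R))


def antisymmetric_formula(n):
    return (2 ** n) * (3 ** ((n * n - n) // 2))
```

For A = {1, 2, 3} both the enumeration and the formula give (2^3)(3^3) = 216.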

For our next example we return to the concept of function dominance, which we first
                  defined in Section 5.7.

EXAMPLE 7.13
   Let ℱ denote the set of all functions with domain Z^+ and codomain R; that is,
   ℱ = {f | f: Z^+ → R}. For f, g ∈ ℱ, define the relation ℛ on ℱ by f ℛ g if f is dominated by g
   (or f ∈ O(g)). Then ℛ is reflexive and transitive.
      If f, g: Z^+ → R are defined by f(n) = n and g(n) = n + 5, then f ℛ g and g ℛ f but
   f ≠ g, so ℛ is not antisymmetric. In addition, if h: Z^+ → R is given by h(n) = n^2, then
   (f, h), (g, h) ∈ ℛ, but neither (h, f) nor (h, g) is in ℛ. Consequently, the relation ℛ is
   also not symmetric.

At this point we have seen the four major properties that arise in the study of relations.
                  Before closing this section we define two more notions, each of which involves three of
                  these four properties.

Definition 7.6    A relation ℛ on a set A is called a partial order, or a partial ordering relation, if ℛ is
                  reflexive, antisymmetric, and transitive.

EXAMPLE 7.14
   The relation in Example 7.1(a) is a partial order, but the relation in part (b) of that example
   is not because it is not antisymmetric. All the relations of Example 7.2 are partial orders, as
   is the subset relation of Example 7.11.

Our next example provides us with the opportunity to relate this new idea of a partial
                  order with results we studied in Chapters 1 and 4.

EXAMPLE 7.15
   We start with the set A = {1, 2, 3, 4, 6, 12}, the set of positive integer divisors of 12,
   and define the relation ℛ on A by x ℛ y if x (exactly) divides y. As in Example 7.8 we
   find that ℛ is reflexive and transitive. In addition, if x, y ∈ A and we have both x ℛ y and
   y ℛ x, then

      x ℛ y ⇒ y = ax, for some a ∈ Z^+, and
      y ℛ x ⇒ x = by, for some b ∈ Z^+.

   Consequently, it follows that y = ax = a(by) = (ab)y, and since y ≠ 0, we have ab = 1.
   Because a, b ∈ Z^+, ab = 1 ⇒ a = b = 1, so y = x and ℛ is antisymmetric; hence it
   defines a partial order for the set A.
      Now suppose we wish to know how many ordered pairs occur in this relation ℛ. We
   may simply list the ordered pairs from A × A that comprise ℛ:

      ℛ = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 6), (1, 12), (2, 2), (2, 4), (2, 6),
           (2, 12), (3, 3), (3, 6), (3, 12), (4, 4), (4, 12), (6, 6), (6, 12), (12, 12)}.

   In this way we learn that there are 18 ordered pairs in the relation. But if we then wanted to
   consider the same type of partial order for the set of positive integer divisors of 1800, we
   should definitely be discouraged by this method of simply listing all the ordered pairs. So
   let us examine the relation ℛ a little closer. By the Fundamental Theorem of Arithmetic we
   may write 12 = 2^2 · 3 and then realize that if (c, d) ∈ ℛ, then

      c = 2^m · 3^n   and   d = 2^p · 3^q,

   where m, n, p, q ∈ N with 0 ≤ m ≤ p ≤ 2 and 0 ≤ n ≤ q ≤ 1.
      When we consider the fact that 0 ≤ m ≤ p ≤ 2, we find that each possibility for m, p
   is simply a selection of size 2 from a set of size 3 (namely, the set {0, 1, 2}) where
   repetitions are allowed. (In any such selection, if there is a smaller nonnegative integer,
   then it is assigned to m.) In Chapter 1 we learned that such a selection can be made in
   C(3 + 2 − 1, 2) = C(4, 2) = 6 ways. And, in like manner, n and q can be selected in
   C(2 + 2 − 1, 2) = C(3, 2) = 3 ways. So by the rule of product there should be (6)(3) = 18
   ordered pairs in ℛ, as we found earlier by actually listing all of them.
Now suppose we examine a similar situation, the set of positive integer divisors of $1800 = 2^3 \cdot 3^2 \cdot 5^2$. Here we are dealing with $(3 + 1)(2 + 1)(2 + 1) = (4)(3)(3) = 36$ divisors, and a typical ordered pair for this partial order (given by division) looks like $(2^r \cdot 3^s \cdot 5^t,\ 2^u \cdot 3^v \cdot 5^w)$, where $r, s, t, u, v, w \in \mathbf{N}$ with $0 \le r \le u \le 3$, $0 \le s \le v \le 2$, and $0 \le t \le w \le 2$. So the number of ordered pairs in the relation is

$$\binom{4+2-1}{2}\binom{3+2-1}{2}\binom{3+2-1}{2} = \binom{5}{2}\binom{4}{2}\binom{4}{2} = (10)(6)(6) = 360,$$

and we definitely should not want to have to list all of the ordered pairs in the relation in order to obtain this result.
In general, for $n \in \mathbf{Z}^+$ with $n > 1$, use the Fundamental Theorem of Arithmetic to write $n = p_1^{e_1} p_2^{e_2} p_3^{e_3} \cdots p_k^{e_k}$, where $k \in \mathbf{Z}^+$, $p_1 < p_2 < p_3 < \cdots < p_k$, and $p_i$ is prime and $e_i \in \mathbf{Z}^+$ for each $1 \le i \le k$. Then $n$ has $\prod_{i=1}^{k} (e_i + 1)$ positive integer divisors. And when we consider the same type of partial order for this set (of positive integer divisors of $n$), we find that the number of ordered pairs in the relation is

$$\prod_{i=1}^{k} \binom{(e_i + 1) + 2 - 1}{2} = \prod_{i=1}^{k} \binom{e_i + 2}{2}.$$
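The counting formula can be checked against direct listing. The following is a minimal Python sketch, not part of the text; the function names `factor_exponents`, `pairs_by_formula`, and `pairs_by_listing` are hypothetical helpers introduced only for illustration, and trial division is assumed to be adequate for small $n$.

```python
from math import comb, prod

def factor_exponents(n):
    """Return the exponents e_1, ..., e_k in the prime factorization of n."""
    exps = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exps.append(e)
        p += 1
    if n > 1:            # one prime factor larger than sqrt(n) may remain
        exps.append(1)
    return exps

def pairs_by_formula(n):
    """Product of C(e_i + 2, 2) over the exponents of n."""
    return prod(comb(e + 2, 2) for e in factor_exponents(n))

def pairs_by_listing(n):
    """Count pairs (x, y) of divisors of n with x dividing y, by brute force."""
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    return sum(1 for x in divisors for y in divisors if y % x == 0)

print(pairs_by_formula(12), pairs_by_listing(12))      # 18 18
print(pairs_by_formula(1800), pairs_by_listing(1800))  # 360 360
```

The brute-force count grows with the square of the number of divisors, whereas the formula needs only the exponents, which is the point of the discussion above.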

In closing this section we introduce the equivalence relation—a concept that is very
                              important in the study of mathematics.

Definition 7.7  An equivalence relation $\mathcal{R}$ on a set $A$ is a relation that is reflexive, symmetric, and transitive.

EXAMPLE 7.16
a) The relation in Example 7.1(b) and all the relations in Example 7.3(c) are equivalence relations.
b) If $A = \{1, 2, 3\}$, then
   $\mathcal{R}_1 = \{(1, 1), (2, 2), (3, 3)\}$,
   $\mathcal{R}_2 = \{(1, 1), (2, 2), (2, 3), (3, 2), (3, 3)\}$,
   $\mathcal{R}_3 = \{(1, 1), (1, 3), (2, 2), (3, 1), (3, 3)\}$, and
   $\mathcal{R}_4 = \{(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)\} = A \times A$
   are all equivalence relations on $A$.
c) For a given finite set $A$, $A \times A$ is the largest equivalence relation on $A$, and if $A = \{a_1, a_2, \ldots, a_n\}$, then the equality relation $\mathcal{R} = \{(a_i, a_i) \mid 1 \le i \le n\}$ is the smallest equivalence relation on $A$.

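d) Let $A = \{1, 2, 3, 4, 5, 6, 7\}$, $B = \{x, y, z\}$, and $f\colon A \to B$ be the onto function

   $$f = \{(1, x), (2, z), (3, x), (4, y), (5, z), (6, y), (7, x)\}.$$

   Define the relation $\mathcal{R}$ on $A$ by $a \,\mathcal{R}\, b$ if $f(a) = f(b)$. Then, for instance, we find here that $1 \,\mathcal{R}\, 1$, $1 \,\mathcal{R}\, 3$, $2 \,\mathcal{R}\, 5$, $3 \,\mathcal{R}\, 1$, and $4 \,\mathcal{R}\, 6$.

   For each $a \in A$, $f(a) = f(a)$ because $f$ is a function, so $a \,\mathcal{R}\, a$, and $\mathcal{R}$ is reflexive. Now suppose that $a, b \in A$ and $a \,\mathcal{R}\, b$. Then $a \,\mathcal{R}\, b \Rightarrow f(a) = f(b) \Rightarrow f(b) = f(a) \Rightarrow b \,\mathcal{R}\, a$, so $\mathcal{R}$ is symmetric. Finally, if $a, b, c \in A$ with $a \,\mathcal{R}\, b$ and $b \,\mathcal{R}\, c$, then $f(a) = f(b)$ and $f(b) = f(c)$. Consequently, $f(a) = f(c)$, and we see that $(a \,\mathcal{R}\, b \wedge b \,\mathcal{R}\, c) \Rightarrow a \,\mathcal{R}\, c$. So $\mathcal{R}$ is transitive. Since $\mathcal{R}$ is reflexive, symmetric, and transitive, it is an equivalence relation.

   Here $\mathcal{R} = \{(1, 1), (1, 3), (1, 7), (2, 2), (2, 5), (3, 1), (3, 3), (3, 7), (4, 4), (4, 6), (5, 2), (5, 5), (6, 4), (6, 6), (7, 1), (7, 3), (7, 7)\}$.
e) If $\mathcal{R}$ is a relation on a set $A$, then $\mathcal{R}$ is both an equivalence relation and a partial order on $A$ if and only if $\mathcal{R}$ is the equality relation on $A$.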
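The relation of part (d) can be built mechanically from $f$ and then tested for the three defining properties. The sketch below is illustrative only: the dictionary `f` encodes the function of part (d), and the helper names `is_reflexive`, `is_symmetric`, and `is_transitive` are assumptions of this example, not notation from the text.

```python
# The onto function f: A -> B of part (d), stored as a dictionary.
f = {1: "x", 2: "z", 3: "x", 4: "y", 5: "z", 6: "y", 7: "x"}
A = set(f)

# a R b exactly when f(a) = f(b).
R = {(a, b) for a in A for b in A if f[a] == f[b]}

def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    return all((a, d) in R
               for (a, b) in R
               for (c, d) in R
               if b == c)

print(len(R))                                                  # 17
print(is_reflexive(R, A), is_symmetric(R), is_transitive(R))   # True True True
```

The 17 pairs arise as $3^2 + 2^2 + 2^2$, one square for each of the preimages $f^{-1}(x)$, $f^{-1}(y)$, $f^{-1}(z)$.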

EXERCISES 7.1

1. If $A = \{1, 2, 3, 4\}$, give an example of a relation $\mathcal{R}$ on $A$ that is
     a) reflexive and symmetric, but not transitive
     b) reflexive and transitive, but not symmetric
     c) symmetric and transitive, but not reflexive

2. For relation (b) in Example 7.1, determine five values of $x$ for which $(x, 5) \in \mathcal{R}$.

3. For the relation $\mathcal{R}$ in Example 7.13, let $f\colon \mathbf{Z}^+ \to \mathbf{R}$ where $f(n) = n$.
     a) Find three elements $f_1, f_2, f_3 \in \mathcal{F}$ such that $f_i \,\mathcal{R}\, f$ and $f \,\mathcal{R}\, f_i$, for all $1 \le i \le 3$.
     b) Find three elements $g_1, g_2, g_3 \in \mathcal{F}$ such that $g_i \,\mathcal{R}\, f$ but $f \,\not\mathcal{R}\, g_i$, for all $1 \le i \le 3$.

4. a) Rephrase the definitions for the reflexive, symmetric, transitive, and antisymmetric properties of a relation $\mathcal{R}$ (on a set $A$), using quantifiers.
     b) Use the results of part (a) to specify when a relation $\mathcal{R}$ (on a set $A$) is (i) not reflexive; (ii) not symmetric; (iii) not transitive; and (iv) not antisymmetric.

5. For each of the following relations, determine whether the relation is reflexive, symmetric, antisymmetric, or transitive.
     a) $\mathcal{R} \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+$ where $a \,\mathcal{R}\, b$ if $a \mid b$ (read "a divides b," as defined in Section 4.3).
     b) $\mathcal{R}$ is the relation on $\mathbf{Z}$ where $a \,\mathcal{R}\, b$ if $a \mid b$.
     c) For a given universe $\mathcal{U}$ and a fixed subset $C$ of $\mathcal{U}$, define $\mathcal{R}$ on $\mathcal{P}(\mathcal{U})$ as follows: For $A, B \subseteq \mathcal{U}$ we have $A \,\mathcal{R}\, B$ if $A \cap C = B \cap C$.
     d) On the set $A$ of all lines in $\mathbf{R}^2$, define the relation $\mathcal{R}$ for two lines $\ell_1, \ell_2$ by $\ell_1 \,\mathcal{R}\, \ell_2$ if $\ell_1$ is perpendicular to $\ell_2$.
     e) $\mathcal{R}$ is the relation on $\mathbf{Z}$ where $x \,\mathcal{R}\, y$ if $x + y$ is odd.
     f) $\mathcal{R}$ is the relation on $\mathbf{Z}$ where $x \,\mathcal{R}\, y$ if $x - y$ is even.
     g) Let $T$ be the set of all triangles in $\mathbf{R}^2$. Define $\mathcal{R}$ on $T$ by $t_1 \,\mathcal{R}\, t_2$ if $t_1$ and $t_2$ have an angle of the same measure.
     h) $\mathcal{R}$ is the relation on $\mathbf{Z} \times \mathbf{Z}$ where $(a, b) \,\mathcal{R}\, (c, d)$ if $a \le c$. [Note: $\mathcal{R} \subseteq (\mathbf{Z} \times \mathbf{Z}) \times (\mathbf{Z} \times \mathbf{Z})$.]

6. Which relations in Exercise 5 are partial orders? Which are equivalence relations?

7. Let $\mathcal{R}_1, \mathcal{R}_2$ be relations on a set $A$. (a) Prove or disprove that $\mathcal{R}_1, \mathcal{R}_2$ reflexive $\Rightarrow$ $\mathcal{R}_1 \cap \mathcal{R}_2$ reflexive. (b) Answer part (a) when each occurrence of "reflexive" is replaced by (i) symmetric; (ii) antisymmetric; and (iii) transitive.

8. Answer Exercise 7, replacing each occurrence of $\cap$ by $\cup$.

9. For each of the following statements about relations on a set $A$, where $|A| = n$, determine whether the statement is true or false. If it is false, give a counterexample.
     a) If $\mathcal{R}$ is a relation on $A$ and $|\mathcal{R}| \ge n$, then $\mathcal{R}$ is reflexive.
     b) If $\mathcal{R}_1, \mathcal{R}_2$ are relations on $A$ and $\mathcal{R}_2 \supseteq \mathcal{R}_1$, then $\mathcal{R}_1$ reflexive (symmetric, antisymmetric, transitive) $\Rightarrow$ $\mathcal{R}_2$ reflexive (symmetric, antisymmetric, transitive).
     c) If $\mathcal{R}_1, \mathcal{R}_2$ are relations on $A$ and $\mathcal{R}_2 \supseteq \mathcal{R}_1$, then $\mathcal{R}_2$ reflexive (symmetric, antisymmetric, transitive) $\Rightarrow$ $\mathcal{R}_1$ reflexive (symmetric, antisymmetric, transitive).
     d) If $\mathcal{R}$ is an equivalence relation on $A$, then $n \le |\mathcal{R}| \le n^2$.

10. If $A = \{w, x, y, z\}$, determine the number of relations on $A$ that are (a) reflexive; (b) symmetric; (c) reflexive and symmetric; (d) reflexive and contain $(x, y)$; (e) symmetric and contain $(x, y)$; (f) antisymmetric; (g) antisymmetric and contain $(x, y)$; (h) symmetric and antisymmetric; and (i) reflexive, symmetric, and antisymmetric.

11. Let $n \in \mathbf{Z}^+$ with $n > 1$, and let $A$ be the set of positive integer divisors of $n$. Define the relation $\mathcal{R}$ on $A$ by $x \,\mathcal{R}\, y$ if $x$ (exactly) divides $y$. Determine how many ordered pairs are in the relation $\mathcal{R}$ when $n$ is (a) 10; (b) 20; (c) 40; (d) 200; (e) 210; and (f) 13860.

12. Suppose that $p_1, p_2, p_3$ are distinct primes and that $n, k \in \mathbf{Z}^+$ with $n = p_1^5 p_2^3 p_3^k$. Let $A$ be the set of positive integer divisors of $n$ and define the relation $\mathcal{R}$ on $A$ by $x \,\mathcal{R}\, y$ if $x$ (exactly) divides $y$. If there are 5880 ordered pairs in $\mathcal{R}$, determine $k$ and $|A|$.

13. What is wrong with the following argument?
    Let $A$ be a set with $\mathcal{R}$ a relation on $A$. If $\mathcal{R}$ is symmetric and transitive, then $\mathcal{R}$ is reflexive.
    Proof: Let $(x, y) \in \mathcal{R}$. By the symmetric property, $(y, x) \in \mathcal{R}$. Then with $(x, y), (y, x) \in \mathcal{R}$, it follows by the transitive property that $(x, x) \in \mathcal{R}$. Consequently, $\mathcal{R}$ is reflexive.

14. Let $A$ be a set with $|A| = n$, and let $\mathcal{R}$ be a relation on $A$ that is antisymmetric. What is the maximum value for $|\mathcal{R}|$? How many antisymmetric relations can have this size?

15. Let $A$ be a set with $|A| = n$, and let $\mathcal{R}$ be an equivalence relation on $A$ with $|\mathcal{R}| = r$. Why is $r - n$ always even?

16. A relation $\mathcal{R}$ on a set $A$ is called irreflexive if for all $a \in A$, $(a, a) \notin \mathcal{R}$.
     a) Give an example of a relation $\mathcal{R}$ on $\mathbf{Z}$ where $\mathcal{R}$ is irreflexive and transitive but not symmetric.
     b) Let $\mathcal{R}$ be a nonempty relation on a set $A$. Prove that if $\mathcal{R}$ satisfies any two of the following properties (irreflexive, symmetric, and transitive), then it cannot satisfy the third.
     c) If $|A| = n \ge 1$, how many different relations on $A$ are irreflexive? How many are neither reflexive nor irreflexive?

17. Let $A = \{1, 2, 3, 4, 5, 6, 7\}$. How many symmetric relations on $A$ contain exactly (a) four ordered pairs? (b) five ordered pairs? (c) seven ordered pairs? (d) eight ordered pairs?

18. a) Let $f\colon A \to B$, where $|A| = 25$, $B = \{x, y, z\}$, and $|f^{-1}(x)| = 10$, $|f^{-1}(y)| = 10$, $|f^{-1}(z)| = 5$. If we define the relation $\mathcal{R}$ on $A$ by $a \,\mathcal{R}\, b$ if $a, b \in A$ and $f(a) = f(b)$, how many ordered pairs are there in the relation $\mathcal{R}$?
     b) For $n, n_1, n_2, n_3, n_4 \in \mathbf{Z}^+$, let $f\colon A \to B$, where $|A| = n$, $B = \{w, x, y, z\}$, $|f^{-1}(w)| = n_1$, $|f^{-1}(x)| = n_2$, $|f^{-1}(y)| = n_3$, $|f^{-1}(z)| = n_4$, and $n_1 + n_2 + n_3 + n_4 = n$. If we define the relation $\mathcal{R}$ on $A$ by $a \,\mathcal{R}\, b$ if $a, b \in A$ and $f(a) = f(b)$, how many ordered pairs are there in the relation $\mathcal{R}$?

7.2
Computer Recognition: Zero-One Matrices and Directed Graphs
                                Since our interest in relations is focused on those for finite sets, we are concerned with ways
                                of representing such relations so that the properties of Section 7.1 can be easily verified. For
                                this reason we now develop the necessary tools: relation composition, zero-one matrices,
                                and directed graphs.

In a manner analogous to the composition of functions, relations can be combined in the
                                following circumstances.

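Definition 7.8  If $A$, $B$, and $C$ are sets with $\mathcal{R}_1 \subseteq A \times B$ and $\mathcal{R}_2 \subseteq B \times C$, then the composite relation $\mathcal{R}_1 \circ \mathcal{R}_2$ is a relation from $A$ to $C$ defined by $\mathcal{R}_1 \circ \mathcal{R}_2 = \{(x, z) \mid x \in A,\ z \in C$, and there exists $y \in B$ with $(x, y) \in \mathcal{R}_1$, $(y, z) \in \mathcal{R}_2\}$.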

Beware! The composition of two relations is written in an order opposite to that for
                                function composition. We shall see why in Example 7.21.

EXAMPLE 7.17  Let $A = \{1, 2, 3, 4\}$, $B = \{w, x, y, z\}$, and $C = \{5, 6, 7\}$. Consider $\mathcal{R}_1 = \{(1, x), (2, x), (3, y), (3, z)\}$, a relation from $A$ to $B$, and $\mathcal{R}_2 = \{(w, 5), (x, 6)\}$, a relation from $B$ to $C$. Then $\mathcal{R}_1 \circ \mathcal{R}_2 = \{(1, 6), (2, 6)\}$ is a relation from $A$ to $C$. If $\mathcal{R}_3 = \{(w, 5), (w, 6)\}$ is another relation from $B$ to $C$, then $\mathcal{R}_1 \circ \mathcal{R}_3 = \emptyset$.
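Definition 7.8 translates almost word for word into a set comprehension. The following is a small Python sketch under the assumption that relations are stored as sets of pairs; the function name `compose` is introduced here for illustration and follows the left-to-right order of the definition.

```python
def compose(R1, R2):
    """Composite relation R1 o R2: all (x, z) with (x, y) in R1 and (y, z) in R2."""
    return {(x, z) for (x, y1) in R1 for (y2, z) in R2 if y1 == y2}

# The relations of Example 7.17.
R1 = {(1, "x"), (2, "x"), (3, "y"), (3, "z")}
R2 = {("w", 5), ("x", 6)}
R3 = {("w", 5), ("w", 6)}

print(compose(R1, R2) == {(1, 6), (2, 6)})  # True
print(compose(R1, R3) == set())             # True: no pair of R1 ends in w
```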

Let A be the set of employees at a computing center, while B denotes a set of high-level
   EXAMPLE 7.18
                      programming languages, and C is a set of projects {p), p2,..., ps} for which managers
must make work assignments using the people in $A$. Consider $\mathcal{R}_1 \subseteq A \times B$, where an ordered pair of the form (L. Alldredge, Java) indicates that employee L. Alldredge is proficient in Java (and perhaps other programming languages). The relation $\mathcal{R}_2 \subseteq B \times C$ consists of ordered pairs such as (Java, $p_2$), indicating that Java is considered an essential language needed by anyone who works on project $p_2$. In the composite relation $\mathcal{R}_1 \circ \mathcal{R}_2$ we find (L. Alldredge, $p_2$). If no other ordered pair in $\mathcal{R}_2$ has $p_2$ as its second component, we know that if L. Alldredge was assigned to $p_2$ it was solely on the basis of his proficiency in Java. (Here $\mathcal{R}_1 \circ \mathcal{R}_2$ has been used to set up a matching process between employees and projects on the basis of employee knowledge of specific programming languages.)

Comparable to the associative law for function composition, the following result holds
                      for relations.

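THEOREM 7.1  Let $A$, $B$, $C$, and $D$ be sets with $\mathcal{R}_1 \subseteq A \times B$, $\mathcal{R}_2 \subseteq B \times C$, and $\mathcal{R}_3 \subseteq C \times D$. Then $\mathcal{R}_1 \circ (\mathcal{R}_2 \circ \mathcal{R}_3) = (\mathcal{R}_1 \circ \mathcal{R}_2) \circ \mathcal{R}_3$.
Proof: Since both $\mathcal{R}_1 \circ (\mathcal{R}_2 \circ \mathcal{R}_3)$ and $(\mathcal{R}_1 \circ \mathcal{R}_2) \circ \mathcal{R}_3$ are relations from $A$ to $D$, there is some reason to believe they are equal. If $(a, d) \in \mathcal{R}_1 \circ (\mathcal{R}_2 \circ \mathcal{R}_3)$, then there is an element $b \in B$ with $(a, b) \in \mathcal{R}_1$ and $(b, d) \in (\mathcal{R}_2 \circ \mathcal{R}_3)$. Also, $(b, d) \in (\mathcal{R}_2 \circ \mathcal{R}_3) \Rightarrow (b, c) \in \mathcal{R}_2$ and $(c, d) \in \mathcal{R}_3$ for some $c \in C$. Then $(a, b) \in \mathcal{R}_1$ and $(b, c) \in \mathcal{R}_2 \Rightarrow (a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2$. Finally, $(a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2$ and $(c, d) \in \mathcal{R}_3 \Rightarrow (a, d) \in (\mathcal{R}_1 \circ \mathcal{R}_2) \circ \mathcal{R}_3$, and $\mathcal{R}_1 \circ (\mathcal{R}_2 \circ \mathcal{R}_3) \subseteq (\mathcal{R}_1 \circ \mathcal{R}_2) \circ \mathcal{R}_3$. The opposite inclusion follows by similar reasoning.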

As a result of this theorem no ambiguity arises when we write $\mathcal{R}_1 \circ \mathcal{R}_2 \circ \mathcal{R}_3$ for either of the relations in Theorem 7.1. In addition, we can now define the powers of a relation $\mathcal{R}$ on a set.

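Definition 7.9  Given a set $A$ and a relation $\mathcal{R}$ on $A$, we define the powers of $\mathcal{R}$ recursively by (a) $\mathcal{R}^1 = \mathcal{R}$; and (b) for $n \in \mathbf{Z}^+$, $\mathcal{R}^{n+1} = \mathcal{R} \circ \mathcal{R}^n$.

Note that for $n \in \mathbf{Z}^+$, $\mathcal{R}^n$ is a relation on $A$.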

EXAMPLE 7.19  If $A = \{1, 2, 3, 4\}$ and $\mathcal{R} = \{(1, 2), (1, 3), (2, 4), (3, 2)\}$, then $\mathcal{R}^2 = \{(1, 4), (1, 2), (3, 4)\}$, $\mathcal{R}^3 = \{(1, 4)\}$, and for $n \ge 4$, $\mathcal{R}^n = \emptyset$.
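The recursive definition of powers can be followed step by step. This is a minimal sketch, assuming relations are sets of pairs; `compose` and `power` are illustrative names, with `power` applying rule (b) of Definition 7.9 repeatedly.

```python
def compose(R1, R2):
    """Composite relation R1 o R2 as in Definition 7.8."""
    return {(x, z) for (x, y1) in R1 for (y2, z) in R2 if y1 == y2}

def power(R, n):
    """R^n for n >= 1, using R^1 = R and R^(k+1) = R o R^k."""
    result = set(R)
    for _ in range(n - 1):
        result = compose(R, result)
    return result

R = {(1, 2), (1, 3), (2, 4), (3, 2)}

print(power(R, 2) == {(1, 4), (1, 2), (3, 4)})  # True
print(power(R, 3) == {(1, 4)})                  # True
print(power(R, 4) == set())                     # True
```

Once some power is empty, every later power is empty as well, since composing with the empty relation yields the empty relation.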

As the set $A$ and the relation $\mathcal{R}$ on $A$ grow larger, calculations such as those in Example 7.19 become tedious. To avoid this tedium, the tool we need is the computer, once a way can be found to tell the machine about the set $A$ and the relation $\mathcal{R}$ on $A$.

Definition 7.10  An $m \times n$ zero-one matrix $E = (e_{ij})_{m \times n}$ is a rectangular array of numbers arranged in $m$ rows and $n$ columns, where each $e_{ij}$, for $1 \le i \le m$ and $1 \le j \le n$, denotes the entry in the $i$th row and $j$th column of $E$, and each such entry is 0 or 1. [We can also write (0, 1)-matrix for this type of matrix.]

EXAMPLE 7.20  The matrix

$$E = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$

is a $3 \times 4$ (0, 1)-matrix where, for example, $e_{11} = 1$, $e_{23} = 0$, and $e_{31} = 1$.

In working with these matrices, we use the standard operations of matrix addition and
                            multiplication with the stipulation that 1 + 1 = 1. (Hence the addition is called Boolean.)

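EXAMPLE 7.21  Consider the sets $A$, $B$, and $C$ and the relations $\mathcal{R}_1$, $\mathcal{R}_2$ of Example 7.17. With the orders of the elements in $A$, $B$, and $C$ fixed as in that example, we define the relation matrices for $\mathcal{R}_1$, $\mathcal{R}_2$ as follows:

$$M(\mathcal{R}_1) = \begin{array}{c|cccc} & (w) & (x) & (y) & (z) \\ \hline (1) & 0 & 1 & 0 & 0 \\ (2) & 0 & 1 & 0 & 0 \\ (3) & 0 & 0 & 1 & 1 \\ (4) & 0 & 0 & 0 & 0 \end{array}, \qquad M(\mathcal{R}_2) = \begin{array}{c|ccc} & (5) & (6) & (7) \\ \hline (w) & 1 & 0 & 0 \\ (x) & 0 & 1 & 0 \\ (y) & 0 & 0 & 0 \\ (z) & 0 & 0 & 0 \end{array}$$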
In constructing $M(\mathcal{R}_1)$, we are dealing with a relation from $A$ to $B$, so the elements of $A$ are used to mark the rows of $M(\mathcal{R}_1)$ and the elements of $B$ designate the columns. Then to denote, for example, that $(2, x) \in \mathcal{R}_1$, we place a 1 in the row marked (2) and the column marked (x). Each 0 in this matrix indicates an ordered pair in $A \times B$ that is missing from $\mathcal{R}_1$. For example, since $(3, w) \notin \mathcal{R}_1$, there is a 0 for the entry in row (3) and column (w) of the matrix $M(\mathcal{R}_1)$. The same process is used to obtain $M(\mathcal{R}_2)$.
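Multiplying these matrices,† we find that

$$M(\mathcal{R}_1) \cdot M(\mathcal{R}_2) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{array}{c|ccc} & (5) & (6) & (7) \\ \hline (1) & 0 & 1 & 0 \\ (2) & 0 & 1 & 0 \\ (3) & 0 & 0 & 0 \\ (4) & 0 & 0 & 0 \end{array} = M(\mathcal{R}_1 \circ \mathcal{R}_2),$$

where the rows of the $4 \times 3$ matrix $M(\mathcal{R}_1 \circ \mathcal{R}_2)$ are marked by the elements of $A$ while its columns are marked by the elements of $C$. In general we have: If $\mathcal{R}_1$ is a relation from $A$ to $B$ and $\mathcal{R}_2$ is a relation from $B$ to $C$, then $M(\mathcal{R}_1) \cdot M(\mathcal{R}_2) = M(\mathcal{R}_1 \circ \mathcal{R}_2)$. That is, the product of the relation matrices for $\mathcal{R}_1$, $\mathcal{R}_2$, in that order, equals the relation matrix of the composite relation $\mathcal{R}_1 \circ \mathcal{R}_2$. (This is why the composition of two relations was written in the order specified in Definition 7.8.)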
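The identity $M(\mathcal{R}_1) \cdot M(\mathcal{R}_2) = M(\mathcal{R}_1 \circ \mathcal{R}_2)$ can be confirmed for Example 7.21 with a short computation. The sketch below assumes matrices are stored as lists of rows; `relation_matrix`, `bool_product`, and `compose` are names chosen for this illustration, and the product uses the convention $1 + 1 = 1$.

```python
def relation_matrix(R, rows, cols):
    """(0, 1)-matrix of R with rows and columns marked in the given orders."""
    return [[1 if (r, c) in R else 0 for c in cols] for r in rows]

def bool_product(M, N):
    """Matrix product in which addition is Boolean (1 + 1 = 1)."""
    return [[int(any(M[i][k] and N[k][j] for k in range(len(N))))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def compose(R1, R2):
    """Composite relation R1 o R2 as in Definition 7.8."""
    return {(x, z) for (x, y1) in R1 for (y2, z) in R2 if y1 == y2}

A, B, C = [1, 2, 3, 4], ["w", "x", "y", "z"], [5, 6, 7]
R1 = {(1, "x"), (2, "x"), (3, "y"), (3, "z")}
R2 = {("w", 5), ("x", 6)}

product = bool_product(relation_matrix(R1, A, B), relation_matrix(R2, B, C))
print(product)  # [[0, 1, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0]]
print(product == relation_matrix(compose(R1, R2), A, C))  # True
```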

The reader will be asked to prove the general result of Example 7.21, along with some
                             results from our next example, in Exercises 11 and 12 at the end of this section.

Further properties of relation matrices are exhibited in the following example.

†The reader who is not familiar with matrix multiplication or simply wishes a brief review should consult Appendix 2.

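EXAMPLE 7.22  Let $A = \{1, 2, 3, 4\}$ and $\mathcal{R} = \{(1, 2), (1, 3), (2, 4), (3, 2)\}$, as in Example 7.19. Keeping the order of the elements in $A$ fixed, we define the relation matrix for $\mathcal{R}$ as follows: $M(\mathcal{R})$ is the $4 \times 4$ (0, 1)-matrix whose entries $m_{ij}$, for $1 \le i, j \le 4$, are given by

$$m_{ij} = \begin{cases} 1, & \text{if } (i, j) \in \mathcal{R}, \\ 0, & \text{otherwise.} \end{cases}$$

In this case we find that

$$M(\mathcal{R}) = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

Now how can this be of any use? If we compute $(M(\mathcal{R}))^2$ using the convention that $1 + 1 = 1$, then we find that

$$(M(\mathcal{R}))^2 = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$

which happens to be the relation matrix for $\mathcal{R} \circ \mathcal{R} = \mathcal{R}^2$. (Check Example 7.19.) Furthermore,

$$(M(\mathcal{R}))^4 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$

which is also the relation matrix for the relation $\mathcal{R}^4$; that is, $(M(\mathcal{R}))^4 = M(\mathcal{R}^4)$. Also, recall that $\mathcal{R}^4 = \emptyset$, as we learned in Example 7.19.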
                          What has happened here carries over to the general situation. We now state some results
                     about relation matrices and their use in studying relations.

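Let $A$ be a set with $|A| = n$ and $\mathcal{R}$ a relation on $A$. If $M(\mathcal{R})$ is the relation matrix for $\mathcal{R}$, then:

a) $M(\mathcal{R}) = \mathbf{0}$ (the matrix of all 0's) if and only if $\mathcal{R} = \emptyset$
b) $M(\mathcal{R}) = \mathbf{1}$ (the matrix of all 1's) if and only if $\mathcal{R} = A \times A$
c) $M(\mathcal{R}^m) = [M(\mathcal{R})]^m$, for $m \in \mathbf{Z}^+$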
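Result (c) can be observed numerically for the matrix of Example 7.22. The following is an illustrative sketch only; `bool_product` and `bool_power` are hypothetical helper names, and Boolean addition ($1 + 1 = 1$) is assumed throughout.

```python
def bool_product(M, N):
    """Matrix product in which addition is Boolean (1 + 1 = 1)."""
    return [[int(any(M[i][k] and N[k][j] for k in range(len(N))))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def bool_power(M, m):
    """M^m for m >= 1 under Boolean matrix multiplication."""
    result = M
    for _ in range(m - 1):
        result = bool_product(M, result)
    return result

# M(R) for R = {(1, 2), (1, 3), (2, 4), (3, 2)} on A = {1, 2, 3, 4}.
M = [[0, 1, 1, 0],
     [0, 0, 0, 1],
     [0, 1, 0, 0],
     [0, 0, 0, 0]]

print(bool_power(M, 2))  # [[0, 1, 0, 1], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
print(bool_power(M, 4))  # the 4 x 4 matrix of all 0's
```

The 1's of the square sit in positions (1, 2), (1, 4), and (3, 4), matching $\mathcal{R}^2$ from Example 7.19, and the fourth power is the zero matrix, matching $\mathcal{R}^4 = \emptyset$.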

Using the (0, 1)-matrix for a relation, we now turn to the recognition of the reflex-
                     ive, symmetric, antisymmetric, and transitive properties. To accomplish this we need the
                     concepts introduced in the following three definitions.

Definition 7.11  Let $E = (e_{ij})_{m \times n}$, $F = (f_{ij})_{m \times n}$ be two $m \times n$ (0, 1)-matrices. We say that $E$ precedes, or is less than, $F$, and we write $E \le F$, if $e_{ij} \le f_{ij}$, for all $1 \le i \le m$, $1 \le j \le n$.

EXAMPLE 7.23  With $E = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}$ and $F = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$, we have $E \le F$. In fact, there are eight (0, 1)-matrices $G$ for which $E \le G$.

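Definition 7.12  For $n \in \mathbf{Z}^+$, $I_n = (\delta_{ij})_{n \times n}$ is the $n \times n$ (0, 1)-matrix where

$$\delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases}$$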

Definition 7.13  Let $A = (a_{ij})_{m \times n}$ be a (0, 1)-matrix. The transpose of $A$, written $A^{\mathrm{tr}}$, is the matrix $(a^*_{ji})_{n \times m}$ where $a^*_{ji} = a_{ij}$, for all $1 \le j \le n$, $1 \le i \le m$.

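EXAMPLE 7.24  For $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}$ we find that $A^{\mathrm{tr}} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}$.

As this example demonstrates, the $i$th row (column) of $A$ equals the $i$th column (row) of $A^{\mathrm{tr}}$. This indicates a method we can use in order to obtain the matrix $A^{\mathrm{tr}}$ from the matrix $A$.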

THEOREM 7.2  Given a set $A$ with $|A| = n$ and a relation $\mathcal{R}$ on $A$, let $M$ denote the relation matrix for $\mathcal{R}$. Then

a) $\mathcal{R}$ is reflexive if and only if $I_n \le M$.
b) $\mathcal{R}$ is symmetric if and only if $M = M^{\mathrm{tr}}$.
c) $\mathcal{R}$ is transitive if and only if $M \cdot M = M^2 \le M$.
d) $\mathcal{R}$ is antisymmetric if and only if $M \cap M^{\mathrm{tr}} \le I_n$. (The matrix $M \cap M^{\mathrm{tr}}$ is formed by operating on corresponding entries in $M$ and $M^{\mathrm{tr}}$ according to the rules $0 \cap 0 = 0 \cap 1 = 1 \cap 0 = 0$ and $1 \cap 1 = 1$; that is, the usual multiplication for 0's and/or 1's.)

Proof: The results follow from the definitions of the relation properties and the (0, 1)-matrix. We demonstrate this for part (c), using the elements of $A$ to designate the rows and columns in $M$, as in Examples 7.21 and 7.22.

Let $M^2 \le M$. If $(x, y), (y, z) \in \mathcal{R}$, then there are 1's in row (x), column (y) and in row (y), column (z) of $M$. Consequently, in row (x), column (z) of $M^2$ there is a 1. This 1 must also occur in row (x), column (z) of $M$ because $M^2 \le M$. Hence $(x, z) \in \mathcal{R}$ and $\mathcal{R}$ is transitive.

Conversely, if $\mathcal{R}$ is transitive and $M$ is the relation matrix for $\mathcal{R}$, let $s_{xz}$ be the entry in row (x) and column (z) of $M^2$, with $s_{xz} = 1$. For $s_{xz}$ to equal 1 in $M^2$, there must exist at least one $y \in A$ where $m_{xy} = m_{yz} = 1$ in $M$. This happens only if $x \,\mathcal{R}\, y$ and $y \,\mathcal{R}\, z$. With $\mathcal{R}$ transitive, it then follows that $x \,\mathcal{R}\, z$. So $m_{xz} = 1$ and $M^2 \le M$.

The proofs of the remaining parts are left to the reader.
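The four tests of Theorem 7.2 are easy to carry out once $M$ is stored as a list of rows. The sketch below is one possible rendering, not a prescribed implementation; all function names are assumptions of this example, `meet` plays the role of $M \cap M^{\mathrm{tr}}$, and `precedes` the role of $\le$ from Definition 7.11.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def precedes(E, F):
    """E <= F entry by entry (Definition 7.11)."""
    return all(e <= f for re, rf in zip(E, F) for e, f in zip(re, rf))

def meet(E, F):
    """Entrywise product of two (0, 1)-matrices."""
    return [[e & f for e, f in zip(re, rf)] for re, rf in zip(E, F)]

def bool_product(M, N):
    """Matrix product in which addition is Boolean (1 + 1 = 1)."""
    return [[int(any(M[i][k] and N[k][j] for k in range(len(N))))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def is_reflexive(M):     return precedes(identity(len(M)), M)
def is_symmetric(M):     return M == transpose(M)
def is_transitive(M):    return precedes(bool_product(M, M), M)
def is_antisymmetric(M): return precedes(meet(M, transpose(M)), identity(len(M)))

# Divisibility on the positive divisors of 12, a partial order.
A = [1, 2, 3, 4, 6, 12]
D = [[1 if y % x == 0 else 0 for y in A] for x in A]
print(is_reflexive(D), is_symmetric(D), is_transitive(D), is_antisymmetric(D))
# True False True True
```

For the matrix of Example 7.22 the same functions report that the relation is antisymmetric but neither reflexive, symmetric, nor transitive.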

The relation matrix is a useful tool for the computer recognition of certain properties of relations. Storing information as described here, this matrix is an example of a data structure. Also of interest is how the relation matrix is used in the study of graph theory† and how graph theory is used in the recognition of certain properties of relations.
                     At this point we shall introduce some fundamental concepts in graph theory. Often these
                  concepts will be given within examples and not in terms of formal definitions. In Chapter 11,
                  however, the presentation will not assume what is given here and will be more rigorous and
                  comprehensive.

Definition 7.14  Let $V$ be a finite nonempty set. A directed graph (or digraph) $G$ on $V$ is made up of the elements of $V$, called the vertices or nodes of $G$, and a subset $E$ of $V \times V$ that contains the (directed) edges, or arcs, of $G$. The set $V$ is called the vertex set of $G$, and the set $E$ is called the edge set. We then write $G = (V, E)$ to denote the graph.
   If $a, b \in V$ and $(a, b) \in E$,‡ then there is an edge from $a$ to $b$. Vertex $a$ is called the origin or source of the edge, with $b$ the terminus, or terminating vertex, and we say that $b$ is adjacent from $a$ and that $a$ is adjacent to $b$. In addition, if $a \ne b$, then $(a, b) \ne (b, a)$. An edge of the form $(a, a)$ is called a loop (at $a$).

EXAMPLE 7.25  For $V = \{1, 2, 3, 4, 5\}$, the diagram in Fig. 7.1 is a directed graph $G$ on $V$ with edge set $\{(1, 1), (1, 2), (1, 4), (3, 2)\}$. Vertex 5 is a part of this graph even though it is not the origin or terminus of an edge. It is referred to as an isolated vertex. As we see here, edges need not be straight line segments, and there is no concern about the length of an edge.

[Figure 7.1: the directed graph of Example 7.25]
[Figure 7.2: panels (a) and (b)]
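The graph of Example 7.25 can be stored directly as the pair $(V, E)$ of Definition 7.14. The following Python sketch is illustrative; the variable names `loops`, `isolated`, and `adjacent_from` are chosen here and are not notation from the text.

```python
# G = (V, E) from Example 7.25, stored as a vertex set and a set of ordered pairs.
V = {1, 2, 3, 4, 5}
E = {(1, 1), (1, 2), (1, 4), (3, 2)}

# A loop is an edge of the form (a, a).
loops = {(a, b) for (a, b) in E if a == b}

# An isolated vertex is neither the origin nor the terminus of any edge.
touched = {v for edge in E for v in edge}
isolated = V - touched

# For each vertex a, the set of vertices b that are adjacent from a.
adjacent_from = {a: {b for (x, b) in E if x == a} for a in V}

print(loops)             # {(1, 1)}
print(isolated)          # {5}
print(adjacent_from[1] == {1, 2, 4})  # True
```

Because $E$ is a subset of $V \times V$, the edge set is itself a relation on $V$, which is what allows the matrix tests of this section to be applied to directed graphs.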

When we develop a flowchart to study a computer program or algorithm, we deal with a special type of directed graph where the shapes of the vertices may be important in the analysis of the algorithm. Road maps are directed graphs, where the cities and towns are represented by vertices and the highways linking any two localities are given by edges. In road maps, an edge is often directed in both directions. Consequently, if $G$ is a directed graph and $a, b \in V$, with $a \ne b$, and both $(a, b), (b, a) \in E$, then the single undirected edge $\{a, b\} = \{b, a\}$ in Fig. 7.2(b) is used to represent the two directed edges shown in Fig. 7.2(a). In this case, $a$ and $b$ are called adjacent vertices. (Directions may also be disregarded for loops.)

†Since the terminology of graph theory is not standardized, the reader may find some differences between definitions given here and in other texts.
‡In this chapter we allow only one edge from $a$ to $b$. Situations where multiple edges occur are called multigraphs. These are discussed in Chapter 11.

Directed graphs play an important role in many situations in computer science. The
                            following example demonstrates one of these.

EXAMPLE 7.26
Computer programs can be processed more rapidly when certain statements in the program
are executed concurrently. But in order to accomplish this we must be aware of the
dependence of some statements on earlier statements in the program. For we cannot execute
a statement that needs results from other statements, namely statements that have not yet
been executed.
   In Fig. 7.3(a) we have eight assignment statements that constitute the beginning of
a computer program. We represent these statements by the eight corresponding vertices
s₁, s₂, s₃, ..., s₈ in part (b) of the figure, where a directed edge such as (s₁, s₅) indicates
that statement s₅ cannot be executed until statement s₁ has been executed. The resulting
directed graph is called the precedence graph for the given lines of the computer program.
Note how this graph indicates, for example, that statement s₇ cannot be executed until after
each of the statements s₁, s₂, s₃, and s₄ has been executed. Also, we see how a statement such
as s₁ must be executed before it is possible to execute any of the statements s₂, s₄, s₅, s₇, or
s₈. In general, if a vertex (statement) s is adjacent from m other vertices (and no others), then
the corresponding statements for these vertices must be executed before statement s can
be executed. Similarly, should a vertex (statement) s be adjacent to n other vertices, then
each of the corresponding statements for these vertices requires the execution of statement
s before it can be executed. Finally, from the precedence graph we see that the statements
s₁, s₃, and s₆ can be processed concurrently. Following this, the statements s₂, s₄, and s₈
can be executed at the same time, and then the statements s₅ and s₇. (Or we could process
statements s₂ and s₄ concurrently, and then the statements s₅, s₇, and s₈.)

(a)
   (s₁)   b := 3
   (s₂)   c := b + 2
   (s₃)   a := 1
   (s₄)   d := a * b + 5
   (s₅)   e := d − 1
   (s₆)   f := 7
   (s₇)   e := c + d
   (s₈)   g := b * f

(b) [precedence graph on the vertices s₁, s₂, ..., s₈]
                                     Figure 7.3
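The reasoning of Example 7.26 can be traced mechanically. The following is a minimal Python sketch, not part of the text: the dictionary `statements` and the function names are our own illustrative choices, and we assume (as the example does) that a statement depends only on the earlier statements that produce the values it reads.

```python
# Statements of Fig. 7.3(a): name -> (variable assigned, variables read).
# This encoding is an assumption made for illustration.
statements = {
    "s1": ("b", set()),
    "s2": ("c", {"b"}),
    "s3": ("a", set()),
    "s4": ("d", {"a", "b"}),
    "s5": ("e", {"d"}),
    "s6": ("f", set()),
    "s7": ("e", {"c", "d"}),
    "s8": ("g", {"b", "f"}),
}

def direct_dependencies(stmts):
    """Map each statement to the earlier statements whose results it reads."""
    last_writer, deps = {}, {}
    for name, (target, reads) in stmts.items():
        deps[name] = {last_writer[v] for v in reads if v in last_writer}
        last_writer[target] = name
    return deps

def precedence_edges(deps):
    """Edges (s, t) meaning s must run before t, indirect dependence included."""
    closure = {}
    for name in deps:  # program order: dependencies are always processed first
        closure[name] = set(deps[name])
        for d in deps[name]:
            closure[name] |= closure[d]
    return {(s, t) for t, before in closure.items() for s in before}

def concurrent_rounds(deps):
    """Group statements into rounds; each round can be executed concurrently."""
    rounds, done = [], set()
    while len(done) < len(deps):
        ready = sorted(s for s in deps if s not in done and deps[s] <= done)
        rounds.append(ready)
        done.update(ready)
    return rounds

deps = direct_dependencies(statements)
edges = precedence_edges(deps)
rounds = concurrent_rounds(deps)
print(rounds)
```

Under these assumptions the rounds come out as s₁, s₃, s₆, then s₂, s₄, s₈, then s₅, s₇, which is the first schedule described in the example; the edges into s₇ come from s₁, s₂, s₃, and s₄.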

Now we want to consider how relations and directed graphs are interrelated. For a start,
given a set A and a relation ℛ on A, we can construct a directed graph G with vertex set A
and edge set E ⊆ A × A, where (a, b) ∈ E if a, b ∈ A and a ℛ b. This is demonstrated in
the following example.

EXAMPLE 7.27
For A = {1, 2, 3, 4}, let ℛ = {(1, 1), (1, 2), (2, 3), (3, 2), (3, 3), (3, 4), (4, 2)} be a
relation on A. The directed graph associated with ℛ is shown in Fig. 7.4(a), where the undirected
edge {2, 3} (= {3, 2}) is used in place of the pair of distinct directed edges (2, 3) and (3, 2).
If the directions in Fig. 7.4(a) are ignored, we get the associated undirected graph shown in
                                          7.2 Computer Recognition: Zero-One Matrices and Directed Graphs       351

part (b) of the figure. Here we see that the graph is connected in the sense that for any two
vertices x, y, with x ≠ y, there is a path starting at x and ending at y. Such a path consists
of a finite sequence of undirected edges, so the edges {1, 2}, {2, 4} provide a path from 1 to
4, and the edges {3, 4}, {4, 2}, and {2, 1} provide a path from 3 to 1. The sequence of edges
{3, 4}, {4, 2}, and {2, 3} provides a path from 3 to 3. Such a closed path is called a cycle.
This is an example of an undirected cycle of length 3, because it has three edges in it.

(a)                  (b)                       (c)                         (d)
                  Figure 7.4

When we are dealing with paths (in both directed and undirected graphs), no vertex
may be repeated. Therefore, the sequence of edges {a, b}, {b, e}, {e, f}, {f, b}, {b, d} in
Fig. 7.4(c) is not considered to be a path (from a to d) because we pass through the vertex b
more than once. In the case of cycles, the path starts and terminates at the same vertex and has
at least three edges. In Fig. 7.4(d) the sequence of edges (b, f), (f, e), (e, d), (d, c), (c, b)
provides a directed cycle of length 5. The six edges (b, f), (f, e), (e, b), (b, d), (d, c),
(c, b) do not yield a directed cycle in the figure because of the repetition of vertex b. If their
directions are ignored, the corresponding six edges, in part (c) of the figure, likewise pass
through vertex b more than once. Consequently, these edges are not considered to form a
cycle for the undirected graph in Fig. 7.4(c).
    Now since we require a cycle to have length at least 3, we shall not consider loops to be
cycles. We also note that loops have no bearing on graph connectivity.

We choose to define the next idea formally because of its relevance to what we did earlier
                  in Section 6.3.

Definition 7.15   A directed graph G on V is called strongly connected if for all x, y ∈ V, where x ≠ y,
                  there is a path (in G) of directed edges from x to y; that is, either the directed edge (x, y)
                  is in G or, for some n ∈ Z⁺ and distinct vertices v₁, v₂, ..., vₙ ∈ V, the directed edges
                  (x, v₁), (v₁, v₂), ..., (vₙ, y) are in G.

It is in this sense that we talked about strongly connected machines in Chapter 6. The
                  graph in Fig. 7.4(a) is connected but not strongly connected. For example, there is no
                  directed path from 3 to 1. In Fig. 7.5 the directed graph on V = {1, 2, 3, 4} is strongly
                  connected and loop-free. This is also true of the directed graph in Fig. 7.4(d).
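Definition 7.15 can be tested directly on a small graph. The sketch below is our own illustration in Python (the function names are hypothetical, not from the text). It uses the fact that if y can be reached from x by following directed edges at all, then some path from x to y without repeated vertices exists, so a simple reachability search suffices. The relation is the one from Example 7.27.

```python
def reachable(edges, start):
    """Set of vertices reachable from start along directed edges (start included)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def is_strongly_connected(vertices, edges):
    """True if every vertex can reach every other vertex by directed edges."""
    return all(reachable(edges, v) == set(vertices) for v in vertices)

def is_connected(vertices, edges):
    """Connectivity of the associated undirected graph (directions ignored)."""
    undirected = set(edges) | {(b, a) for (a, b) in edges}
    return is_strongly_connected(vertices, undirected)

A = {1, 2, 3, 4}
R = {(1, 1), (1, 2), (2, 3), (3, 2), (3, 3), (3, 4), (4, 2)}

print(is_connected(A, R))           # directions ignored
print(is_strongly_connected(A, R))  # directions respected
print(reachable(R, 3))              # vertex 1 is missing: no directed path from 3 to 1
```

For this relation the graph is connected but not strongly connected, in agreement with the remark about Fig. 7.4(a).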

[Figure 7.5]                    [Figure 7.6: graphs of ℛ₁ and ℛ₂]

EXAMPLE 7.28
For A = {1, 2, 3, 4}, consider the relations ℛ₁ = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3),
(3, 4), (4, 3), (4, 4)} and ℛ₂ = {(2, 4), (2, 3), (3, 2), (3, 3), (3, 4)}. As Fig. 7.6 illustrates,
the graphs of these relations are disconnected. However, each graph is the union of two
connected pieces called the components of the graph. For ℛ₁, the graph is made up of two
strongly connected components. For ℛ₂, one component consists of an isolated vertex, and
the other component is connected but not strongly connected.
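The components in Example 7.28 can be found by ignoring edge directions and collecting the vertices that can be reached from one another. The following Python sketch is our own illustration; the function name `components` is an assumption, not notation from the text.

```python
def components(vertices, relation):
    """Connected pieces of the graph of `relation`, with edge directions ignored."""
    neighbours = {v: set() for v in vertices}
    for (a, b) in relation:
        neighbours[a].add(b)
        neighbours[b].add(a)
    remaining, pieces = set(vertices), []
    while remaining:
        start = min(remaining)
        piece, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in neighbours[u] - piece:
                piece.add(w)
                stack.append(w)
        pieces.append(piece)
        remaining -= piece
    return pieces

A = {1, 2, 3, 4}
R1 = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4)}
R2 = {(2, 4), (2, 3), (3, 2), (3, 3), (3, 4)}

print(components(A, R1))
print(components(A, R2))
```

For ℛ₁ the two pieces are {1, 2} and {3, 4}; for ℛ₂ they are the isolated vertex 1 and the piece {2, 3, 4}.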

EXAMPLE 7.29
The graphs in Fig. 7.7 are examples of undirected graphs that are loop-free and have an
edge for every pair of distinct vertices. These graphs illustrate the complete graphs on n
vertices, which are denoted by Kₙ. In Fig. 7.7 we have examples of the complete graphs on
three, four, and five vertices, respectively. The complete graph K₂ consists of two vertices
x, y and an edge connecting them, whereas the complete graph K₁ consists of one vertex
and no edges because loops are not allowed.

[Figure 7.7: the complete graphs K₃, K₄, and K₅]

In this drawing of K₅ two edges cross, namely, {3, 5} and {1, 4}. However, there is
no point of intersection creating a new vertex. If we try to avoid the crossing of edges by
drawing the graph differently, we run into the same problem all over again. This difficulty
will be examined in Chapter 11 when we deal with the planarity of graphs.

A digraph G on a vertex set V gives rise to a relation ℛ on V where x ℛ y if (x, y) is an
edge in G. Consequently, there is a (0, 1)-matrix for G, and since this relation matrix comes
about from the adjacencies of pairs of vertices, it is referred to as the adjacency matrix for
G as well as the relation matrix for ℛ.
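As a small illustration of how the adjacency matrix arises from the edges, here is a Python sketch of our own (the function name `adjacency_matrix` is hypothetical). Rows and columns are indexed by the vertices in a fixed order, and the digraph used is the one from Example 7.27.

```python
def adjacency_matrix(order, edges):
    """(0, 1)-matrix with entry 1 in row x, column y exactly when (x, y) is an edge."""
    index = {v: i for i, v in enumerate(order)}
    M = [[0] * len(order) for _ in order]
    for (x, y) in edges:
        M[index[x]][index[y]] = 1
    return M

edges = {(1, 1), (1, 2), (2, 3), (3, 2), (3, 3), (3, 4), (4, 2)}
M = adjacency_matrix([1, 2, 3, 4], edges)
for row in M:
    print(row)
```

Each row lists the vertices that the row's vertex is adjacent to; each column lists the vertices it is adjacent from.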

At this point we tie together the properties of relations and the structure of directed
               graphs.

EXAMPLE 7.30
If A = {1, 2, 3} and ℛ = {(1, 1), (1, 2), (2, 2), (3, 3), (3, 1)}, then ℛ is a reflexive
antisymmetric relation on A, but it is neither symmetric nor transitive. The directed graph
associated with ℛ consists of five edges. Three of these edges are loops that result from the
reflexive property of ℛ. (See Fig. 7.8.) In general, if ℛ is a relation on a finite set A, then
ℛ is reflexive if and only if its directed graph contains a loop at each vertex (element of A).

EXAMPLE 7.31
The relation ℛ = {(1, 1), (1, 2), (2, 1), (2, 3), (3, 2)} is symmetric on A = {1, 2, 3}, but
it is not reflexive, antisymmetric, or transitive. The directed graph for ℛ is found in
Fig. 7.9. In general, a relation ℛ on a finite set A is symmetric if and only if its directed
graph may be drawn so that it contains only loops and undirected edges.

EXAMPLE 7.32
For A = {1, 2, 3}, consider ℛ = {(1, 1), (1, 2), (2, 3), (1, 3)}. The directed graph for ℛ is
shown in Fig. 7.10. Here ℛ is transitive and antisymmetric but not reflexive or symmetric.
The directed graph indicates that a relation on a set A is transitive if and only if it satisfies
the following: For all x, y ∈ A, if there is a (directed) path from x to y in the associated
graph, then there is an edge (x, y) also. [Here (1, 2), (2, 3) is a (directed) path from 1 to 3,
and we also have the edge (1, 3) for transitivity.] Notice that the directed graph in Fig. 7.3
of Example 7.26 also has this property.
    The relation ℛ is antisymmetric because there are no ordered pairs in ℛ of the form
(x, y) and (y, x) with x ≠ y. To use the directed graph of Fig. 7.10 to characterize
antisymmetry, we observe that for any two vertices x, y, with x ≠ y, the graph contains at most
one of the edges (x, y) or (y, x). Hence there are no undirected edges aside from loops.
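The graph-theoretic characterizations in Examples 7.30 through 7.32 translate into short tests on the edge set. The Python sketch below is our own illustration; the function names are assumptions rather than notation from the text, and each relation is given as a set of ordered pairs.

```python
def is_reflexive(A, R):
    """A loop at every vertex."""
    return all((a, a) in R for a in A)

def is_symmetric(R):
    """Every edge (a, b) is accompanied by (b, a)."""
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    """No pair of edges (a, b), (b, a) with a != b."""
    return all(a == b or (b, a) not in R for (a, b) in R)

def is_transitive(R):
    """Whenever (a, b) and (b, d) are edges, so is (a, d)."""
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

A = {1, 2, 3}
R30 = {(1, 1), (1, 2), (2, 2), (3, 3), (3, 1)}   # Example 7.30
R31 = {(1, 1), (1, 2), (2, 1), (2, 3), (3, 2)}   # Example 7.31
R32 = {(1, 1), (1, 2), (2, 3), (1, 3)}           # Example 7.32

for name, R in [("7.30", R30), ("7.31", R31), ("7.32", R32)]:
    print(name, is_reflexive(A, R), is_symmetric(R),
          is_antisymmetric(R), is_transitive(R))
```

The four truth values printed for each relation agree with the properties stated in the corresponding example.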

Figure 7.8                               Figure 7.9                            Figure 7.10

Our final example deals with equivalence relations.

EXAMPLE 7.33
For A = {1, 2, 3, 4, 5}, the following are equivalence relations on A:
    ℛ₁ = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)},
    ℛ₂ = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3),
          (4, 4), (4, 5), (5, 4), (5, 5)}.

Their associated graphs are shown in Fig. 7.11. If we ignore the loops in each graph, we
find the graph decomposed into components such as K₁, K₂, and K₃. In general, a relation
on a finite set A is an equivalence relation if and only if its associated graph is one complete

graph augmented by loops at every vertex or consists of the disjoint union of complete
                                      graphs augmented by loops at every vertex.

[Figure 7.11: graphs of ℛ₁ and ℛ₂]
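The decomposition described in Example 7.33 can be checked by computing, for each element a, the set of all b with (a, b) in the relation, and then confirming that the relation consists of exactly the pairs inside these blocks. This Python sketch is our own illustration with hypothetical function names.

```python
def blocks(A, R):
    """Distinct sets {b : (a, b) in R}, one per element a, in order of first appearance."""
    found = []
    for a in sorted(A):
        block = {b for b in A if (a, b) in R}
        if block not in found:
            found.append(block)
    return found

def is_union_of_complete_graphs_with_loops(A, R):
    """True if R is exactly the set of all pairs (x, y) with x, y in a common block."""
    return set(R) == {(x, y) for blk in blocks(A, R) for x in blk for y in blk}

A = {1, 2, 3, 4, 5}
R1 = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)}
R2 = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3),
      (4, 4), (4, 5), (5, 4), (5, 5)}

print(blocks(A, R1), is_union_of_complete_graphs_with_loops(A, R1))
print(blocks(A, R2), is_union_of_complete_graphs_with_loops(A, R2))
```

With loops ignored, the blocks of ℛ₁ correspond to two copies of K₂ and one K₁, and those of ℛ₂ to one K₃ and one K₂.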

EXERCISES 7.2

1. For A = {1, 2, 3, 4}, let ℛ and 𝒮 be the relations on A defined by ℛ = {(1, 2), (1, 3), (2, 4), (4, 4)} and 𝒮 = {(1, 1), (1, 2), (1, 3), (2, 3), (2, 4)}. Find ℛ ∘ 𝒮, 𝒮 ∘ ℛ, ℛ², ℛ³, 𝒮², and 𝒮³.

2. If ℛ is a reflexive relation on a set A, prove that ℛ² is also reflexive on A.

3. Provide a proof for the opposite inclusion in Theorem 7.1.

4. Let A = {1, 2, 3}, B = {w, x, y, z}, and C = {4, 5, 6}. Define the relations ℛ₁ ⊆ A × B, ℛ₂ ⊆ B × C, and ℛ₃ ⊆ B × C, where ℛ₁ = {(1, w), (3, w), (2, x), (1, y)}, ℛ₂ = {(w, 5), (x, 6), (y, 4), (y, )}, and ℛ₃ = {(w, 4), (w, 5), (y, 5)}. (a) Determine ℛ₁ ∘ (ℛ₂ ∪ ℛ₃) and (ℛ₁ ∘ ℛ₂) ∪ (ℛ₁ ∘ ℛ₃). (b) Determine ℛ₁ ∘ (ℛ₂ ∩ ℛ₃) and (ℛ₁ ∘ ℛ₂) ∩ (ℛ₁ ∘ ℛ₃).

5. Let A = {1, 2}, B = {m, n, p}, and C = {3, 4}. Define the relations ℛ₁ ⊆ A × B, ℛ₂ ⊆ B × C, and ℛ₃ ⊆ B × C by ℛ₁ = {(1, m), (1, n), (1, p)}, ℛ₂ = {(m, 3), (m, 4), (p, 4)}, and ℛ₃ = {(m, 3), (m, 4), (p, 3)}. Determine ℛ₁ ∘ (ℛ₂ ∩ ℛ₃) and (ℛ₁ ∘ ℛ₂) ∩ (ℛ₁ ∘ ℛ₃).

6. For sets A, B, and C, consider relations ℛ₁ ⊆ A × B, ℛ₂ ⊆ B × C, and ℛ₃ ⊆ B × C. Prove that (a) ℛ₁ ∘ (ℛ₂ ∪ ℛ₃) = (ℛ₁ ∘ ℛ₂) ∪ (ℛ₁ ∘ ℛ₃); and (b) ℛ₁ ∘ (ℛ₂ ∩ ℛ₃) ⊆ (ℛ₁ ∘ ℛ₂) ∩ (ℛ₁ ∘ ℛ₃).

7. For a relation ℛ on a set A, define ℛ⁰ = {(a, a) | a ∈ A}. If |A| = n, prove that there exist s, t ∈ N with 0 ≤ s < t ≤ 2^(n²) such that ℛˢ = ℛᵗ.

8. With A = {1, 2, 3, 4}, let ℛ = {(1, 1), (1, 2), (2, 3), (3, 3), (3, 4), (4, 4)} be a relation on A. Find two relations 𝒮, 𝒯 on A where 𝒮 ≠ 𝒯 but ℛ ∘ 𝒮 = ℛ ∘ 𝒯 = {(1, 1), (1, 2), (1, 4)}.

9. How many 6 × 6 (0, 1)-matrices A are there with A = Aᵗʳ?

10. If
        | 1 0 1 1 |
    E = | 0 0 0 1 |,
        | 1 0 0 0 |
how many (0, 1)-matrices F satisfy E ≤ F? How many (0, 1)-matrices G satisfy G ≤ E?

11. Consider the sets A = {a₁, a₂, ..., aₘ}, B = {b₁, b₂, ..., bₙ}, and C = {c₁, c₂, ..., cₚ}, where the elements in each set remain fixed in the order given here. Let ℛ₁ be a relation from A to B, and let ℛ₂ be a relation from B to C. The relation matrix for ℛᵢ is M(ℛᵢ), where i = 1, 2. The rows and columns of these matrices are indexed by the elements from the appropriate sets A, B, and C according to the orders already prescribed. The matrix for ℛ₁ ∘ ℛ₂ is the m × p matrix M(ℛ₁ ∘ ℛ₂), where the elements of A (in the order given) index the rows and the elements of C (also in the order given) index the columns.
    Show that for all 1 ≤ i ≤ m and 1 ≤ j ≤ p, the entries in the ith row and jth column of M(ℛ₁) · M(ℛ₂) and M(ℛ₁ ∘ ℛ₂) are equal. [Hence M(ℛ₁) · M(ℛ₂) = M(ℛ₁ ∘ ℛ₂).]

12. Let A be a set with |A| = n, and consider the order for the listing of its elements as fixed. For ℛ ⊆ A × A, let M(ℛ) denote the corresponding relation matrix.
    a) Prove that M(ℛ) = 0 (the n × n matrix of all 0's) if and only if ℛ = ∅.
    b) Prove that M(ℛ) = 1 (the n × n matrix of all 1's) if and only if ℛ = A × A.
    c) Use the result of Exercise 11, along with the Principle of Mathematical Induction, to prove that M(ℛᵐ) = [M(ℛ)]ᵐ, for all m ∈ Z⁺.

13. Provide the proofs for Theorem 7.2(a), (b), and (d).

14. Use Theorem 7.2 to write a computer program (or to develop an algorithm) for the recognition of equivalence relations on a finite set.

15. a) Draw the digraph G₁ = (V₁, E₁) where V₁ = {a, b, c, d, e, f} and E₁ = {(a, b), (a, d), (b, c), (b, e), (d, ), (d, e), (e, c), (e, f), (f, d)}.

    b) Draw the undirected graph G₂ = (V₂, E₂) where V₂ = {s, t, u, v, w, x, y, z} and E₂ = {{s, t}, {s, u}, {s, x}, {t, u}, {t, w}, {u, w}, {u, x}, {v, w}, {v, x}, {v, y}, {w, z}, {x, y}}.

16. For the directed graph G = (V, E) in Fig. 7.12, classify each of the following statements as true or false.
    a) Vertex c is the origin of two edges in G.
    b) Vertex g is adjacent to vertex h.
    c) There is a directed path in G from d to b.
    d) There are two directed cycles in G.

[Figure 7.12]

17. For A = {a, b, c, d, e, f}, each graph, or digraph, in Fig. 7.13 represents a relation ℛ on A. Determine the relation ℛ ⊆ A × A in each case, as well as its associated relation matrix M(ℛ).

[Figure 7.13]

18. For A = {v, w, x, y, z}, each of the following is the (0, 1)-matrix for a relation ℛ on A. Here the rows (from top to bottom) and the columns (from left to right) are indexed in the order v, w, x, y, z. Determine the relation ℛ ⊆ A × A in each case, and draw the directed graph G associated with ℛ.

    a) M(ℛ) =
                 0s
             ee ee
         10141     1 4
         10     0 0 0 |
         000        0 1
         |0    0 0 0 0]

    b) M(ℛ) =
         TO.      6u1dl6d1l     lh     OT
         1   0       1    0       0
         1   1       00 ~«21
         10             0    0    1
         0  0            1   1 «0

19. For A = {1, 2, 3, 4}, let ℛ = {(1, 1), (1, 2), (2, 3), (3, 3), (3, 4)} be a relation on A. Draw the directed graph G on A that is associated with ℛ. Do likewise for ℛ², ℛ³, and ℛ⁴.

20. a) Let G = (V, E) be the directed graph where V = {1, 2, 3, 4, 5, 6, 7} and E = {(i, j) | 1 ≤ i < j ≤ 7}.
        i) How many edges are there for this graph?
       ii) Four of the directed paths in G from 1 to 7 may be given as:
           1) (1, 7);
           2) (1, 3), (3, 5), (5, 6), (6, 7);
           3) (1, 2), (2, 3), (3, 7); and
           4) (1, 4), (4, 7).
           How many directed paths (in total) exist in G from 1 to 7?
    b) Now let n ∈ Z⁺ where n ≥ 2, and consider the directed graph G = (V, E) with V = {1, 2, 3, ..., n} and E = {(i, j) | 1 ≤ i < j ≤ n}.
        i) Determine |E|.
       ii) How many directed paths exist in G from 1 to n?
      iii) If a, b ∈ Z⁺ with 1 ≤ a < b ≤ n, how many directed paths exist in G from a to b?
    (The reader may wish to refer back to Exercise 20 in Section 3.1.)

21. Let |A| = 5. (a) How many directed graphs can one construct on A? (b) How many of the graphs in part (a) are actually undirected?

22. For |A| = 5, how many relations ℛ on A are there? How many of these relations are symmetric?

23. a) Keeping the order of the elements fixed as 1, 2, 3, 4, 5, determine the (0, 1) relation matrix for each of the equivalence relations in Example 7.33.
    b) Do the results of part (a) lead to any generalization?

24. How many (undirected) edges are there in the complete graphs K₆, K₇, and Kₙ, where n ∈ Z⁺?

25. Draw a precedence graph for the following segment found at the start of a computer program:

    (s₁)       := 1
    (s₂)       := 2
    (s₃)       := a + 3
    (s₄)       := b
    (s₅)       := 2 * a − 1
    (s₆)       := a * c
    (s₇)       := 7
    (s₈)       := c + 2

26. a) Let ℛ be the relation on A = {1, 2, 3, 4, 5, 6, 7}, where the directed graph associated with ℛ consists of the two components, each a directed cycle, shown in Fig. 7.14. Find the smallest integer n > 1, such that ℛⁿ = ℛ. What is the smallest value of n > 1 for which the graph of ℛⁿ contains some loops? Does it ever happen that the graph of ℛⁿ consists of only loops?
    b) Answer the same questions from part (a) for the relation ℛ on A = {1, 2, 3, ..., 9, 10}, if the directed graph associated with ℛ is as shown in Fig. 7.15.
    c) Do the results in parts (a) and (b) indicate anything in general?

[Figure 7.14]                    [Figure 7.15]

27. If the complete graph Kₙ has 703 edges, how many vertices does it have?

7.3
        Partial Orders: Hasse Diagrams
If you ask children to recite the numbers they know, you’ll hear a uniform response of
“1, 2, 3, ....” Without paying attention to it, they list these numbers in increasing order.
In this section we take a closer look at this idea of order, something we may have taken for
granted. We start with some observations about the sets N, Z, Q, R, and C.
    The set N is closed under the binary operations of (ordinary) addition and multiplication,
but if we seek an answer to the equation x + 5 = 2, we find that no element of N provides
a solution. So we enlarge N to Z, where we can perform subtraction as well as addition and
multiplication. However, we soon run into trouble trying to solve the equation 2x + 3 = 4.
Enlarging to Q, we can perform nonzero division in addition to the other operations. Yet
this soon proves to be inadequate; the equation x² − 2 = 0 necessitates the introduction
of the real but irrational numbers ±√2. Even after we expand from Q to R, more trouble
arises when we try to solve x² + 1 = 0. Finally we arrive at C, the complex numbers,
where any polynomial equation of the form cₙxⁿ + cₙ₋₁xⁿ⁻¹ + ··· + c₂x² + c₁x + c₀ = 0,
where cᵢ ∈ C for 0 ≤ i ≤ n, n > 0 and cₙ ≠ 0, can be solved. (This result is known as the
Fundamental Theorem of Algebra. Its proof requires material on functions of a complex
variable, so no proof is given here.) As we kept building up from N to C, gaining more
ability to solve polynomial equations, something was lost when we went from R to C. In
R, given numbers r₁, r₂, with r₁ ≠ r₂, we know that either r₁ < r₂ or r₂ < r₁. However,
in C we have (2 + i) ≠ (1 + 2i), but what meaning can we attach to a statement such

as “(2 + i) < (1 + 2i)”? We have lost the ability to “order” the elements in this number
system!
    As we start to take a closer look at the notion of order we proceed as in Section 7.1
and let A be a set with ℛ a relation on A. The pair (A, ℛ) is called a partially ordered
set, or poset, if relation ℛ on A is a partial order, or a partial ordering relation (as given in
Definition 7.6). If A is called a poset, we understand that there is a partial order ℛ on A
that makes A into this poset. Examples 7.1(a), 7.2, 7.11, and 7.15 are posets.

EXAMPLE 7.34
Let A be the set of courses offered at a college. Define the relation ℛ on A by x ℛ y if x, y
are the same course or if x is a prerequisite for y. Then ℛ makes A into a poset.

EXAMPLE 7.35   Define R on A = {1, 2, 3, 4} by x R y if x|y, that is, x (exactly) divides y. Then R =
{(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (1, 3), (1, 4), (2, 4)} is a partial order, and (A, R) is
a poset. (This is similar to what we learned in Example 7.15.)
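
The three defining properties of a partial order can be checked mechanically for a small relation such as the one in Example 7.35. The following Python sketch is only an illustration: the helper name is_partial_order and the choice to store R as a set of ordered pairs are assumptions made here, not notation from the text.

```python
from itertools import product


def is_partial_order(A, R):
    """Return True if R (a set of ordered pairs on A) is reflexive,
    antisymmetric, and transitive."""
    reflexive = all((x, x) in R for x in A)
    antisymmetric = all(x == y for (x, y) in R if (y, x) in R)
    transitive = all((x, w) in R for (x, y) in R for (z, w) in R if y == z)
    return reflexive and antisymmetric and transitive


# The "divides" relation of Example 7.35 on A = {1, 2, 3, 4}.
A = {1, 2, 3, 4}
R = {(x, y) for x, y in product(A, repeat=2) if y % x == 0}

print(sorted(R))
print(is_partial_order(A, R))
```

Adding a pair such as (2, 1) to R would break antisymmetry, and the same function would then return False.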

EXAMPLE 7.36   In the construction of a house certain jobs, such as digging the foundation, must be performed
before other phases of the construction can be undertaken. If A is a set of tasks that must
be performed in building a house, we can define a relation R on A by x R y if x, y denote
                  the same task or if task x must be performed before the start of task y. In this way we
                  place an order on the elements of A, making it into a poset that is sometimes referred to
                  as a PERT (Program Evaluation and Review Technique) network. (Such networks came
                  into play during the 1950s in order to handle the complexities that arose in organizing the
                  many individual activities required for the completion of projects on a very large scale. This
                  technique was actually developed and first used by the U.S. Navy in order to coordinate the
                  many projects that were necessary for the building of the Polaris submarine.)

Consider the diagrams given in Fig. 7.16. If part (a) were part of the directed graph
associated with a relation R, then because (1, 2), (2, 1) ∈ R with 1 ≠ 2, R could not be
antisymmetric. For part (b), if the diagram were part of the graph of a transitive relation R,
then (1, 2), (2, 3) ∈ R ⇒ (1, 3) ∈ R. Since (3, 1) ∈ R and 1 ≠ 3, R is not antisymmetric,
so it cannot be a partial order.

Figure 7.16 (parts (a) and (b))

From these observations, if we are given a relation R on a set A, and we let G be the
directed graph associated with R, then we find that:
    i)   If G contains a pair of edges of the form (a, b), (b, a), for a, b ∈ A with a ≠ b, or
    ii)  If R is transitive and G contains a directed cycle (of length greater than or equal to
         three),
then the relation R cannot be antisymmetric, so (A, R) fails to be a poset.
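
Observation (ii) can be seen concretely by closing a directed 3-cycle under transitivity. The sketch below assumes the cycle has edges (1, 2), (2, 3), (3, 1), as the discussion of Fig. 7.16(b) suggests; the function names are hypothetical helpers introduced only for this illustration.

```python
def transitive_closure(R):
    """Smallest transitive relation containing R (R is a set of pairs)."""
    closure = set(R)
    while True:
        new_pairs = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


def is_antisymmetric(R):
    """True if x R y and y R x together force x == y."""
    return all(x == y for (x, y) in R if (y, x) in R)


cycle = {(1, 2), (2, 3), (3, 1)}       # assumed edges of the directed cycle
closure = transitive_closure(cycle)

print(is_antisymmetric(cycle))          # the bare cycle has no opposite pair of edges
print((1, 3) in closure, (3, 1) in closure)
print(is_antisymmetric(closure))        # transitivity forces both (1, 3) and (3, 1)
```

The bare cycle is antisymmetric, but any transitive relation containing it must contain both (1, 3) and (3, 1), which is exactly the conflict described above.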

EXAMPLE 7.37   Consider the directed graph for the partial order in Example 7.35. Figure 7.17(a) is the
graphical representation of R. In part (b) of the figure, we have a somewhat simpler
diagram, which is called the Hasse diagram for R.

Figure 7.17 (parts (a) and (b))

When we know that a relation R is a partial order on a set A, we can eliminate the loops
at the vertices of its directed graph. Since R is also transitive, having the edges (1, 2) and
(2, 4) is enough to ensure the existence of edge (1, 4), so we need not include that edge. In
this way we obtain the diagram in Fig. 7.17(b), where we have not lost the directions on
the edges: the directions are assumed to go from the bottom to the top.

In general, if R is a partial order on a finite set A, we construct a Hasse diagram for
R on A by drawing a line segment from x up to y, if x, y ∈ A with x R y and, most
important, if there is no other element z ∈ A such that x R z and z R y. (So there is
nothing “in between” x and y.) If we adopt the convention of reading the diagram from
bottom to top, then it is not necessary to direct any edges.
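
The construction just described can be sketched in a few lines of Python. This is a minimal illustration under the assumption that the partial order is stored as a set of ordered pairs; the name hasse_edges is hypothetical.

```python
def hasse_edges(A, R):
    """Pairs (x, y) with x R y, x != y, and no other z with x R z and z R y.
    Each pair is one line segment drawn from x up to y."""
    return {
        (x, y)
        for (x, y) in R
        if x != y
        and not any(z != x and z != y and (x, z) in R and (z, y) in R for z in A)
    }


# The "divides" partial order of Example 7.35.
A = {1, 2, 3, 4}
R = {(x, y) for x in A for y in A if y % x == 0}

print(sorted(hasse_edges(A, R)))  # [(1, 2), (1, 3), (2, 4)]
```

The loops and the edge (1, 4) are gone, matching the simplification from Fig. 7.17(a) to Fig. 7.17(b).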

EXAMPLE 7.38   In Fig. 7.18 we have the Hasse diagrams for the following four posets. (a) With U = {1, 2, 3}
and A = P(U), R is the subset relation on A. (b) Here R is the “(exactly) divides” relation
applied to A = {1, 2, 4, 8}. (c) and (d) Here the same relation as in part (b) is applied to
{2, 3, 5, 7} in part (c) and to {2, 3, 5, 6, 7, 11, 12, 35, 385} in part (d). In part (c) we note
that a Hasse diagram can have all isolated vertices; it can also have two (or more) connected
pieces, as shown in part (d).

Figure 7.18 (parts (a) through (d))

EXAMPLE 7.39   Let A = {1, 2, 3, 4, 5}. The relation R on A, defined by x R y if x ≤ y, is a partial order.
This makes A into a poset that we can denote by (A, ≤). If B = {1, 2, 4} ⊂ A, then the set
(B × B) ∩ R = {(1, 1), (2, 2), (4, 4), (1, 2), (1, 4), (2, 4)} is a partial order on B.
   In general if R is a partial order on A, then for each subset B of A, (B × B) ∩ R makes
B into a poset where the partial order on B is induced from R.
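
As a quick check of Example 7.39, the induced order (B × B) ∩ R can be computed directly. The variable names below are assumptions chosen for this sketch.

```python
# The "less than or equal to" order on A, stored as a set of pairs.
A = {1, 2, 3, 4, 5}
R = {(x, y) for x in A for y in A if x <= y}

# Restrict R to the subset B: this is (B x B) intersected with R.
B = {1, 2, 4}
induced = {(x, y) for (x, y) in R if x in B and y in B}

print(sorted(induced))
```

The result contains the six pairs listed in the example, and it inherits reflexivity, antisymmetry, and transitivity from R.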

We turn now to a special type of partial order.

Definition 7.16   If (A, R) is a poset, we say that A is totally ordered (or, linearly ordered) if for all x, y ∈ A
either x R y or y R x. In this case R is called a total order (or, a linear order).

EXAMPLE 7.40
a) On the set ℕ, the relation R defined by x R y if x ≤ y is a total order.
b) The subset relation applied to A = P(U), where U = {1, 2, 3}, is a partial, but not
   total, order: {1, 2}, {1, 3} ∈ A but we have neither {1, 2} ⊆ {1, 3} nor {1, 3} ⊆ {1, 2}.
c) The Hasse diagram in part (b) of Fig. 7.18 shows a total order. In Fig. 7.19(a) we have
   the directed graph for this total order, alongside its Hasse diagram in part (b).

Figure 7.19
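
The comparability condition of Definition 7.16 is easy to test on finite examples. In the sketch below the function name is_total is hypothetical, and because ℕ is infinite the first check uses only the finite sample {0, 1, ..., 5}, which is an assumption made purely for illustration.

```python
from itertools import combinations


def is_total(A, R):
    """True if every two elements of A are comparable: x R y or y R x."""
    return all((x, y) in R or (y, x) in R for x in A for y in A)


# A finite sample of (N, <=); the full set N cannot be enumerated.
sample = range(6)
leq = {(x, y) for x in sample for y in sample if x <= y}

# The subset relation on P(U) for U = {1, 2, 3}.
U = [1, 2, 3]
P = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]
subset = {(S, T) for S in P for T in P if S <= T}

print(is_total(sample, leq))    # every pair of integers is comparable
print(is_total(P, subset))      # {1, 2} and {1, 3} are not comparable
```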

Could these notions of partial and total order ever arise in an industrial problem?
   Say a toy manufacturer is about to market a new product and must include a set of
instructions for its assembly. In order to assemble the new toy, there are seven tasks, denoted
A, B, C, ..., G, that one must perform in the partial order given by the Hasse diagram of
Fig. 7.20. Here we see, for example, that all of the tasks B, A, and E must be completed
before we can work on task C. Since the set of instructions is to consist of a listing of these
tasks, numbered 1, 2, 3, ..., 7, how can the manufacturer write the listing and make sure
that the partial order of the Hasse diagram is maintained?
   What we are really asking for here is whether we can take the partial order R, given by
the Hasse diagram, and find a total order T on these tasks for which R ⊆ T. The answer is
yes, and the technique that we need is known as topological sorting.

Figure 7.20

Topological Sorting Algorithm
(for a partial order R on a set A with |A| = n)
   Step 1: Set k = 1. Let H_1 be the Hasse diagram of the partial order.
   Step 2: Select a vertex v_k in H_k such that no (implicitly directed) edge in H_k starts
   at v_k.
   Step 3: If k = n, the process is completed and we have a total order
                  T: v_n < v_{n-1} < ... < v_2 < v_1
   that contains R.
   If k < n, then remove from H_k the vertex v_k and all (implicitly directed) edges of H_k
   that terminate at v_k. Call the result H_{k+1}. Increase k by 1 and return to step (2).

Here we have presented our algorithm as a precise list of instructions, with no concern
                about the particulars of the pseudocode used in earlier chapters and with no reference to its
                implementation in a particular computer language.
   Before we apply this algorithm† to the problem at hand, we should observe the deliberate
use of “a” before the word “vertex” in step (2). This implies that the selection need not be
unique and that we can get several different total orders T containing R. Also, in step (3), for
vertices v_i, v_{i-1} where 2 ≤ i ≤ n, the notation v_i < v_{i-1} is used because it is more suggestive
of “v_i before v_{i-1}” than is the notation v_i T v_{i-1}.
                    In Fig. 7.21, we show the Hasse diagrams that evolve as we apply the topological sorting
                algorithm to the partial order in Fig. 7.20. Below each diagram, the total order is listed as
                it evolves.

Figure 7.21: the Hasse diagrams H_1 through H_7 (k = 1, ..., 7), each shown with the total order
built so far: D; F < D; G < F < D; C < G < F < D; A < C < G < F < D;
B < A < C < G < F < D; E < B < A < C < G < F < D.

If the toy manufacturer writes the instructions in a list as 1-E, 2-B, 3-A, 4-C, 5-G, 6-F,
7-D, he or she will have a total order that preserves the partial order needed for correct
assembly. This total order is one of 12 possible answers.

† Here we are only concerned with applying this algorithm. Hence we are assuming that it works and we shall
not present a proof of that fact. Furthermore, we may operate similarly with other algorithms we encounter.

As is typical in discrete and combinatorial mathematics, this algorithm provides a pro-
                cedure that reduces the size of the problem with each successive application.
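
The three steps of the topological sorting algorithm can be sketched in Python as follows. This is one possible rendering under stated assumptions, not an implementation from the text: the Hasse diagram is stored as a set of pairs (x, y) meaning "x lies directly below y", ties in step (2) are broken by taking the largest candidate, and the example poset is the set of divisors of 12 under "divides" (the edges of Fig. 7.20 are not reproduced here).

```python
def topological_sort(vertices, hasse):
    """Return a list v_n, ..., v_2, v_1 giving a total order that contains
    the partial order whose Hasse diagram has the given upward edges."""
    remaining = set(vertices)
    edges = set(hasse)
    selected = []                      # v_1, v_2, ..., in order of selection
    while remaining:
        # Step 2: a vertex at which no (implicitly directed) edge starts.
        candidates = [v for v in remaining
                      if not any(x == v for (x, _) in edges)]
        v = max(candidates)            # arbitrary tie-break (an assumption)
        selected.append(v)
        # Step 3: remove v and every edge that terminates at v.
        remaining.remove(v)
        edges = {(x, y) for (x, y) in edges if y != v}
    return selected[::-1]              # read as v_n < ... < v_2 < v_1


divisors = [1, 2, 3, 4, 6, 12]
hasse = {(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 12), (6, 12)}

order = topological_sort(divisors, hasse)
print(order)  # [1, 2, 3, 4, 6, 12]
```

A different tie-break in step (2) would generally give a different, equally valid total order, which is the non-uniqueness noted after the algorithm.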

The next example provides a situation where the number of distinct total orders for a
                particular partial order is determined.

EXAMPLE 7.41‡   Let p, q be distinct primes. In part (a) of Fig. 7.22 we have the Hasse diagram for the partial
order R of all positive-integer divisors of p^2 q. Applying the topological sorting algorithm
to this Hasse diagram, we find in Fig. 7.22(b) the five total orders T_i, where R ⊆ T_i, for
1 ≤ i ≤ 5.

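Figure 7.22
(a) Hasse diagram for the divisors of p^2 q, with vertices marked
    p^2 q (+), pq (+), q (+), p^2 (-), p (-), 1 (-).
(b) The five total orders, each followed by its list of signs:
    T_1: p^2 q > pq > q > p^2 > p > 1       +, +, +, -, -, -
    T_2: p^2 q > pq > p^2 > p > q > 1       +, +, -, -, +, -
    T_3: p^2 q > p^2 > pq > q > p > 1       +, -, +, +, -, -
    T_4: p^2 q > pq > p^2 > q > p > 1       +, +, -, +, -, -
    T_5: p^2 q > p^2 > pq > p > q > 1       +, -, +, -, +, -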

Now look at Fig. 7.22 again. This time focus on the three plus signs and three minus
signs in part (a) of the figure and in the list below each total order in part (b). When we
apply the topological sorting algorithm to the given partial order R, step (2) of the algorithm
implies that the first divisor selected is always p^2 q. This accounts for the first plus sign in
each T_i, 1 ≤ i ≤ 5. Continuing to apply the algorithm we get two more plus signs and the
three minus signs.
   Could there ever be more minus signs than plus signs in our corresponding list, as a total
order is developed? For example, could we start with +, -, -? If so, we have failed to
correctly apply step (2) of the topological sorting algorithm: we should have recognized
pq as the unique candidate to select after p^2 q and p^2. In fact, for 0 ≤ k ≤ 2, p^k q must be
selected before p^k can be. Consequently, for each list of three plus signs and three minus
signs, there are always at least as many plus signs as minus signs, as the list is read from
left to right. Comparing now with the result in part (a) of Example 1.43, we see that the
number of total orders for the given partial order is $5 = \frac{1}{3+1}\binom{2 \cdot 3}{3}$. Further, for n ≥ 1, the
topological sorting algorithm can be applied to the partial order of all positive divisors of
p^{n-1} q to yield $\frac{1}{n+1}\binom{2n}{n}$ total orders, another instance where the Catalan numbers arise.
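
The count in Example 7.41 can be confirmed by brute force for small cases. The sketch below takes p = 2 and q = 3 (an assumption for illustration), tries every arrangement of the divisors of p^{n-1} q, and compares the number of valid total orders with the Catalan formula; the function names are hypothetical, and the exhaustive search is only practical for a handful of divisors.

```python
from itertools import permutations
from math import comb


def count_total_orders(n):
    """Count the total orders that contain the 'divides' partial order
    on the positive divisors of n, by checking every permutation."""
    divs = [d for d in range(1, n + 1) if n % d == 0]
    count = 0
    for perm in permutations(divs):
        position = {d: i for i, d in enumerate(perm)}
        if all(position[a] < position[b]
               for a in divs for b in divs if a != b and b % a == 0):
            count += 1
    return count


def catalan(n):
    """The value (1 / (n + 1)) * C(2n, n)."""
    return comb(2 * n, n) // (n + 1)


# p = 2, q = 3: the divisors of p^(n-1) * q for n = 1, 2, 3.
for n in (1, 2, 3):
    print(n, count_total_orders(2 ** (n - 1) * 3), catalan(n))
```

For n = 3 this examines the 720 arrangements of the six divisors of 12 and finds the five total orders of Fig. 7.22(b).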

‡ This example refers back to the optional material on Catalan numbers in Section 1.5. It may be skipped with
no loss of continuity.

In the topological sorting algorithm, we saw how the Hasse diagram was used in
determining a total order containing a given poset (A, R). This algorithm now prompts us to
examine further properties of a partial order. At the start, particular emphasis will be given
to a vertex like the vertex v_k in step (2) of the algorithm. The special property exhibited by
such a vertex is now considered in the following.

Definition 7.17   If (A, R) is a poset, then an element x ∈ A is called a maximal element of A if for all
a ∈ A, a ≠ x ⇒ (x, a) ∉ R. An element y ∈ A is called a minimal element of A if whenever
b ∈ A and b ≠ y, then (b, y) ∉ R.

If we use the contrapositive of the first statement in Definition 7.17, then we can state
that x (∈ A) is a maximal element if for each a ∈ A, x R a ⇒ x = a. In a similar manner,
y ∈ A is a minimal element if for each b ∈ A, b R y ⇒ b = y.

EXAMPLE 7.42   Let U = {1, 2, 3} and A = P(U).
a) Let R be the subset relation on A. Then U is maximal and ∅ is minimal for the poset
   (A, ⊆).
b) For B, the collection of proper subsets of {1, 2, 3}, let R be the subset relation on B.
   In the poset (B, ⊆), the sets {1, 2}, {1, 3}, and {2, 3} are all maximal elements; ∅ is
   still the only minimal element.

EXAMPLE 7.43   With R the “less than or equal to” relation on the set ℤ, we find that (ℤ, ≤) is a poset with
neither a maximal nor a minimal element. The poset (ℕ, ≤), however, has minimal element
0 but no maximal element.

EXAMPLE 7.44   When we look back at the partial orders in parts (b), (c), and (d) of Example 7.38, the
following observations come to light.

1) The partial order in part (b) has the unique maximal element 8 and the unique minimal
   element 1.
2) Each of the four elements 2, 3, 5, and 7 is both a maximal element and a minimal
   element for the poset in part (c) of Example 7.38.
3) In part (d) the elements 12 and 385 are both maximal. Each of the elements 2, 3, 5,
   7, and 11 is a minimal element for this partial order.
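
The observations of Example 7.44 follow directly from Definition 7.17 and can be reproduced with a short Python sketch. The function names and the set-of-pairs representation are assumptions made for this illustration.

```python
def maximal_elements(A, R):
    """Elements x with no a != x such that x R a."""
    return {x for x in A if all(a == x or (x, a) not in R for a in A)}


def minimal_elements(A, R):
    """Elements y with no b != y such that b R y."""
    return {y for y in A if all(b == y or (b, y) not in R for b in A)}


def divides_on(A):
    """The 'divides' relation on a finite set of positive integers."""
    return {(x, y) for x in A for y in A if y % x == 0}


part_c = {2, 3, 5, 7}
part_d = {2, 3, 5, 6, 7, 11, 12, 35, 385}

print(sorted(maximal_elements(part_d, divides_on(part_d))))  # [12, 385]
print(sorted(minimal_elements(part_d, divides_on(part_d))))  # [2, 3, 5, 7, 11]
print(maximal_elements(part_c, divides_on(part_c)) == part_c)
```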

Are there any conditions indicating when a poset must have a maximal or minimal
                              element?

THEOREM 7.3   If (A, R) is a poset and A is finite, then A has both a maximal and a minimal element.
Proof: Let a_1 ∈ A. If there is no element a ∈ A where a ≠ a_1 and a_1 R a, then a_1 is maximal.
Otherwise there is an element a_2 ∈ A with a_2 ≠ a_1 and a_1 R a_2. If no element a ∈ A, a ≠ a_2,
satisfies a_2 R a, then a_2 is maximal. Otherwise we can find a_3 ∈ A so that a_3 ≠ a_2, a_3 ≠ a_1
(Why?) while a_1 R a_2 and a_2 R a_3. Continuing in this manner, since A is finite, we get to
an element a_n ∈ A with (a_n, a) ∉ R for all a ∈ A where a ≠ a_n, so a_n is maximal.
   The proof for a minimal element follows in a similar way.

Returning now to the topological sorting algorithm, we see that in each iteration of
step (2) of the algorithm, we are selecting a maximal element from the original poset (A, R),
or a poset of the form (B, R′) where ∅ ≠ B ⊂ A and R′ = (B × B) ∩ R. At least one such
element exists (in each iteration) by virtue of Theorem 7.3. Then in the second part of
step (3), if x is the maximal element selected [in step (2)], we remove from the present
poset all elements of the form (a, x). This results in a smaller poset.

We turn now to the study of some additional concepts involving posets.

Definition 7.18   If (A, R) is a poset, then an element x ∈ A is called a least element if x R a for all a ∈ A.
Element y ∈ A is called a greatest element if a R y for all a ∈ A.

EXAMPLE 7.45   Let U = {1, 2, 3}, and let R be the subset relation.
a) With A = P(U), the poset (A, ⊆) has ∅ as a least element and U as a greatest element.
b) For B = the collection of nonempty subsets of U, the poset (B, ⊆) has U as a greatest
   element. There is no least element here, but there are three minimal elements.

EXAMPLE 7.46   For the partial orders in Example 7.38, we find that
                        1) The partial order in part (b) has a greatest element 8 and a least element 1.
                        2) There is no greatest element or least element for the poset in part (c).
                        3) No greatest element or least element exists for the partial order in part (d).
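
Definition 7.18 asks for an element comparable to, and above or below, every element of A, which is a stronger requirement than being maximal or minimal. A minimal Python sketch of the check, with hypothetical function names and with None used to signal that no such element exists:

```python
def greatest_element(A, R):
    """Return y with a R y for all a in A, or None if there is none."""
    for y in A:
        if all((a, y) in R for a in A):
            return y
    return None


def least_element(A, R):
    """Return x with x R a for all a in A, or None if there is none."""
    for x in A:
        if all((x, a) in R for a in A):
            return x
    return None


def divides_on(A):
    """The 'divides' relation on a finite set of positive integers."""
    return {(x, y) for x in A for y in A if y % x == 0}


part_b = {1, 2, 4, 8}
part_c = {2, 3, 5, 7}
part_d = {2, 3, 5, 6, 7, 11, 12, 35, 385}

for A in (part_b, part_c, part_d):
    R = divides_on(A)
    print(greatest_element(A, R), least_element(A, R))
```

Only part (b) has a greatest and a least element; for parts (c) and (d) both functions return None, in agreement with Example 7.46.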

We have seen that it is possible for a poset to have several maximal and minimal elements.
                     What about least and greatest elements?

THEOREM 7.4   If the poset (A, R) has a greatest (least) element, then that element is unique.
Proof: Suppose that x, y ∈ A and that both are greatest elements. Since x is a greatest
element, y R x. Likewise, x R y because y is a greatest element. As R is antisymmetric, it
follows that x = y.
   The proof for the least element is similar.

Definition 7.19   Let (A, R) be a poset with B ⊆ A. An element x ∈ A is called a lower bound of B if x R b
for all b ∈ B. Likewise, an element y ∈ A is called an upper bound of B if b R y for all
b ∈ B.
   An element x′ ∈ A is called a greatest lower bound (glb) of B if it is a lower bound of B
and if for all other lower bounds x″ of B we have x″ R x′. Similarly y′ ∈ A is a least upper
bound (lub) of B if it is an upper bound of B and if y′ R y″ for all other upper bounds y″
of B.

EXAMPLE 7.47   Let U = {1, 2, 3, 4}, with A = P(U), and let R be the subset relation on A. If B =
{{1}, {2}, {1, 2}}, then {1, 2}, {1, 2, 3}, {1, 2, 4}, and {1, 2, 3, 4} are all upper bounds for
B (in (A, R)), whereas {1, 2} is a least upper bound (and is in B). Meanwhile, a greatest
lower bound for B is ∅, which is not in B.
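
The bounds in Example 7.47 can be enumerated directly from Definition 7.19. The Python sketch below is an illustration only; the helper names are assumptions, subsets are modeled as frozensets, and lub and glb return a list so that "no such bound" appears as an empty list.

```python
from itertools import combinations


def power_set(U):
    """All subsets of U, as frozensets."""
    items = sorted(U)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def upper_bounds(A, R, B):
    return [y for y in A if all((b, y) in R for b in B)]


def lower_bounds(A, R, B):
    return [x for x in A if all((x, b) in R for b in B)]


def lub(A, R, B):
    """List holding the least upper bound of B, or [] if none exists."""
    ubs = upper_bounds(A, R, B)
    return [y for y in ubs if all((y, z) in R for z in ubs)]


def glb(A, R, B):
    """List holding the greatest lower bound of B, or [] if none exists."""
    lbs = lower_bounds(A, R, B)
    return [x for x in lbs if all((z, x) in R for z in lbs)]


A = power_set({1, 2, 3, 4})
R = {(S, T) for S in A for T in A if S <= T}      # the subset relation
B = [frozenset({1}), frozenset({2}), frozenset({1, 2})]

print([sorted(S) for S in upper_bounds(A, R, B)])
print([sorted(S) for S in lub(A, R, B)])          # [[1, 2]]
print([sorted(S) for S in glb(A, R, B)])          # [[]], the empty set
```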

EXAMPLE 7.48   Let R be the “less than or equal to” relation for the poset (A, R).
a) If A = ℝ and B = [0, 1], then B has glb 0 and lub 1. Note that 0, 1 ∈ B. For C =
   (0, 1], C has glb 0 and lub 1, and 1 ∈ C but 0 ∉ C.
b) Keeping A = ℝ, let B = {q ∈ ℚ | q^2 < 2}. Then B has √2 as a lub and -√2 as a glb,
   and neither of these real numbers is in B.
c) Now let A = ℚ, with B as in part (b). Here B has no lub or glb.

These examples lead us to the following result.

THEOREM 7.5   If (A, R) is a poset and B ⊆ A, then B has at most one lub (glb).
Proof: We leave the proof to the reader.

We close this section with one last ordered structure.

Definition 7.20   The poset (A, R) is called a lattice if for all x, y ∈ A the elements lub{x, y} and glb{x, y}
both exist in A.

EXAMPLE 7.49   For A = ℕ and x, y ∈ ℕ, define x R y by x ≤ y. Then lub{x, y} = max{x, y}, glb{x, y} =
min{x, y}, and (ℕ, ≤) is a lattice.

EXAMPLE 7.50   For the poset in Example 7.45(a), if S, T ⊆ U, with lub{S, T} = S ∪ T and glb{S, T} =
S ∩ T, then (P(U), ⊆) is a lattice.

EXAMPLE 7.51   Consider the poset in Example 7.38(d). Here we find, for example, that

   lub{2, 3} = 6, lub{3, 6} = 6, lub{5, 7} = 35, lub{7, 11} = 385, lub{11, 35} = 385,

and

   glb{3, 6} = 3, glb{2, 12} = 2, glb{35, 385} = 35.

However, even though lub{2, 3} exists, there is no glb for the elements 2 and 3. In
addition, we are also lacking (among other considerations) glb{5, 7}, glb{11, 35}, glb{3, 35},
and lub{3, 35}. Consequently, this partial order is not a lattice.
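
Definition 7.20 can be tested exhaustively on a finite poset by looking for lub{x, y} and glb{x, y} for every pair. The following Python sketch applies this to the poset of Example 7.51; the function names are hypothetical, and an empty list signals that the bound does not exist.

```python
def lub_pair(A, R, x, y):
    """List holding lub{x, y}, or [] if it does not exist."""
    ubs = [u for u in A if (x, u) in R and (y, u) in R]
    return [u for u in ubs if all((u, v) in R for v in ubs)]


def glb_pair(A, R, x, y):
    """List holding glb{x, y}, or [] if it does not exist."""
    lbs = [w for w in A if (w, x) in R and (w, y) in R]
    return [w for w in lbs if all((v, w) in R for v in lbs)]


def is_lattice(A, R):
    """True if every pair of elements has both a lub and a glb in A."""
    return all(lub_pair(A, R, x, y) and glb_pair(A, R, x, y)
               for x in A for y in A)


A = [2, 3, 5, 6, 7, 11, 12, 35, 385]
R = {(x, y) for x in A for y in A if y % x == 0}   # "divides"

print(lub_pair(A, R, 2, 3))    # [6]
print(glb_pair(A, R, 2, 3))    # []  (2 and 3 have no common lower bound)
print(lub_pair(A, R, 3, 35))   # []  (no common upper bound)
print(is_lattice(A, R))        # False
```

Running the same check on {1, 2, 4, 8} under "divides" returns True, as expected for a total order.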

EXERCISES 7.3

1. Draw the Hasse diagram for the poset (P(U), ⊆), where U = {1, 2, 3, 4}.

2. Let A = {1, 2, 3, 6, 9, 18}, and define R on A by x R y if x|y. Draw the Hasse diagram for the poset (A, R).

3. Let (A, R_1), (B, R_2) be two posets. On A × B, define relation R by (a, b) R (x, y) if a R_1 x and b R_2 y. Prove that
R is a partial order.

4. If R_1, R_2 in Exercise 3 are total orders, is R a total order?

5. Topologically sort the Hasse diagram in part (a) of Example 7.38.

6. For A = {a, b, c, d, e}, the Hasse diagram for the poset (A, R) is shown in Fig. 7.23. (a) Determine the relation
matrix for R. (b) Construct the directed graph G (on A) that is associated with R. (c) Topologically sort the poset (A, R).

7. The directed graph G for a relation R on set A = {1, 2, 3, 4} is shown in Fig. 7.24. (a) Verify that (A, R) is a poset and
find its Hasse diagram. (b) Topologically sort (A, R). (c) How many more directed edges are needed in Fig. 7.24 to extend
(A, R) to a total order?

Figure 7.23                                  Figure 7.24

8. Prove that if a poset (A, R) has a least element, it is unique.

9. Prove Theorem 7.5.

10. Give an example of a poset with four maximal elements but no greatest element.

11. If (A, R) is a poset but not a total order, and ∅ ≠ B ⊂ A, does it follow that (B × B) ∩ R makes B into a poset but not
a total order?

12. If R is a relation on A, and G is the associated directed graph, how can one recognize from G that (A, R) is a total
order?

13. If G is the directed graph for a relation R on A, with |A| = n, and (A, R) is a total order, how many edges (including
loops) are there in G?

14. Let M(R) be the relation matrix for relation R on A, with |A| = n. If (A, R) is a total order, how many 1's appear in
M(R)?

15. a) Describe the structure of the Hasse diagram for a totally ordered poset (A, R), where |A| = n ≥ 1.
    b) For a set A where |A| = n ≥ 1, how many relations on A are total orders?

16. a) For A = {a_1, a_2, ..., a_n}, let (A, R) be a poset. If M(R) is the corresponding relation matrix, how can we
    recognize a maximal or minimal element of the poset from M(R)?
    b) How can one recognize the existence of a greatest or least element in (A, R) from the relation matrix M(R)?

17. Let U = {1, 2, 3, 4}, with A = P(U), and let R be the subset relation on A. For each of the following subsets B (of A),
determine the lub and glb of B.
    a) B = {{1}, {2}}
    b) B = {{1}, {2}, {3}, {1, 2}}
    c) B = {∅, {1}, {2}, {1, 2}}
    d) B = {{1}, {1, 2}, {1, 3}, {1, 2, 3}}
    e) B = {{1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}}

18. Let U = {1, 2, 3, 4, 5, 6, 7}, with A = P(U), and let R be the subset relation on A. For B = {{1}, {2}, {2, 3}} ⊆ A,
determine each of the following.
    a) The number of upper bounds of B that contain (i) three elements of U; (ii) four elements of U; (iii) five elements
    of U
    b) The number of upper bounds that exist for B
    c) The lub for B
    d) The number of lower bounds that exist for B
    e) The glb for B

19. Define the relation R on the set ℤ by a R b if a - b is a nonnegative even integer. Verify that R defines a partial order
for ℤ. Is this partial order a total order?

20. For X = {0, 1}, let A = X × X. Define the relation R on A by (a, b) R (c, d) if (i) a < c; or (ii) a = c and b ≤ d.
(a) Prove that R is a partial order for A. (b) Determine all minimal and maximal elements for this partial order. (c) Is there
a least element? Is there a greatest element? (d) Is this partial order a total order?

21. Let X = {0, 1, 2} and A = X × X. Define the relation R on A as in Exercise 20. Answer the same questions posed in
Exercise 20 for this relation R and set A.

22. For n ∈ ℤ⁺, let X = {0, 1, 2, ..., n - 1, n} and A = X × X. Define the relation R on A as in Exercise 20. Remember
that each element in this total order R is an ordered pair whose components are themselves ordered pairs. How many
such elements are there in R?

23. Let (A, R) be a poset. Prove or disprove each of the following statements.
    a) If (A, R) is a lattice, then it is a total order.
    b) If (A, R) is a total order, then it is a lattice.

24. If (A, R) is a lattice, with A finite, prove that (A, R) has a greatest element and a least element.

25. For A = {a, b, c, d, e, v, w, x, y, z}, consider the poset (A, R) whose Hasse diagram is shown in Fig. 7.25. Find
    a) glb{b, c}                     b) glb{b, w}
    c) glb{e, x}                     d) lub{c, >}
    e) lub{d, x}                     f) lub{c, e}
    g) lub{a, v}

Is (A, R) a lattice? Is there a maximal element? a minimal element? a greatest element? a least element?

Figure 7.25

26. Given partial orders (A, R) and (B, S), a function f: A → B is called order-preserving if for all x, y ∈ A, x R y ⇒
f(x) S f(y). How many such order-preserving functions are there for each of the following, where R, S both denote ≤ (the
usual “less than or equal to” relation)?
    a) A = {1, 2, 3, 4}, B = {1, 2};
    b) A = {1, ..., n}, n ≥ 1, B = {1, 2};
    c) A = {a_1, a_2, ..., a_n} ⊂ ℤ⁺, n ≥ 1, a_1 < a_2 < ... < a_n, B = {1, 2};
    d) A = {1, 2}, B = {1, 2, 3, 4};
    e) A = {1, 2}, B = {1, ..., n}, n ≥ 1; and
    f) A = {1, 2}, B = {b_1, b_2, ..., b_n} ⊂ ℤ⁺, n ≥ 1, b_1 < b_2 < ... < b_n.

27. Let p, q, r, s be four distinct primes and m, n, k, ℓ ∈ ℤ⁺. How many edges are there in the Hasse diagram of all
positive divisors of (a) p*; (b) p^m; (c) p°q?; (d) p^m q^n; (e) pq’r*; (f) p^m q^n r^k; (g) peq?r’s’; and (h) p^m q^n r^k s^ℓ?

28. Find the number of ways to totally order the partial order of all positive-integer divisors of (a) 24; (b) 75; and (c) 1701.

29. Let p, q be distinct primes and k ∈ ℤ⁺. If there are 429 ways to totally order the partial order of positive-integer
divisors of p^k q, how many positive-integer divisors are there for this partial order?

30. For m, n ∈ ℤ⁺, let A be the set of all m × n (0, 1)-matrices. Prove that the “precedes” relation of Definition 7.11 makes A
into a poset.

7.4   Equivalence Relations and Partitions
As we noted earlier in Definition 7.7, a relation R on a set A is an equivalence relation if it is reflexive, symmetric, and transitive. For any set A ≠ ∅, the relation of equality is an equivalence relation on A, where two elements of A are related if they are identical; equality thus establishes the property of “sameness” among the elements of A.
    If we consider the relation R on Z defined by x R y if x − y is a multiple of 2, then R is an equivalence relation on Z where all even integers are related, as are all odd integers. Here, for example, we do not have 4 = 8, but we do have 4 R 8, for we no longer care about the size of a number but are concerned with only two properties: “evenness” and “oddness.” This relation splits Z into two subsets consisting of the odd and even integers: Z = {..., −3, −1, 1, 3, ...} ∪ {..., −4, −2, 0, 2, 4, ...}. This splitting up of Z is an example of a partition, a concept closely related to the equivalence relation. In this section we investigate this relationship and see how it helps us count the number of equivalence relations on a finite set.

Definition 7.21   Given a set A and index set I, let ∅ ≠ Aᵢ ⊆ A for each i ∈ I. Then {Aᵢ}_{i∈I} is a partition of A if

    a) $A = \bigcup_{i \in I} A_i$   and   b) Aᵢ ∩ Aⱼ = ∅, for all i, j ∈ I where i ≠ j.

Each subset Aᵢ is called a cell or block of the partition.

EXAMPLE 7.52   If A = {1, 2, 3, ..., 10}, then each of the following determines a partition of A:

a) A₁ = {1, 2, 3, 4, 5}, A₂ = {6, 7, 8, 9, 10}

b) A₁ = {1, 2, 3}, A₂ = {4, 6, 7, 9}, A₃ = {5, 8, 10}
c) Aᵢ = {i, i + 5}, 1 ≤ i ≤ 5

In these three examples we note how each element of A belongs to exactly one cell in each partition.
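
The two conditions of Definition 7.21 can be checked mechanically for finite sets. The following is a minimal Python sketch, not part of the text; the function name `is_partition` and the variable names are illustrative assumptions. It tests the three collections of Example 7.52.

```python
def is_partition(cells, universe):
    """Return True if `cells` is a partition of `universe` (Definition 7.21)."""
    cells = [set(c) for c in cells]
    universe = set(universe)
    # Every cell must be nonempty and contained in the universe.
    if any(not c or not c <= universe for c in cells):
        return False
    union = set().union(*cells)
    # Condition (a): the cells cover the universe.
    covers = union == universe
    # Condition (b): pairwise disjoint, i.e. the cell sizes add up with no overlap.
    disjoint = sum(len(c) for c in cells) == len(union)
    return covers and disjoint


A = range(1, 11)
part_a = [{1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}]
part_b = [{1, 2, 3}, {4, 6, 7, 9}, {5, 8, 10}]
part_c = [{i, i + 5} for i in range(1, 6)]

for cells in (part_a, part_b, part_c):
    print(is_partition(cells, A))
```

Comparing the sum of the cell sizes with the size of the union is one simple way to detect an overlap; a pairwise intersection test would work equally well.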

EXAMPLE 7.53   Let A = ℝ and, for each i ∈ Z, let Aᵢ = [i, i + 1). Then {Aᵢ}_{i∈Z} is a partition of ℝ.

Now just how do partitions come into play with equivalence relations?

Definition 7.22   Let R be an equivalence relation on a set A. For each x ∈ A, the equivalence class of x, denoted [x], is defined by [x] = {y ∈ A | y R x}.

EXAMPLE 7.54   Define the relation R on Z by x R y if 4 | (x − y). Since R is reflexive, symmetric, and transitive, it is an equivalence relation and we find that

    [0] = {..., −8, −4, 0, 4, 8, 12, ...} = {4k | k ∈ Z}
    [1] = {..., −7, −3, 1, 5, 9, 13, ...} = {4k + 1 | k ∈ Z}
    [2] = {..., −6, −2, 2, 6, 10, 14, ...} = {4k + 2 | k ∈ Z}
    [3] = {..., −5, −1, 3, 7, 11, 15, ...} = {4k + 3 | k ∈ Z}.

    But what about [n], where n is an integer other than 0, 1, 2, or 3? For example, what is [6]? We claim that [6] = [2] and to prove this we use Definition 3.2 (for the equality of sets) as follows. If x ∈ [6], then from Definition 7.22 we know that x R 6. Here this means that 4 divides (x − 6), so x − 6 = 4k for some k ∈ Z. But then x − 6 = 4k ⇒ x − 2 = 4(k + 1) ⇒ 4 divides (x − 2) ⇒ x R 2 ⇒ x ∈ [2], so [6] ⊆ [2]. For the opposite inclusion start with an element y in [2]. Then y ∈ [2] ⇒ y R 2 ⇒ 4 divides (y − 2) ⇒ y − 2 = 4l for some l ∈ Z ⇒ y − 6 = 4(l − 1), where l − 1 ∈ Z ⇒ 4 divides (y − 6) ⇒ y R 6 ⇒ y ∈ [6], so [2] ⊆ [6]. From the two inclusions it now follows that [6] = [2], as claimed.
    Further, we also find, for example, that [2] = [−2] = [−6], [51] = [3], and [17] = [1]. Most important, {[0], [1], [2], [3]} provides a partition of Z.
    [Note: Here the index set for the partition is implicit. If, for instance, we let A₀ = [0], A₁ = [1], A₂ = [2], and A₃ = [3], then one possible index set I (as in Definition 7.21) is {0, 1, 2, 3}. When a collection of sets is called a partition (of a given set) but no index set is specified, the reader should realize that the situation is like the one given here, where the index set is implicit.]
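
The classes of Example 7.54 can be explored numerically. The sketch below is an illustration only: because Z is infinite, it works on a finite window of integers (the range −8..15 is an arbitrary assumption), and the helper names `equivalence_class` and `congruent_mod_4` are hypothetical.

```python
def equivalence_class(x, universe, related):
    """[x] = {y in universe | y R x}, following Definition 7.22."""
    return frozenset(y for y in universe if related(y, x))


def congruent_mod_4(x, y):
    """x R y when 4 divides x - y."""
    return (x - y) % 4 == 0


window = range(-8, 16)  # finite slice of Z, chosen only for display
classes = {equivalence_class(x, window, congruent_mod_4) for x in window}

print(len(classes))  # number of distinct classes met in the window
print(sorted(equivalence_class(6, window, congruent_mod_4)))
print(sorted(equivalence_class(2, window, congruent_mod_4)))
```

Collecting the classes in a set of frozensets removes duplicates such as [6] and [2], which is the computational counterpart of the statement that distinct labels can name the same class.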

EXAMPLE 7.55   Define the relation R on the set Z by a R b if a² = b² (or, a = ±b). For all a ∈ Z, we have a² = a², so a R a and R is reflexive. Should a, b ∈ Z with a R b, then a² = b² and it follows that b² = a², or b R a. Consequently, relation R is symmetric. Finally, suppose that a, b, c ∈ Z with a R b and b R c. Then a² = b² and b² = c², so a² = c² and a R c. This makes the given relation transitive. Having established the three needed properties, we now know that R is an equivalence relation.
    What can we say about the corresponding partition of Z?

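Here one finds that [0] = {0}, [1] = [−1] = {−1, 1}, [2] = [−2] = {−2, 2}, and, in general, for each n ∈ Z⁺, [n] = [−n] = {−n, n}. Furthermore, we have the partition

$$\mathbf{Z} = \bigcup_{n=0}^{\infty} [n] = \bigcup_{n \in \mathbf{N}} [n] = \{0\} \cup \Bigl(\bigcup_{n=1}^{\infty} \{-n, n\}\Bigr) = \{0\} \cup \Bigl(\bigcup_{n \in \mathbf{Z}^{+}} \{-n, n\}\Bigr).$$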

These examples lead us to the following general situation.

THEOREM 7.6   If R is an equivalence relation on a set A, and x, y ∈ A, then (a) x ∈ [x]; (b) x R y if and only if [x] = [y]; and (c) [x] = [y] or [x] ∩ [y] = ∅.
Proof:
    a) This result follows from the reflexive property of R.
    b) The proof here is somewhat reminiscent of what was done in Example 7.54.
           If x R y, let w ∈ [x]. Then w R x and because R is transitive, w R y. Hence w ∈ [y] and [x] ⊆ [y]. With R symmetric, x R y ⇒ y R x. So if t ∈ [y], then t R y and by the transitive property, t R x. Hence t ∈ [x] and [y] ⊆ [x]. Consequently, [x] = [y].
           Conversely, let [x] = [y]. Since x ∈ [x] by part (a), then x ∈ [y] or x R y.
    c) This property tells us that two equivalence classes can be related in only one of two possible ways. Either they are identical or they are disjoint.
           We assume that [x] ≠ [y] and show how it then follows that [x] ∩ [y] = ∅. If [x] ∩ [y] ≠ ∅, then let v ∈ A with v ∈ [x] and v ∈ [y]. Then v R x, v R y, and, since R is symmetric, x R v. Now (x R v and v R y) ⇒ x R y, by the transitive property. Also x R y ⇒ [x] = [y] by part (b). This contradicts the assumption that [x] ≠ [y], so we reject the supposition that [x] ∩ [y] ≠ ∅, and the result follows.

Note that if R is an equivalence relation on A, then by parts (a) and (c) of Theorem 7.6 the distinct equivalence classes determined by R provide us with a partition of A.

EXAMPLE 7.56
    a) If A = {1, 2, 3, 4, 5} and R = {(1, 1), (2, 2), (2, 3), (3, 2), (3, 3), (4, 4), (4, 5), (5, 4), (5, 5)}, then R is an equivalence relation on A. Here [1] = {1}, [2] = {2, 3} = [3], [4] = {4, 5} = [5], and A = [1] ∪ [2] ∪ [4] with [1] ∩ [2] = ∅, [1] ∩ [4] = ∅, and [2] ∩ [4] = ∅. So {[1], [2], [4]} determines a partition of A.
    b) Consider part (d) of Example 7.16 once again. We have A = {1, 2, 3, 4, 5, 6, 7}, B = {x, y, z}, and f: A → B is the onto function

           f = {(1, x), (2, z), (3, x), (4, y), (5, z), (6, y), (7, x)}.

       The relation R defined on A by a R b if f(a) = f(b) was shown to be an equivalence relation. Here

           f⁻¹(x) = {1, 3, 7} = [1] (= [3] = [7]),
           f⁻¹(y) = {4, 6} = [4] (= [6]), and
           f⁻¹(z) = {2, 5} = [2] (= [5]).

       With A = [1] ∪ [4] ∪ [2] = f⁻¹(x) ∪ f⁻¹(y) ∪ f⁻¹(z), we see that {f⁻¹(x), f⁻¹(y), f⁻¹(z)} determines a partition of A.
           In fact, for any nonempty sets A, B, if f: A → B is an onto function, then $A = \bigcup_{b \in B} f^{-1}(b)$ and {f⁻¹(b) | b ∈ B} provides us with a partition of A.
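
Part (b) groups the elements of A by their image under f. A short Python sketch of that grouping follows; representing f as a dictionary and the helper name `fibers` are assumptions made for illustration.

```python
from collections import defaultdict


def fibers(f):
    """Group the domain of f (given as a dict) by image: b -> f^{-1}(b)."""
    inverse = defaultdict(set)
    for a, b in f.items():
        inverse[b].add(a)
    return dict(inverse)


# The onto function f: A -> B of Example 7.56(b).
f = {1: "x", 2: "z", 3: "x", 4: "y", 5: "z", 6: "y", 7: "x"}
cells = fibers(f)

for b in sorted(cells):
    print(b, sorted(cells[b]))
```

Each element of the domain is added to exactly one set, so the resulting sets are pairwise disjoint and cover A by construction.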

EXAMPLE 7.57   In the programming language C++ a nonexecutable specification statement called the union construct allows two or more variables in a given program to refer to the same memory location.
    For example, within a program the statements

    // First anonymous union: a, c, and p occupy the same storage.
    union
    {
        int a;
        int c;
        int p;
    };
    // Second anonymous union: up and down share another location.
    union
    {
        int up;
        int down;
    };

inform the C++ compiler that the integer variables a, c, and p will share one memory location while the integer variables up and down will share another. Here the set of all program variables is partitioned by the equivalence relation R, where v₁ R v₂ if v₁ and v₂ are program variables that share the same memory location.

EXAMPLE 7.58   Having seen examples of how an equivalence relation induces a partition of a set, we now go backward. If an equivalence relation R on A = {1, 2, 3, 4, 5, 6, 7} induces the partition A = {1, 2} ∪ {3} ∪ {4, 5, 7} ∪ {6}, what is R?
    Consider the cell {1, 2} of the partition. This subset implies that [1] = {1, 2} = [2], and so (1, 1), (2, 2), (1, 2), (2, 1) ∈ R. (The first two ordered pairs are necessary for the reflexive property of R; the others preserve symmetry.)
    In like manner, the cell {4, 5, 7} implies that under R, [4] = [5] = [7] = {4, 5, 7} and that, as an equivalence relation, R must contain {4, 5, 7} × {4, 5, 7}. In fact,

    R = ({1, 2} × {1, 2}) ∪ ({3} × {3}) ∪ ({4, 5, 7} × {4, 5, 7}) ∪ ({6} × {6}),

and

    |R| = 2² + 1² + 3² + 1² = 15.
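
The construction in Example 7.58 amounts to taking the union of C × C over the cells C of the partition. A minimal Python sketch, with the function name `relation_from_partition` chosen only for illustration:

```python
from itertools import product


def relation_from_partition(cells):
    """Union of C x C over all cells C: the relation 'lies in the same cell'."""
    return {pair for cell in cells for pair in product(cell, repeat=2)}


cells = [{1, 2}, {3}, {4, 5, 7}, {6}]
R = relation_from_partition(cells)

print(len(R))
print(sorted(R))
```

The size of the result is the sum of the squares of the cell sizes, in agreement with the count 2² + 1² + 3² + 1² = 15.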

The results in Examples 7.54, 7.55, 7.56, and 7.58 lead us to the following.

THEOREM 7.7   If A is a set, then
    a) any equivalence relation R on A induces a partition of A, and
    b) any partition of A gives rise to an equivalence relation R on A.
Proof: Part (a) follows from parts (a) and (c) of Theorem 7.6. For part (b), given a partition {Aᵢ}_{i∈I} of A, define relation R on A by x R y if x and y are in the same cell of the partition. We leave to the reader the details of verifying that R is an equivalence relation.

On the basis of this theorem and the examples we have examined, we state the next
                  result. A proof for it is outlined in Exercise 16 at the end of the section.

THEOREM 7.8                                For any set A, there is a one-to-one correspondence between the set of equivalence relations
                                           on A and the set of partitions of A.

We are primarily concerned with using this result for finite sets.

EXAMPLE 7.59
    a) If A = {1, 2, 3, 4, 5, 6}, how many relations on A are equivalence relations?
           We solve this problem by counting the partitions of A, realizing that a partition of A is a distribution of the (distinct) elements of A into identical containers, with no container left empty. From Section 5.3 we know, for example, that there are S(6, 2) partitions of A into two identical nonempty containers. Using the Stirling numbers of the second kind, as the number of containers varies from 1 to 6, we have $\sum_{i=1}^{6} S(6, i) = 203$ different partitions of A. Consequently, there are 203 equivalence relations on A.
    b) How many of the equivalence relations in part (a) satisfy 1, 2 ∈ [4]?
           Identifying 1, 2, and 4 as the “same” element under these equivalence relations, we count as in part (a) for the set B = {1, 3, 5, 6} and find that there are $\sum_{i=1}^{4} S(4, i) = 15$ equivalence relations on A for which [1] = [2] = [4].
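
Both counts can be reproduced with a few lines of Python. The sketch below assumes the standard recurrence S(m, n) = S(m − 1, n − 1) + n·S(m − 1, n) for Stirling numbers of the second kind; the function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(m, n):
    """S(m, n): ways to place m distinct objects into n identical nonempty containers."""
    if m == n:
        return 1  # includes S(0, 0) = 1
    if n == 0 or n > m:
        return 0
    # Object m either sits alone in a container or joins one of the n containers.
    return stirling2(m - 1, n - 1) + n * stirling2(m - 1, n)


def count_equivalence_relations(size):
    """Number of partitions (hence equivalence relations) of a set with `size` elements."""
    return sum(stirling2(size, i) for i in range(1, size + 1))


print(count_equivalence_relations(6))  # part (a)
print(count_equivalence_relations(4))  # part (b), after merging 1, 2, 4
```

Memoizing with `lru_cache` keeps the recursion cheap; for the small arguments used here a plain recursive version would also do.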

We close by noting that if A is a finite set with |A| = n, then for all n ≤ r ≤ n², there is an equivalence relation R on A with |R| = r if and only if there exist n₁, n₂, ..., nₖ ∈ Z⁺ with $\sum_{i=1}^{k} n_i = n$ and $\sum_{i=1}^{k} n_i^2 = r$.
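
This closing remark says that the possible values of |R| are exactly the sums of squares of the class sizes, taken over all ways of writing n as a sum of positive integers. A small Python sketch of that enumeration follows; the helper names are hypothetical and the choice n = 4 is only an example.

```python
def class_size_partitions(n, largest=None):
    """Yield the integer partitions of n as non-increasing tuples of class sizes."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in class_size_partitions(n - first, first):
            yield (first,) + rest


def achievable_sizes(n):
    """Map each achievable |R| to one tuple of class sizes that realizes it."""
    sizes = {}
    for parts in class_size_partitions(n):
        sizes.setdefault(sum(p * p for p in parts), parts)
    return sizes


sizes = achievable_sizes(4)
print(sorted(sizes))  # [4, 6, 8, 10, 16]
print(sizes[8])       # (2, 2)
```

For |A| = 4 the values 5, 7, 9, and 11 through 15 lie between n and n² but are not achievable, which shows that the condition on n₁, ..., nₖ is a genuine restriction.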

EXERCISES 7.4

1. Determine whether each of the following collections of sets is a partition for the given set A. If the collection is not a partition, explain why it fails to be.
      a) A = {1, 2, 3, 4, 5, 6, 7, 8}; A₁ = {4, 5, 6}, A₂ = {1, 8}, A₃ = {2, 3, 7}.
      b) A = {a, b, c, d, e, f, g, h}; A₁ = {d, e}, A₂ = {a, c, d}, A₃ = {f, h}, A₄ = {b, g}.

2. Let A = {1, 2, 3, 4, 5, 6, 7, 8}. In how many ways can we partition A as A₁ ∪ A₂ ∪ A₃ with
      a) 1, 2 ∈ A₁, 3, 4 ∈ A₂, and 5, 6, 7 ∈ A₃?
      b) 1, 2 ∈ A₁, 3, 4 ∈ A₂, 5, 6 ∈ A₃, and |A₁| = 3?
      c) 1, 2 ∈ A₁, 3, 4 ∈ A₂, and 5, 6 ∈ A₃?

3. If A = {1, 2, 3, 4, 5} and R is the equivalence relation on A that induces the partition A = {1, 2} ∪ {3, 4} ∪ {5}, what is R?

4. For A = {1, 2, 3, 4, 5, 6}, R = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (4, 4), (4, 5), (5, 4), (5, 5), (6, 6)} is an equivalence relation on A. (a) What are [1], [2], and [3] under this equivalence relation? (b) What partition of A does R induce?

5. If A = A₁ ∪ A₂ ∪ A₃, where A₁ = {1, 2}, A₂ = {2, 3, 4}, and A₃ = {5}, define relation R on A by x R y if x and y are in the same subset Aᵢ, for 1 ≤ i ≤ 3. Is R an equivalence relation?

6. For A = ℝ², define R on A by (x₁, y₁) R (x₂, y₂) if x₁ = x₂.
      a) Verify that R is an equivalence relation on A.
      b) Describe geometrically the equivalence classes and partition of A induced by R.

7. Let A = {1, 2, 3, 4, 5} × {1, 2, 3, 4, 5}, and define R on A by (x₁, y₁) R (x₂, y₂) if x₁ + y₁ = x₂ + y₂.
      a) Verify that R is an equivalence relation on A.
      b) Determine the equivalence classes [(1, 3)], [(2, 4)], and [(1, 1)].
      c) Determine the partition of A induced by R.

8. If A = {1, 2, 3, 4, 5, 6, 7}, define R on A by (x, y) ∈ R if x − y is a multiple of 3.
      a) Show that R is an equivalence relation on A.
      b) Determine the equivalence classes and partition of A induced by R.

9. For A = {(−4, −20), (−3, −9), (−2, −4), (−1, 11), (−1, −3), (1, 2), (1, 5), (2, 10), (2, 14), (3, 6), (4, 8), (4, 12)} define the relation R on A by (a, b) R (c, d) if ad = bc.
      a) Verify that R is an equivalence relation on A.
      b) Find the equivalence classes [(2, 14)], [(−3, −9)], and [(4, 8)].
      c) How many cells are there in the partition of A induced by R?

10. Let A be a nonempty set and fix the set B, where B ⊆ A. Define the relation R on 𝒫(A) by X R Y, for X, Y ⊆ A, if B ∩ X = B ∩ Y.
      a) Verify that R is an equivalence relation on 𝒫(A).
      b) If A = {1, 2, 3} and B = {1, 2}, find the partition of 𝒫(A) induced by R.
      c) If A = {1, 2, 3, 4, 5} and B = {1, 2, 3}, find [X] if X = {1, 3, 5}.
      d) For A = {1, 2, 3, 4, 5} and B = {1, 2, 3}, how many equivalence classes are in the partition induced by R?

11. How many of the equivalence relations on A = {a, b, c, d, e, f} have (a) exactly two equivalence classes of size 3? (b) exactly one equivalence class of size 3? (c) one equivalence class of size 4? (d) at least one equivalence class with three or more elements?

12. Let A = {v, w, x, y, z}. Determine the number of relations on A that are (a) reflexive and symmetric; (b) equivalence relations; (c) reflexive and symmetric but not transitive; (d) equivalence relations that determine exactly two equivalence classes; (e) equivalence relations where w ∈ [x]; (f) equivalence relations where v, w ∈ [x]; (g) equivalence relations where w ∈ [x] and y ∈ [z]; and (h) equivalence relations where w ∈ [x], y ∈ [z], and [x] ≠ [z].

13. If |A| = 30 and the equivalence relation R on A partitions A into (disjoint) equivalence classes A₁, A₂, and A₃, where |A₁| = |A₂| = |A₃|, what is |R|?

14. Let A = {1, 2, 3, 4, 5, 6, 7}. For each of the following values of r, determine an equivalence relation R on A with |R| = r, or explain why no such relation exists. (a) r = 6; (b) r = 7; (c) r = 8; (d) r = 9; (e) r = 11; (f) r = 22; (g) r = 23; (h) r = 30; (i) r = 31.

15. Provide the details for the proof of part (b) of Theorem 7.7.

16. For any set A ≠ ∅, let P(A) denote the set of all partitions of A, and let E(A) denote the set of all equivalence relations on A. Define the function f: E(A) → P(A) as follows: If R is an equivalence relation on A, then f(R) is the partition of A induced by R. Prove that f is one-to-one and onto, thus establishing Theorem 7.8.

17. Let f: A → B. If {B₁, B₂, B₃, ..., Bₙ} is a partition of B, prove that {f⁻¹(Bᵢ) | 1 ≤ i ≤ n, f⁻¹(Bᵢ) ≠ ∅} is a partition of A.

7.5   Finite State Machines: The Minimization Process
In Section 6.3 we encountered two finite state machines that performed the same task but had different numbers of internal states. (See Figs. 6.9 and 6.10.) The machine with the larger number of internal states contains redundant states, that is, states that can be eliminated because other states will perform their functions. Since minimization of the number of states in a machine reduces its complexity and cost, we seek a process for transforming a given machine into one that has no redundant internal states. This process is known as the minimization process, and its development relies on the concepts of equivalence relation and partition.
    Starting with a given finite state machine M = (S, ℐ, 𝒪, ν, ω), we define the relation E₁ on S by s₁ E₁ s₂ if ω(s₁, x) = ω(s₂, x), for all x ∈ ℐ. This relation E₁ is an equivalence relation on S, and it partitions S into subsets such that two states are in the same subset if they produce the same output for each x ∈ ℐ. Here the states s₁, s₂ are called 1-equivalent.
    For each k ∈ Z⁺, we say that the states s₁, s₂ are k-equivalent if ω(s₁, x) = ω(s₂, x) for all x ∈ ℐᵏ. Here ω is the extension of the given output function to S × ℐ*. The relation of k-equivalence is also an equivalence relation on S; it partitions S into subsets of k-equivalent states. We write s₁ Eₖ s₂ to denote that s₁ and s₂ are k-equivalent.
    Finally, if s₁, s₂ ∈ S and s₁, s₂ are k-equivalent for all k ≥ 1, then we call s₁ and s₂ equivalent and write s₁ E s₂. When this happens, we find that if we keep s₁ in our machine, then s₂ will be redundant and can be removed. Hence our objective is to determine the partition of S induced by E and to select one state for each equivalence class. Then we shall have a minimal realization of the given machine.

To accomplish this, let us start with the following observations.

    a) If two states in a machine are not 2-equivalent, could they possibly be 3-equivalent? (or k-equivalent, for k ≥ 4?)
           The answer is no. If s₁, s₂ ∈ S and s₁ E̸₂ s₂ (that is, s₁ and s₂ are not 2-equivalent), then there is at least one string xy ∈ ℐ² such that ω(s₁, xy) = v₁v₂ ≠ w₁w₂ = ω(s₂, xy), where v₁, v₂, w₁, w₂ ∈ 𝒪. So with regard to E₃, we find that s₁ E̸₃ s₂ because for any z ∈ ℐ, ω(s₁, xyz) = v₁v₂v₃ ≠ w₁w₂w₃ = ω(s₂, xyz).
           In general, to find states that are (k + 1)-equivalent, we look at states that are k-equivalent.
    b) Now suppose that s₁, s₂ ∈ S and s₁ E₂ s₂. We wish to determine whether s₁ E₃ s₂. That is, does ω(s₁, x₁x₂x₃) = ω(s₂, x₁x₂x₃) for all strings x₁x₂x₃ ∈ ℐ³? Consider what happens. First we get ω(s₁, x₁) = ω(s₂, x₁), because s₁ E₂ s₂ ⇒ s₁ E₁ s₂. Then there is a transition to the states ν(s₁, x₁) and ν(s₂, x₁). Consequently, ω(s₁, x₁x₂x₃) = ω(s₂, x₁x₂x₃) if ω(ν(s₁, x₁), x₂x₃) = ω(ν(s₂, x₁), x₂x₃) [that is, if ν(s₁, x₁) E₂ ν(s₂, x₁)].
           In general, for s₁, s₂ ∈ S, where s₁ Eₖ s₂, we find that s₁ Eₖ₊₁ s₂ if (and only if) ν(s₁, x) Eₖ ν(s₂, x) for all x ∈ ℐ.
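
The definition of k-equivalence can also be tested by brute force, comparing the output strings of two states on every input string of length k. The Python sketch below does this for the six-state machine that appears in Table 7.1 of Example 7.60; the dictionary encoding of the table is our reading of it, and the names `TABLE`, `output_string`, and `k_equivalent` are assumptions made for illustration.

```python
from itertools import product

# State table as read from Table 7.1: state -> {input: (next state, output)}.
TABLE = {
    "s1": {"0": ("s4", "0"), "1": ("s3", "1")},
    "s2": {"0": ("s5", "1"), "1": ("s2", "0")},
    "s3": {"0": ("s2", "0"), "1": ("s4", "0")},
    "s4": {"0": ("s5", "0"), "1": ("s3", "0")},
    "s5": {"0": ("s2", "1"), "1": ("s5", "0")},
    "s6": {"0": ("s1", "1"), "1": ("s6", "0")},
}


def output_string(table, state, word):
    """Extended output function: the output produced by `word` starting in `state`."""
    out = []
    for symbol in word:
        state, emitted = table[state][symbol]
        out.append(emitted)
    return "".join(out)


def k_equivalent(table, s, t, k, alphabet="01"):
    """Check the definition directly on all input strings of length k."""
    return all(
        output_string(table, s, w) == output_string(table, t, w)
        for w in product(alphabet, repeat=k)
    )


print(k_equivalent(TABLE, "s5", "s6", 1))
print(k_equivalent(TABLE, "s5", "s6", 2))
print(k_equivalent(TABLE, "s2", "s5", 3))
```

This direct check examines |ℐ|ᵏ strings, so it grows exponentially with k; observation (b) is what lets the algorithm avoid that cost by comparing successor states instead.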
                            With these observations to guide us, we now present an algorithm for the minimization of
                            a finite state machine M.

Step 1: Set k = 1. We determine the states that are 1-equivalent by examining the rows in the state table for M. For s₁, s₂ ∈ S it follows that s₁ E₁ s₂ when s₁, s₂ have the same output rows.
    Let P₁ be the partition of S induced by E₁.
Step 2: Having determined Pₖ, we obtain Pₖ₊₁ by noting that if s₁ Eₖ s₂, then s₁ Eₖ₊₁ s₂ when ν(s₁, x) Eₖ ν(s₂, x) for all x ∈ ℐ. We have s₁ Eₖ s₂ if s₁, s₂ are in the same cell of the partition Pₖ. Likewise, ν(s₁, x) Eₖ ν(s₂, x) for each x ∈ ℐ, if ν(s₁, x) and ν(s₂, x) are in the same cell of the partition Pₖ. In this way Pₖ₊₁ is obtained from Pₖ.
Step 3: If Pₖ₊₁ = Pₖ, the process is complete. We select one state from each equivalence class and these states yield a minimal realization of M.
    If Pₖ₊₁ ≠ Pₖ, we increase k by 1 and return to step (2).

We illustrate the algorithm in the following example.

EXAMPLE 7.60   With ℐ = 𝒪 = {0, 1}, let M be given by the state table shown in Table 7.1. Looking at the output rows, we see that s₃ and s₄ are 1-equivalent, as are s₂, s₅, and s₆. Here E₁ partitions S as follows:

    P₁: {s₁}, {s₂, s₅, s₆}, {s₃, s₄}.

For each s ∈ S and each k ∈ Z⁺, s Eₖ s, so as we continue this process to determine P₂, we shall not concern ourselves with equivalence classes of only one state.
    Since s₃ E₁ s₄, there is a chance that we could have s₃ E₂ s₄. Here ν(s₃, 0) = s₂, ν(s₄, 0) = s₅ with s₂ E₁ s₅, and ν(s₃, 1) = s₄, ν(s₄, 1) = s₃ with s₄ E₁ s₃. Hence ν(s₃, x) E₁ ν(s₄, x), for all x ∈ ℐ, and s₃ E₂ s₄. Similarly, ν(s₂, 0) = s₅, ν(s₅, 0) = s₂ with s₅ E₁ s₂, and ν(s₂, 1) = s₂, ν(s₅, 1) = s₅ with s₂ E₁ s₅. Thus s₂ E₂ s₅. Finally, ν(s₅, 0) = s₂ and

ν(s₆, 0) = s₁, but s₂ E̸₁ s₁, so s₅ E̸₂ s₆. (Why don’t we investigate the possibility of s₂ E₂ s₆?) Equivalence relation E₂ partitions S as follows:

    P₂: {s₁}, {s₂, s₅}, {s₃, s₄}, {s₆}.

    Since P₂ ≠ P₁, we continue the process to get P₃. In determining whether s₂ E₃ s₅, we see that ν(s₂, 0) = s₅, ν(s₅, 0) = s₂, and s₅ E₂ s₂. Also, ν(s₂, 1) = s₂, ν(s₅, 1) = s₅, and s₂ E₂ s₅. With ν(s₂, x) E₂ ν(s₅, x) for all x ∈ ℐ, we have s₂ E₃ s₅. For s₃, s₄, (ν(s₃, 0) = s₂) E₂ (s₅ = ν(s₄, 0)) and (ν(s₃, 1) = s₄) E₂ (s₃ = ν(s₄, 1)), so s₃ E₃ s₄ and E₃ induces the partition P₃: {s₁}, {s₂, s₅}, {s₃, s₄}, {s₆}.

Table 7.1

              ν            ω
           0     1      0     1
    s₁     s₄    s₃     0     1
    s₂     s₅    s₂     1     0
    s₃     s₂    s₄     0     0
    s₄     s₅    s₃     0     0
    s₅     s₂    s₅     1     0
    s₆     s₁    s₆     1     0

Table 7.2

              ν            ω
           0     1      0     1
    s₁     s₃    s₃     0     1
    s₂     s₂    s₂     1     0
    s₃     s₂    s₃     0     0
    s₆     s₁    s₆     1     0

Now P₃ = P₂ so the process is completed, as indicated in step (3) of the algorithm. We find that s₅ and s₄ may be regarded as redundant states. Removing them from the table, and replacing all further occurrences of them by s₂ and s₃, respectively, we arrive at Table 7.2. This is a minimal machine that performs the same tasks as the machine given in Table 7.1.
    If we do not want states that skip a subscript, we can always relabel the states in this minimal machine. Here we would have s₁, s₂, s₃, s₄ (= s₆), but this s₄ is not the same s₄ we started with in Table 7.1.

You may be wondering how we knew that we could stop the process when $P_3 = P_2$. For after all, couldn't it happen that perhaps $P_4 \neq P_3$, or that $P_4 = P_3$ but $P_5 \neq P_4$? To prove that this never occurs, we define the following idea.

Definition 7.23   If $P_1$, $P_2$ are partitions of a set $A$, then $P_2$ is called a refinement of $P_1$, and we write $P_2 \le P_1$, if every cell of $P_2$ is contained in a cell of $P_1$. When $P_2 \le P_1$ and $P_2 \neq P_1$ we write $P_2 < P_1$. This occurs when at least one cell in $P_2$ is properly contained in a cell in $P_1$.

In the minimization process of Example 7.60, we had $P_3 = P_2 < P_1$. Whenever we apply the algorithm, as we get $P_{k+1}$ from $P_k$, we always find that $P_{k+1} \le P_k$, because $(k + 1)$-equivalence implies $k$-equivalence. So each successive partition refines the preceding partition.
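A minimal Python sketch of the refinement test in Definition 7.23 follows. The function names `is_refinement` and `is_proper_refinement` are hypothetical, partitions are assumed to be given as lists of sets, and the sample data are the partitions $P_1$ and $P_2$ of Example 7.60 with state $s_i$ written as the integer $i$.

```python
def is_refinement(p2, p1):
    """True when every cell of p2 is contained in some cell of p1 (p2 <= p1)."""
    return all(any(cell2 <= cell1 for cell1 in p1) for cell2 in p2)


def is_proper_refinement(p2, p1):
    """True when p2 <= p1 and the two partitions differ (p2 < p1)."""
    same = {frozenset(c) for c in p2} == {frozenset(c) for c in p1}
    return is_refinement(p2, p1) and not same


# Partitions from Example 7.60 (assumed encoding: s_i -> i).
P1 = [{1}, {2, 5, 6}, {3, 4}]
P2 = [{1}, {2, 5}, {3, 4}, {6}]

print(is_refinement(P2, P1))         # P2 refines P1
print(is_refinement(P1, P2))         # P1 does not refine P2
print(is_proper_refinement(P2, P1))  # the refinement is proper
```

Here the cell $\{s_2, s_5, s_6\}$ of $P_1$ is split in $P_2$, which is why the test succeeds in one direction only.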

THEOREM 7.9   In applying the minimization process, if $k \ge 1$ and $P_k$ and $P_{k+1}$ are partitions with $P_{k+1} = P_k$, then $P_{r+1} = P_r$ for all $r \ge k + 1$.
Proof: If not, let $r$ $(\ge k + 1)$ be the smallest subscript such that $P_{r+1} \neq P_r$. Then $P_{r+1} < P_r$, so there exist $s_1, s_2 \in S$ with $s_1\,E_r\,s_2$ but $s_1 \not E_{r+1}\, s_2$. But $s_1\,E_r\,s_2 \Rightarrow \nu(s_1, x)\,E_{r-1}\,\nu(s_2, x)$, for all $x \in \mathcal{I}$, and with $P_r = P_{r-1}$, we then find that $\nu(s_1, x)\,E_r\,\nu(s_2, x)$, for all $x \in \mathcal{I}$, so $s_1\,E_{r+1}\,s_2$. Consequently, $P_{r+1} = P_r$.

We close this section with the following related idea. Let $M$ be a finite state machine with $s_1, s_2 \in S$, and $s_1$, $s_2$ not equivalent. If $s_1 \not E_1\, s_2$, then these states produce different output rows in the state table for $M$. In this case it is easy to find an $x \in \mathcal{I}$ such that $\omega(s_1, x) \neq \omega(s_2, x)$, and this distinguishes these nonequivalent states. Otherwise, $s_1$ and $s_2$ produce the same output rows in the table but there is a smallest integer $k \ge 1$ such that $s_1\,E_k\,s_2$ but $s_1 \not E_{k+1}\, s_2$. Now if we are to distinguish these states, we need to find a string $x = x_1 x_2 \cdots x_k x_{k+1} \in \mathcal{I}^{k+1}$ such that $\omega(s_1, x) \neq \omega(s_2, x)$, even though $\omega(s_1, x_1 x_2 \cdots x_k) = \omega(s_2, x_1 x_2 \cdots x_k)$. Such a string $x$ is called a distinguishing string for the states $s_1$ and $s_2$. There may be more than one such string, but each has the same (minimal) length $k + 1$.
Before we try to find a distinguishing string for two nonequivalent states in a specific finite state machine, let us examine the major idea at play here. So suppose that $s_1, s_2 \in S$ and that for some (fixed) $k \in \mathbf{Z}^+$ we have $s_1\,E_k\,s_2$ but $s_1 \not E_{k+1}\, s_2$. What can we conclude?
We find that

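\[
\begin{aligned}
s_1 \not E_{k+1}\, s_2 &\Rightarrow \exists x_1 \in \mathcal{I}\ [\nu(s_1, x_1) \not E_k\, \nu(s_2, x_1)] \\
&\Rightarrow \exists x_1 \in \mathcal{I}\ \exists x_2 \in \mathcal{I}\ [\nu(\nu(s_1, x_1), x_2) \not E_{k-1}\, \nu(\nu(s_2, x_1), x_2)], \\
&\quad\ \text{or}\ \exists x_1 \in \mathcal{I}\ \exists x_2 \in \mathcal{I}\ [\nu(s_1, x_1 x_2) \not E_{k-1}\, \nu(s_2, x_1 x_2)] \\
&\Rightarrow \exists x_1, x_2, x_3 \in \mathcal{I}\ [\nu(s_1, x_1 x_2 x_3) \not E_{k-2}\, \nu(s_2, x_1 x_2 x_3)] \\
&\ \ \vdots \\
&\Rightarrow \exists x_1, x_2, \ldots, x_i \in \mathcal{I}\ [\nu(s_1, x_1 x_2 \cdots x_i) \not E_{k+1-i}\, \nu(s_2, x_1 x_2 \cdots x_i)] \\
&\ \ \vdots \\
&\Rightarrow \exists x_1, x_2, \ldots, x_k \in \mathcal{I}\ [\nu(s_1, x_1 x_2 \cdots x_k) \not E_1\, \nu(s_2, x_1 x_2 \cdots x_k)].
\end{aligned}
\]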

This last statement about the states $\nu(s_1, x_1 x_2 \cdots x_k)$, $\nu(s_2, x_1 x_2 \cdots x_k)$ not being 1-equivalent implies that we can find $x_{k+1} \in \mathcal{I}$ where

\[
\omega(\nu(s_1, x_1 x_2 \cdots x_k), x_{k+1}) \neq \omega(\nu(s_2, x_1 x_2 \cdots x_k), x_{k+1}). \tag{1}
\]

That is, these single output symbols from $\mathcal{O}$ are different.
The result denoted by Eq. (1) also implies that

\[
\omega(s_1, x) = \omega(s_1, x_1 x_2 \cdots x_k x_{k+1}) \neq \omega(s_2, x_1 x_2 \cdots x_k x_{k+1}) = \omega(s_2, x).
\]

In this case we have two output strings of length $k + 1$ that agree for the first $k$ symbols and differ in the $(k + 1)$st symbol.

We shall use the preceding observations, together with the partitions $P_1, P_2, \ldots, P_k, P_{k+1}$ of the minimization process, in order to deal with the following example.

EXAMPLE 7.61   From Example 7.60 we have the partitions shown below. Here $s_2\,E_1\,s_6$, but $s_2 \not E_2\, s_6$. So we seek an input string $x$ of length 2 such that $\omega(s_2, x) \neq \omega(s_6, x)$.

1) We start at $P_2$, where for $s_2$, $s_6$, we find that $\nu(s_2, 0) = s_5$ and $\nu(s_6, 0) = s_1$ are in different cells of $P_1$; that is,

\[
s_5 = \nu(s_2, 0) \not E_1\, \nu(s_6, 0) = s_1.
\]

[The input 0 and output 1 (for $\omega(s_2, 0) = 1 = \omega(s_6, 0)$) provide the labels for the arrows going from the cells of $P_2$ to those of $P_1$.]

$P_2$: $\{s_1\}$, $\{s_2, s_5\}$, $\{s_3, s_4\}$, $\{s_6\}$
$P_1$: $\{s_1\}$, $\{s_2, s_5, s_6\}$, $\{s_3, s_4\}$
[Arrows labeled 0, 1 lead from $s_2$ in $P_2$ to $s_5$ in $P_1$ and from $s_6$ in $P_2$ to $s_1$ in $P_1$.]

2) Working with $s_1$ and $s_5$ in the partition $P_1$ we see that

\[
\omega(\nu(s_2, 0), 0) = \omega(s_5, 0) = 1 \neq 0 = \omega(s_1, 0) = \omega(\nu(s_6, 0), 0).
\]

3) Hence $x = 00$ is a minimal distinguishing string for $s_2$ and $s_6$ because $\omega(s_2, 00) = 11 \neq 10 = \omega(s_6, 00)$.
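The search for a minimal distinguishing string can also be sketched as a breadth-first search over pairs of states. This swaps in a technique different from the partition-based walk used in the example, and the names `NEXT`, `OUT`, `run`, and `distinguishing_string` are hypothetical; the data are the rows of Table 7.1 with state $s_i$ written as the integer $i$ and inputs written as the characters `0` and `1`.

```python
from collections import deque

# State table of Table 7.1 (assumed encoding: s_i -> i).
NEXT = {1: (4, 3), 2: (5, 2), 3: (2, 4), 4: (5, 3), 5: (2, 5), 6: (1, 6)}
OUT = {1: (0, 1), 2: (1, 0), 3: (0, 0), 4: (0, 0), 5: (1, 0), 6: (1, 0)}


def run(state, inputs):
    """Output string produced when the machine starts in `state` and reads `inputs`."""
    result = []
    for ch in inputs:
        x = int(ch)
        result.append(str(OUT[state][x]))
        state = NEXT[state][x]
    return "".join(result)


def distinguishing_string(s, t):
    """Shortest input string on which s and t give different outputs, or None."""
    queue = deque([(s, t, "")])
    seen = {(s, t)}
    while queue:
        a, b, prefix = queue.popleft()
        # A single differing output symbol ends the search.
        for x in (0, 1):
            if OUT[a][x] != OUT[b][x]:
                return prefix + str(x)
        # Otherwise follow both states one step further on each input.
        for x in (0, 1):
            pair = (NEXT[a][x], NEXT[b][x])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair + (prefix + str(x),))
    return None  # the two states are equivalent


x = distinguishing_string(2, 6)
print(x, run(2, x), run(6, x))
print(distinguishing_string(2, 5))
```

Because pairs are visited in order of prefix length, the first string returned has minimal length; for equivalent states such as $s_2$ and $s_5$ the search runs out of new pairs and returns `None`.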

EXAMPLE 7.62   Applying the minimization process to the machine given by the state table in part (a) of Table 7.3, we obtain the partitions in part (b) of the table. (Here $P_4 = P_3$.) We find that the states $s_1$ and $s_4$ are 2-equivalent but not 3-equivalent. To construct a minimal distinguishing string for these two states, we proceed as follows:

1) Since $s_1 \not E_3\, s_4$, we use partitions $P_3$ and $P_2$ to find $x_1 \in \mathcal{I}$ (namely, $x_1 = 1$) so that

\[
(\nu(s_1, 1) = s_2) \not E_2\, (s_5 = \nu(s_4, 1)).
\]

2) Then $\nu(s_1, 1) \not E_2\, \nu(s_4, 1) \Rightarrow \exists x_2 \in \mathcal{I}$ (here $x_2 = 1$) with $\nu(\nu(s_1, 1), 1) \not E_1\, \nu(\nu(s_4, 1), 1)$, or $\nu(s_1, 11) \not E_1\, \nu(s_4, 11)$. We used the partitions $P_2$ and $P_1$ to obtain $x_2 = 1$.

3) Now we use the partition $P_1$ where we find that for $x_3 = 1 \in \mathcal{I}$,

\[
\omega(\nu(s_1, 11), 1) = 0 \neq 1 = \omega(\nu(s_4, 11), 1) \quad\text{or}\quad \omega(s_1, 111) = 100 \neq 101 = \omega(s_4, 111).
\]

In part (b) of Table 7.3, we see how we arrived at the minimal distinguishing string $x = 111$ for these states. (Also note how this part of the table indicates that 11 is a minimal distinguishing string for the states $s_2$ and $s_5$, which are 1-equivalent but not 2-equivalent.)

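Table 7.3

(a)
            ν           ω
          0    1      0    1
    s1    s4   s2     ?    1
    s2    s5   s2     ?    0
    s3    s4   s2     ?    1
    s4    s3   s5     ?    1
    s5    s2   s3     ?    0

[The ω entries for input 0 are not legible in this copy; the ω entries for input 1 agree with the output strings computed in Example 7.62.]

(b)
    P3: {s1, s3}, {s4}, {s2}, {s5}
    P2: {s1, s3, s4}, {s2}, {s5}
    P1: {s1, s3, s4}, {s2, s5}

[Arrows labeled with the input 1 lead from s1 and s4 in P3 to s2 and s5 in P2, and from s2 and s5 in P2 to s2 and s3 in P1; below P1 the labels 1, 1 and 1, 0 give the input and output for s3 and s2.]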

A great deal more can be done with finite state machines. Among other omissions, we
                                             have avoided offering any rigorous explanation or proof of why the minimization process
                                             works. The interested reader should consult the chapter references for more on this topic.

EXERCISES 7.5

1. Apply the minimization process to each machine in Table 7.4.
2. For the machine in Table 7.4(c), find a (minimal) distinguishing string for each given pair of states: (a) $s_1$, $s_5$; (b) $s_2$, $s_3$; (c) $s_5$, $s_7$.
3. Let $M$ be the finite state machine given in the state diagram shown in Fig. 7.26.
    a) Minimize machine $M$.
    b) Find a (minimal) distinguishing string for each given pair of states: (i) $s_3$, $s_6$; (ii) $s_3$, $s_4$; and (iii) $s_1$, $s_5$.

Table 7.4

               ν              ω
             0        1    0           1

Sy     S4      Ss]   0           1
      s. | 53        83}    1          O
      s3     |S;     Sa}    1          O
      54     S|      SZ    0           1
      S5     S3      S3     1          0

(a)

@

0        ]     0          1

s} | Se        83 | 0   O
      $2 | S5        84   |Q   1
      $3 | 8         So]   1]
      S54    54      53     1          0
      $5}    82      Sg     |O         1
      S56    S4      S56    0          0
                                                                                          Figure 7.26
      (b)

@

0       1     QO         1

Ss, | S        83 | 0            O
      S52    53      S|     0          O
      $3 |    8S     S,|O              O
      S4     S57     S4     0          0
      Ss | S56       S7 | 0            O
      S56    S55     S52    1          0
      S7     S4      S|     0          0

(9)

7.6 Summary and Historical Review
                                             Once again the relation concept surfaces. In Chapter 5 this idea was introduced as a gen-
                                             eralization of the function. Here in Chapter 7 we concentrated on relations and the special
                                            properties: reflexive, symmetric, antisymmetric, and transitive. As a result we focused on
                                             two special kinds of relations: partial orders and equivalence relations.

A relation $\mathcal{R}$ on a set A is a partial order, making A into a poset, if $\mathcal{R}$ is reflexive,
antisymmetric, and transitive. Such a relation generalizes the familiar “less than or equal
to” relation on the real numbers. Try to imagine calculus, or even elementary algebra,
without it! Or take a simple computer program and see what happens if the program is
entered into the computer haphazardly, permuting the order of the statements. Order is
with us wherever we turn. We have grown so accustomed to it that we sometimes take it
for granted. The origins of the subject of partially ordered sets (and lattices) came about
during the nineteenth century in the work of George Boole (1815-1864), Richard Dedekind
(1831-1916), Charles Sanders Peirce (1839-1914), and Ernst Schröder (1841-1902). The
work of Garrett Birkhoff (1911-1996) in the 1930s, however, is where the initial work on
partially ordered sets and lattices was developed to the point where these areas emerged as
subjects in their own right.
    For a finite poset, the Hasse diagram, a special type of directed graph, provides a pictorial
representation of the order defined by the poset; it also proves useful when a total order,
including the given partial order, is needed. These diagrams are named for the German
number theorist Helmut Hasse (1898-1979). He introduced them in his textbook Höhere
Algebra (published in 1926) as an aid in the study of the solutions of polynomial equations.
The method we employed to derive a total order from a partial order is called topological
sorting and it is used in the solution of PERT (Program Evaluation and Review Technique)
networks. As mentioned earlier, this method was developed and first used by the U.S. Navy.
    Although the equivalence relation differs from the partial order in only one property,
it is quite different in structure and application. We make no attempt to trace the origin
of the equivalence relation, but the ideas behind the reflexive, symmetric, and transitive
properties can be found in I Principii di Geometria (1889), the work of the Italian mathematician
Giuseppe Peano (1858-1932). The work of Carl Friedrich Gauss (1777-1855) on
congruence, which he developed in the 1790s, also utilizes these ideas in spirit, if not in
name.

Giuseppe Peano (1858-1932)                        Carl Friedrich Gauss (1777-1855)

Basically, an equivalence relation $\mathcal{R}$ on a set A generalizes equality; it induces a characteristic
of “sameness” among the elements of A. This “sameness” notion then causes the
set A to be partitioned into subsets called equivalence classes. Conversely, we find that a
partition of a set A induces an equivalence relation on A. The partition of a set arises in
many places in mathematics and computer science. In computer science many searching
algorithms rely on a technique that successively reduces the size of a given set A that is
                                  being searched. By partitioning A into smaller and smaller subsets, we apply the searching
                                  procedure in a more efficient manner. Each successive partition refines its predecessor, the
                                  key needed, for example, in the minimization process for finite state machines.
                                      Throughout the chapter we emphasized the interplay between relations, directed graphs,
                                  and (0, 1)-matrices. These matrices provide a rectangular array of information about a
                                  relation, or graph, and prove useful in certain calculations. Storing information like this, in
                                  rectangular arrays and in consecutive memory locations, has been practiced in computer
                                  science since the late 1940s and early 1950s. For more on the historical background of such
                                  considerations, consult pages 456-462 of D. E. Knuth [3]. Another way to store information
                                  about a graph is the adjacency list representation. (See Supplementary Exercise 11.) In the
                                  study of data structures, linked lists and doubly linked lists are prominent in implementing
                                   such a representation. For more on this, consult the text by A. V. Aho, J. E. Hopcroft, and
                                  J.D. Ullman [1].
                                     With regard to graph theory, we are in an area of mathematics that dates back to 1736
                                  when the Swiss mathematician Leonhard Euler (1707-1783) solved the problem of the
                                  seven bridges of Königsberg. Since then, much more has evolved in this area, especially in
                                  conjunction with data structures in computer science.
                                     For similar coverage of some of the topics in this chapter, see Chapter 3 of D. F. Stanat
                                  and D. F. McAllister [6]. An interesting presentation of the “Equivalence Problem” can be
                                  found on pages 353-355 of D. E. Knuth [3] for those wanting more information on the role
                                   of the computer in conjunction with the concept of the equivalence relation.
                                       The early work on the development of the minimization process can be found in the
                                   paper by E. F. Moore [5], which builds upon prior ideas of D. A. Huffman [2]. Chapter 10
                                   of Z. Kohavi [4] covers the minimization process for different types of finite state machines
                                   and includes some hardware considerations in their design.

REFERENCES
1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms. Reading, Mass.: Addison-Wesley, 1983.
2. Huffman, David A. “The Synthesis of Sequential Switching Circuits.” Journal of the Franklin Institute 257, no. 3: pp. 161-190; no. 4: pp. 275-303, 1954.
3. Knuth, Donald E. The Art of Computer Programming, 2nd ed., Volume 1, Fundamental Algorithms. Reading, Mass.: Addison-Wesley, 1973.
4. Kohavi, Zvi. Switching and Finite Automata Theory, 2nd ed. New York: McGraw-Hill, 1978.
5. Moore, E. F. “Gedanken-experiments on Sequential Machines.” Automata Studies, Annals of Mathematics Studies, no. 34: pp. 129-153. Princeton, N.J.: Princeton University Press, 1956.
6. Stanat, Donald F., and McAllister, David F. Discrete Mathematics in Computer Science. Englewood Cliffs, N.J.: Prentice-Hall, 1977.

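SUPPLEMENTARY EXERCISES

1. Let $A$ be a set and $I$ an index set where, for each $i \in I$, $\mathcal{R}_i$ is a relation on $A$. Prove or disprove each of the following.
    a) $\bigcup_{i \in I} \mathcal{R}_i$ is reflexive on $A$ if and only if each $\mathcal{R}_i$ is reflexive on $A$.
    b) $\bigcap_{i \in I} \mathcal{R}_i$ is reflexive on $A$ if and only if each $\mathcal{R}_i$ is reflexive on $A$.
2. Repeat Exercise 1 with “reflexive” replaced by (i) symmetric; (ii) antisymmetric; (iii) transitive.
3. For a set $A$, let $\mathcal{R}_1$ and $\mathcal{R}_2$ be symmetric relations on $A$. If $\mathcal{R}_1 \circ \mathcal{R}_2 \subseteq \mathcal{R}_2 \circ \mathcal{R}_1$, prove that $\mathcal{R}_1 \circ \mathcal{R}_2 = \mathcal{R}_2 \circ \mathcal{R}_1$.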

4. For each of the following relations on the set specified, determine whether the relation is reflexive, symmetric, antisymmetric, or transitive. Also determine whether it is a partial order or an equivalence relation, and, if the latter, describe the partition induced by the relation.
    a) $\mathcal{R}$ is the relation on $\mathbf{Q}$ where $a\,\mathcal{R}\,b$ if $|a - b| < 1$.
    b) Let $T$ be the set of all triangles in the plane. For $t_1, t_2 \in T$, define $t_1\,\mathcal{R}\,t_2$ if $t_1$, $t_2$ have the same area.
    c) For $T$ as in part (b), define $\mathcal{R}$ by $t_1\,\mathcal{R}\,t_2$ if at least two sides of $t_1$ are contained within the perimeter of $t_2$.
    d) Let $A = \{1, 2, 3, 4, 5, 6, 7\}$. Define $\mathcal{R}$ on $A$ by $x\,\mathcal{R}\,y$ if $xy > 10$.
5. For sets $A$, $B$, and $C$ with relations $\mathcal{R}_1 \subseteq A \times B$ and $\mathcal{R}_2 \subseteq B \times C$, prove or disprove that $(\mathcal{R}_1 \circ \mathcal{R}_2)^c = \mathcal{R}_2^c \circ \mathcal{R}_1^c$.
6. For a set $A$, let $C = \{P_i \mid P_i \text{ is a partition of } A\}$. Define relation $\mathcal{R}$ on $C$ by $P_i\,\mathcal{R}\,P_j$ if $P_i \le P_j$; that is, $P_i$ is a refinement of $P_j$.
    a) Verify that $\mathcal{R}$ is a partial order on $C$.
    b) For $A = \{1, 2, 3, 4, 5\}$, let $P_i$, $1 \le i \le 4$, be the following partitions: $P_1$: $\{1, 2\}$, $\{3, 4, 5\}$; $P_2$: $\{1, 2\}$, $\{3, 4\}$, $\{5\}$; $P_3$: $\{1\}$, $\{2\}$, $\{3, 4, 5\}$; $P_4$: $\{1, 2\}$, $\{3\}$, $\{4\}$, $\{5\}$. Draw the Hasse diagram for $C = \{P_i \mid 1 \le i \le 4\}$, where $C$ is partially ordered by refinement.
7. Give an example of a poset with 5 minimal (maximal) elements but no least (greatest) element.
8. Let $A = \{1, 2, 3, 4, 5, 6\} \times \{1, 2, 3, 4, 5, 6\}$. Define $\mathcal{R}$ on $A$ by $(x_1, y_1)\,\mathcal{R}\,(x_2, y_2)$, if $x_1 y_1 = x_2 y_2$.
    a) Verify that $\mathcal{R}$ is an equivalence relation on $A$.
    b) Determine the equivalence classes $[(1, 1)]$, $[(2, 2)]$, $[(3, 2)]$, and $[(4, 3)]$.
9. If the complete graph $K_n$ has 45 edges, what is $n$?
10. Let $\mathcal{F} = \{f\colon \mathbf{Z}^+ \to \mathbf{R}\}$; that is, $\mathcal{F}$ is the set of all functions with domain $\mathbf{Z}^+$ and codomain $\mathbf{R}$.
    a) Define the relation $\mathcal{R}$ on $\mathcal{F}$ by $g\,\mathcal{R}\,h$, for $g, h \in \mathcal{F}$, if $g$ is dominated by $h$ and $h$ is dominated by $g$; that is, $g \in \Theta(h)$. (See Exercises 14, 15 for Section 5.7.) Prove that $\mathcal{R}$ is an equivalence relation on $\mathcal{F}$.
    b) For $f \in \mathcal{F}$, let $[f]$ denote the equivalence class of $f$ for the relation $\mathcal{R}$ of part (a). Let $\mathcal{F}'$ be the set of equivalence classes induced by $\mathcal{R}$. Define the relation $\mathcal{S}$ on $\mathcal{F}'$ by $[g]\,\mathcal{S}\,[h]$, for $[g], [h] \in \mathcal{F}'$, if $g$ is dominated by $h$. Verify that $\mathcal{S}$ is a partial order.
    c) For $\mathcal{R}$ in part (a), let $f, f_1, f_2 \in \mathcal{F}$ with $f_1, f_2 \in [f]$. If $f_1 + f_2\colon \mathbf{Z}^+ \to \mathbf{R}$ is defined by $(f_1 + f_2)(n) = f_1(n) + f_2(n)$, for $n \in \mathbf{Z}^+$, prove or disprove that $f_1 + f_2 \in [f]$.
11. We have seen that the adjacency matrix can be used to represent a graph. However, this method proves to be rather inefficient when there are many 0's (that is, few edges) present. A better method uses the adjacency list representation, which is made up of an adjacency list for each vertex $v$ and an index list. For the graph shown in Fig. 7.27, the representation is given by the two lists in Table 7.5.

Figure 7.27

Table 7.5

    Adjacency List        Index List
      1     1               1     1
      2     2               2     4
      3     3               3     5
      4     6               4     7
      5     1               5     9
      6     6               6     9
      7     3               7    11
      8     5               8    11
      9     2
     10     7

For each vertex $v$ in the graph, we list, preferably in numerical order, each vertex $w$ that is adjacent from $v$. Hence for 1, we list 1, 2, 3 as the first three adjacencies in our adjacency list. Next to 2 in the index list we place a 4, which tells us where to start looking in the adjacency list for the adjacencies from 2. Since there is a 5 to the right of 3 in the index list, we know that the only adjacency from 2 is 6. Likewise, the 7 to the right of 4 in the index list directs us to the seventh entry in the adjacency list (namely, 3), and we find that vertex 4 is adjacent to vertices 3 (the seventh vertex in the adjacency list) and 5 (the eighth vertex in the adjacency list). We stop at vertex 5 because of the 9 to the right of vertex 5 in the index list. The 9's in the index list next to 5 and 6 indicate that no vertex is adjacent from vertex 5. In a similar way, the 11's next to 7 and 8 in the index list tell us that vertex 7 is not adjacent to any vertex in the given directed graph.
In general, this method provides an easy way to determine the vertices adjacent from a vertex $v$. They are listed in the positions index($v$), index($v$) + 1, ..., index($v$ + 1) − 1 of the adjacency list.
Finally, the last pair of entries in the index list (namely, 8 and 11) is a “phantom” that indicates where the adjacency list would pick up from if there were an eighth vertex in the graph.
Represent each of the graphs in Fig. 7.28 in this manner.
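The adjacency list and index list of Table 7.5 can be sketched in Python as two flat lists. The names `ADJACENCY`, `INDEX`, `adjacent_from`, and `build` are hypothetical, and the sketch assumes vertices are numbered 1, 2, ..., n, with the text's 1-based positions converted to Python's 0-based slices.

```python
# Table 7.5 as two flat lists (positions in the text are 1-based).
ADJACENCY = [1, 2, 3, 6, 1, 6, 3, 5, 2, 7]
INDEX = [1, 4, 5, 7, 9, 9, 11, 11]  # entries for vertices 1..7, then the phantom 8


def adjacent_from(v, adjacency=ADJACENCY, index=INDEX):
    """Vertices adjacent from v: positions index(v), ..., index(v + 1) - 1."""
    start, stop = index[v - 1], index[v]
    return adjacency[start - 1:stop - 1]


def build(n, edges):
    """Build (adjacency list, index list) for vertices 1..n from directed edges."""
    adjacency, index = [], []
    for v in range(1, n + 1):
        index.append(len(adjacency) + 1)  # where the adjacencies from v begin
        adjacency.extend(sorted(w for (u, w) in edges if u == v))
    index.append(len(adjacency) + 1)      # phantom entry for vertex n + 1
    return adjacency, index


for v in range(1, 8):
    print(v, adjacent_from(v))
```

An empty slice, as for vertices 5 and 7, corresponds to two equal consecutive entries in the index list.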

(a)

Figure 7.28

12. The adjacency list representation of a directed graph $G$ is given by the lists in Table 7.6. Construct $G$ from this representation.

Table 7.6

    Adjacency List        Index List
      1     2               1     1
      2     3               2     4
      3     6               3     5
      4     3               4     5
      5     3               5     8
      6     4               6    10
      7     5               7    10
      8     3               8    10
      9     6

13. Let $G$ be an undirected graph with vertex set $V$. Define the relation $\mathcal{R}$ on $V$ by $v\,\mathcal{R}\,w$ if $v = w$ or if there is a path from $v$ to $w$ (or from $w$ to $v$ since $G$ is undirected). (a) Prove that $\mathcal{R}$ is an equivalence relation on $V$. (b) What can we say about the associated partition?
14. a) For the finite state machine given in Table 7.7, determine a minimal machine that is equivalent to it.
    b) Find a minimal string that distinguishes states $s_4$ and $s_6$.

Table 7.7

            ν           ω
          0    1      0    1
    s1    s7   s6     1    0
    s2    s7   s7     0    0
    s3    s7   s2     1    0
    s4    s2   s3     0    0
    s5    s3   s7     0    0
    s6    s4   s1     0    0
    s7    s3   s5     1    0
    s8    s7   s3     0    0

15. At the computer center Maria is faced with running 10 computer programs which, because of priorities, are restricted by the following conditions: (a) 10 > 8, 3; (b) 8 > 7; (c) 7 > 5; (d) 3 > 9, 6; (e) 6 > 4, 1; (f) 9 > 4, 5; (g) 4, 5, 1 > 2; where, for example, 10 > 8, 3 means that program number 10 must be run before programs 8 and 3. Determine an order for running these programs so that the priorities are satisfied.
16. a) Draw the Hasse diagram for the set of positive integer divisors of (i) 2; (ii) 4; (iii) 6; (iv) 8; (v) 12; (vi) 16; (vii) 24; (viii) 30; (ix) 32.
    b) For all $2 \le n \le 35$, show that the Hasse diagram for the set of positive-integer divisors of $n$ looks like one of the nine diagrams in part (a). (Ignore the numbers at the vertices and concentrate on the structure given by the vertices and edges.) What happens for $n = 36$?
    c) For $n \in \mathbf{Z}^+$, $\tau(n)$ = the number of positive-integer divisors of $n$. (See Supplementary Exercise 32 in Chapter 5.) Let $m, n \in \mathbf{Z}^+$ and $S$, $T$ be the sets of all positive-integer divisors of $m$, $n$, respectively. The results of parts (a) and (b) imply that if the Hasse diagrams of $S$, $T$ are structurally the same, then $\tau(m) = \tau(n)$. But is the converse true?
    d) Show that each Hasse diagram in part (a) is a lattice if we define glb$\{x, y\}$ = gcd$(x, y)$ and lub$\{x, y\}$ = lcm$(x, y)$.
17. Let $U$ denote the set of all points in and on the unit square shown in Fig. 7.29. That is, $U = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$. Define the relation $\mathcal{R}$ on $U$ by $(a, b)\,\mathcal{R}\,(c, d)$ if (1) $(a, b) = (c, d)$, or (2) $b = d$ and $a = 0$ and $c = 1$, or (3) $b = d$ and $a = 1$ and $c = 0$.
    a) Verify that $\mathcal{R}$ is an equivalence relation on $U$.

[Figure 7.29: the unit square with corners (0, 0), (1, 0), (1, 1), and (0, 1).]

    b) List the ordered pairs in the equivalence classes $[(0.3, 0.7)]$, $[(0.5, 0)]$, $[(0.4, 1)]$, $[(0, 0.6)]$, $[(1, 0.2)]$. For $0 \le a \le 1$, $0 \le b \le 1$, how many ordered pairs are in $[(a, b)]$?
    c) If we "glue together" the ordered pairs in each equivalence class, what type of surface comes about?

18. a) For $U = \{1, 2, 3\}$, let $A = \mathcal{P}(U)$. Define the relation $\mathcal{R}$ on $A$ by $B \,\mathcal{R}\, C$ if $B \subseteq C$. How many ordered pairs are there in the relation $\mathcal{R}$?
    b) Answer part (a) for $U = \{1, 2, 3, 4\}$.
    c) Generalize the results of parts (a) and (b).

19. For $n \in \mathbf{Z}^+$, let $U = \{1, 2, 3, \ldots, n\}$. Define the relation $\mathcal{R}$ on $\mathcal{P}(U)$ by $A \,\mathcal{R}\, B$ if $A \not\subseteq B$ and $B \not\subseteq A$. How many ordered pairs are there in this relation?

20. Let $A$ be a finite nonempty set with $B \subseteq A$ ($B$ fixed), and $|A| = n$, $|B| = m$. Define the relation $\mathcal{R}$ on $\mathcal{P}(A)$ by $X \,\mathcal{R}\, Y$, for $X, Y \subseteq A$, if $X \cap B = Y \cap B$. Then $\mathcal{R}$ is an equivalence relation, as verified in Exercise 10 of Section 7.4. (a) How many equivalence classes are in the partition of $\mathcal{P}(A)$ induced by $\mathcal{R}$? (b) How many subsets of $A$ are in each equivalence class of the partition induced by $\mathcal{R}$?

21. For $A \ne \emptyset$, let $(A, \mathcal{R})$ be a poset, and let $\emptyset \ne B \subseteq A$ such that $\mathcal{R}' = (B \times B) \cap \mathcal{R}$. If $(B, \mathcal{R}')$ is totally ordered, we call $(B, \mathcal{R}')$ a chain in $(A, \mathcal{R})$. In the case where $B$ is finite, we may order the elements of $B$ by $b_1 \,\mathcal{R}'\, b_2 \,\mathcal{R}'\, b_3 \,\mathcal{R}' \cdots \mathcal{R}'\, b_{n-1} \,\mathcal{R}'\, b_n$ and say that the chain has length $n$. A chain (of length $n$) is called maximal if there is no element $a \in A$ where $a \notin \{b_1, b_2, b_3, \ldots, b_n\}$ and $a \,\mathcal{R}\, b_1$, $b_n \,\mathcal{R}\, a$, or $b_i \,\mathcal{R}\, a \,\mathcal{R}\, b_{i+1}$, for some $1 \le i \le n - 1$.
    a) Find two chains of length 3 for the poset given by the Hasse diagram in Fig. 7.20. Find a maximal chain for this poset. How many such maximal chains does it have?
    b) For the poset given by the Hasse diagram in Fig. 7.18(d), find two maximal chains of different lengths. What is the length of a longest (maximal) chain for this poset?
    c) Let $U = \{1, 2, 3, 4\}$ and $A = \mathcal{P}(U)$. For the poset $(A, \subseteq)$, find two maximal chains. How many such maximal chains are there for this poset?
    d) If $U = \{1, 2, 3, \ldots, n\}$, how many maximal chains are there in the poset $(\mathcal{P}(U), \subseteq)$?

22. For $\emptyset \ne C \subseteq A$, let $(C, \mathcal{R}')$ be a maximal chain in the poset $(A, \mathcal{R})$, where $\mathcal{R}' = (C \times C) \cap \mathcal{R}$. If the elements of $C$ are ordered as $c_1 \,\mathcal{R}'\, c_2 \,\mathcal{R}' \cdots \mathcal{R}'\, c_n$, prove that $c_1$ is a minimal element in $(A, \mathcal{R})$ and that $c_n$ is maximal in $(A, \mathcal{R})$.

23. Let $(A, \mathcal{R})$ be a poset in which the length of a longest (maximal) chain is $n \ge 2$. Let $M$ be the set of all maximal elements in $(A, \mathcal{R})$, and let $B = A - M$. If $\mathcal{R}' = (B \times B) \cap \mathcal{R}$, prove that the length of a longest chain in $(B, \mathcal{R}')$ is $n - 1$.

24. Let $(A, \mathcal{R})$ be a poset, and let $\emptyset \ne C \subseteq A$. If $(C \times C) \cap \mathcal{R} = \{(c, c) \mid c \in C\}$, then for all distinct $x, y \in C$ we have $x \not\mathrel{\mathcal{R}} y$ and $y \not\mathrel{\mathcal{R}} x$. The elements of $C$ are said to form an antichain in the poset $(A, \mathcal{R})$.
    a) Find an antichain with three elements for the poset given in the Hasse diagram of Fig. 7.18(d). Determine a largest antichain containing the element 6. Determine a largest antichain for this poset.
    b) If $U = \{1, 2, 3, 4\}$, let $A = \mathcal{P}(U)$. Find two different antichains for the poset $(A, \subseteq)$. How many elements occur in a largest antichain for this poset?
    c) Prove that in any poset $(A, \mathcal{R})$, the set of all maximal elements and the set of all minimal elements are antichains.

25. Let $(A, \mathcal{R})$ be a poset in which the length of a longest chain is $n$. Use mathematical induction to prove that the elements of $A$ can be partitioned into $n$ antichains $C_1, C_2, \ldots, C_n$ (where $C_i \cap C_j = \emptyset$, for $1 \le i < j \le n$).

26. a) In how many ways can one totally order the partial order of positive-integer divisors of 96?
    b) How many of the total orders in part (a) start with 96 > 32?
    c) How many of the total orders in part (a) end with 3 > 1?
    d) How many of the total orders in part (a) start with 96 > 32 and end with 3 > 1?
    e) How many of the total orders in part (a) start with 96 > 48 > 32 > 16?

27. Let $n$ be a fixed positive integer and let $A_n = \{0, 1, \ldots, n\} \subseteq \mathbf{N}$. (a) How many edges are there in the Hasse diagram for the total order $(A_n, \le)$, where "$\le$" is the ordinary "less than or equal to" relation? (b) In how many ways can the edges in the Hasse diagram of part (a) be partitioned so that the edges in each cell (of the partition) provide a path (of one or more edges)? (c) In how many ways can the edges in the Hasse diagram for $(A_{12}, \le)$ be partitioned so that the edges in each cell (of the partition) provide a path (of one or more edges) and one of the cells is $\{(3, 4), (4, 5), (5, 6), (6, 7)\}$?
PART 2
FURTHER TOPICS IN ENUMERATION

The Principle of Inclusion and Exclusion

We now return to the topic of enumeration as we investigate the Principle of Inclusion and Exclusion. Extending the ideas in the counting problems on Venn diagrams in Chapter 3, this principle will assist us in establishing the formula we conjectured in Section 5.3 for the number of onto functions $f: A \to B$, where $A$, $B$ are finite (nonempty) sets. Other applications of this principle will demonstrate its versatile nature in combinatorial mathematics.

8.1 The Principle of Inclusion and Exclusion
                     In this section we develop some notation for stating this new counting principle. Then
                     we establish the principle by a combinatorial argument. Following this, a wide range of
                     examples demonstrate how this principle may be applied.

We shall motivate the Principle of Inclusion and Exclusion with a series of three exam-
                     ples, the first two of which will be reminiscent of the work we did with counting and Venn
                     diagrams in Section 3.3.

EXAMPLE 8.1
Let $S$ represent the set of 100 students enrolled in the freshman engineering program at Central College. Then $|S| = 100$. Now let $c_1$, $c_2$ denote the following conditions (or properties) satisfied by some of the elements of $S$:

$c_1$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Freshman Composition.
$c_2$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Introduction to Economics.

Suppose that 35 of these 100 students are enrolled in Freshman Composition and that 30 of them are enrolled in Introduction to Economics. We shall denote this by

$$N(c_1) = 35 \quad \text{and} \quad N(c_2) = 30.$$

If nine of these 100 students are enrolled in both Freshman Composition and Introduction to Economics, then we write $N(c_1c_2) = 9$.

386        Chapter 8 The Principle of Inclusion and Exclusion

Further, of these 100 students, there are $100 - 35 = 65$ who are not taking Freshman Composition. Denoting $|S|$ by $N$, we can designate this by writing $N(\bar{c}_1) = N - N(c_1)$. In a similar way we designate that there are $N(\bar{c}_2) = N - N(c_2) = 100 - 30 = 70$ of these students who are not taking Introduction to Economics. The number who are taking Freshman Composition and who are not taking Introduction to Economics is $N(c_1\bar{c}_2) = N(c_1) - N(c_1c_2) = 35 - 9 = 26$. Likewise, of these 100 students, there are $N(\bar{c}_1c_2) = N(c_2) - N(c_1c_2) = 30 - 9 = 21$ who are enrolled in Introduction to Economics but not in Freshman Composition. Of particular interest are those students (from among these 100 freshmen) who are taking neither Freshman Composition nor Introduction to Economics; that is, they are not taking Freshman Composition and they are also not taking Introduction to Economics. Their number is $N(\bar{c}_1\bar{c}_2)$. And since $N(\bar{c}_1) = N(\bar{c}_1c_2) + N(\bar{c}_1\bar{c}_2)$, we learn that $N(\bar{c}_1\bar{c}_2) = N(\bar{c}_1) - N(\bar{c}_1c_2) = 65 - 21 = 44$.

The preceding observations also demonstrate that

$$
\begin{aligned}
N(\bar{c}_1\bar{c}_2) &= N(\bar{c}_1) - N(\bar{c}_1c_2) = [N - N(c_1)] - [N(c_2) - N(c_1c_2)] \\
&= N - N(c_1) - N(c_2) + N(c_1c_2) = N - [N(c_1) + N(c_2)] + N(c_1c_2) \\
&= 100 - [35 + 30] + 9 = 44, \quad \text{as we saw above.}
\end{aligned}
$$

From the Venn diagram in Fig. 8.1, we see that if $N(c_1)$ denotes the number of elements of $S$ in the left-hand circle and $N(c_2)$ denotes the number in the right-hand circle, then $N(c_1c_2)$ is the number of these elements from $S$ in the overlap, while $N(\bar{c}_1\bar{c}_2)$ counts those elements of $S$ that are outside the union of these two circles. Consequently, we see once again, this time from the figure, that

$$N(\bar{c}_1\bar{c}_2) = N - [N(c_1) + N(c_2)] + N(c_1c_2),$$

where the last term is added on because it was eliminated twice in the term $[N(c_1) + N(c_2)]$. (Also, at this point, the reader may wish to look back at the second formula following Example 3.25 to find the same result presented with a different notation.)

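[Figure 8.1: Venn diagram of $S$ with two overlapping circles; the overlap is labeled $N(c_1c_2)$ and the region outside both circles is labeled $N(\bar{c}_1\bar{c}_2)$.]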

[Before we advance to our next example where we will introduce a third condition, let us note that $N(\bar{c}_1\bar{c}_2)$ is not the same as $N(\overline{c_1c_2})$. For $N(\overline{c_1c_2}) = N - N(c_1c_2) = 100 - 9 = 91$, in this example, while $N(\bar{c}_1\bar{c}_2) = 44$, as we learned earlier. However, $N(\bar{c}_1 \text{ or } \bar{c}_2) = N(\overline{c_1c_2}) = 91 = 65 + 70 - 44 = N(\bar{c}_1) + N(\bar{c}_2) - N(\bar{c}_1\bar{c}_2)$.]
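The counts in Example 8.1 can be checked on a concrete roster. The sketch below is only an illustration: the numbering of students 0 through 99 and the particular membership of each course are assumptions chosen to match the given values $N(c_1) = 35$, $N(c_2) = 30$, $N(c_1c_2) = 9$; nothing in the example fixes which students are enrolled where.

```python
# Hypothetical roster: students are numbered 0..99. Only the three counts
# (35, 30, and 9) come from Example 8.1; the memberships are made up.
S = set(range(100))
comp = set(range(0, 35))    # c1: enrolled in Freshman Composition
econ = set(range(26, 56))   # c2: enrolled in Introduction to Economics

N = len(S)
N_c1, N_c2, N_c1c2 = len(comp), len(econ), len(comp & econ)

# Students satisfying neither condition, counted directly and by the formula.
neither_direct = len(S - (comp | econ))
neither_formula = N - (N_c1 + N_c2) + N_c1c2

# Students who are not in both courses: the complement of the overlap.
not_both = len(S - (comp & econ))

print(N_c1, N_c2, N_c1c2)               # 35 30 9
print(neither_direct, neither_formula)  # 44 44
print(not_both)                         # 91
```

The last two printed lines show the distinction drawn in the bracketed note: "neither course" gives 44, whereas "not both courses" gives 91.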

EXAMPLE 8.2
We start with the same 100 students as in Example 8.1 and the same conditions $c_1$, $c_2$, but now we consider a third condition, given as follows:

$c_3$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Fundamentals of Computer Programming.

It is still the case that $N(c_1) = 35$, $N(c_2) = 30$, and $N(c_1c_2) = 9$, but now we are also given that $N(c_3) = 30$, $N(c_1c_3) = 11$, $N(c_2c_3) = 10$, and $N(c_1c_2c_3) = 5$ (that is, there are five of these 100 freshmen who are taking Freshman Composition, Introduction to Economics, and Fundamentals of Computer Programming). Looking to Fig. 8.2, we learn that

$$N(\bar{c}_1\bar{c}_2\bar{c}_3) = N - [N(c_1) + N(c_2) + N(c_3)] + [N(c_1c_2) + N(c_1c_3) + N(c_2c_3)] - N(c_1c_2c_3).$$

So here we have $N(\bar{c}_1\bar{c}_2\bar{c}_3) = 100 - [35 + 30 + 30] + [9 + 11 + 10] - 5 = 30$. That is, out of these 100 students there are 30 who are not enrolled in any of the courses: (i) Freshman Composition; (ii) Introduction to Economics; or (iii) Fundamentals of Computer Programming.

[We also learn here that $N(\bar{c}_3) = 70 = 100 - 30 = N - N(c_3)$, $N(\bar{c}_1\bar{c}_3) = 46 = 100 - [35 + 30] + 11 = N - [N(c_1) + N(c_3)] + N(c_1c_3)$, and $N(\bar{c}_2\bar{c}_3) = 50 = 100 - [30 + 30] + 10 = N - [N(c_2) + N(c_3)] + N(c_2c_3)$. Furthermore, we note the similarity here with the result for $|\bar{A} \cap \bar{B} \cap \bar{C}|$ given in the second formula following Example 3.26.]
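As a cross-check on the value 30, one can fill in the seven regions inside the three circles of the Venn diagram from the given counts and subtract their total from $N$. This is a sketch of that region-by-region bookkeeping, which is a different route from the formula itself; the variable names are illustrative assumptions.

```python
# Counts given in Example 8.2.
N = 100
n1, n2, n3 = 35, 30, 30
n12, n13, n23 = 9, 11, 10
n123 = 5

# Regions lying in exactly two circles.
only_12 = n12 - n123
only_13 = n13 - n123
only_23 = n23 - n123

# Regions lying in exactly one circle.
only_1 = n1 - only_12 - only_13 - n123
only_2 = n2 - only_12 - only_23 - n123
only_3 = n3 - only_13 - only_23 - n123

inside = only_1 + only_2 + only_3 + only_12 + only_13 + only_23 + n123
outside = N - inside

# The same quantity from the three-condition formula.
formula = N - (n1 + n2 + n3) + (n12 + n13 + n23) - n123

print(outside, formula)  # 30 30
```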

[Figure 8.2: Venn diagram of $S$ with three overlapping circles for the conditions $c_1$, $c_2$, $c_3$.]

EXAMPLE 8.3
Based on the results in the previous two examples we may now feel that for a given finite set $S$ (with $|S| = N$) and four conditions $c_1$, $c_2$, $c_3$, $c_4$ we should have

$$
\begin{aligned}
N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = N &- [N(c_1) + N(c_2) + N(c_3) + N(c_4)] \\
&+ [N(c_1c_2) + N(c_1c_3) + N(c_1c_4) + N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] \\
&- [N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4) + N(c_2c_3c_4)] \\
&+ N(c_1c_2c_3c_4).
\end{aligned} \tag{*}
$$

To show that this is the case we consider an arbitrary element $x$ from $S$ and show that it is counted the same number of times on both sides of the above equation.

0) If $x$ satisfies none of the four conditions, then it is counted once on the left side of Eq. (*) [in $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$], and once on the right side of Eq. (*) [in $N$].
1) If $x$ satisfies only one of the conditions, say $c_1$, then it is not counted at all on the left side of Eq. (*). But on the right side of Eq. (*), $x$ is counted once in $N$ and once in $N(c_1)$, for a total of $1 - 1 = 0$ times.
2) Now suppose that $x$ satisfies conditions $c_2$, $c_4$ but does not satisfy conditions $c_1$, $c_3$. Once again $x$ is not counted on the left side of Eq. (*). For the right side of Eq. (*), $x$ is counted once in $N$, once in each of $N(c_2)$ and $N(c_4)$, and then once in $N(c_2c_4)$, totaling $1 - [1 + 1] + 1 = 1 - \binom{2}{1} + \binom{2}{2} = 0$ times.
3) Continuing with the case for three conditions, we'll suppose here that $x$ satisfies conditions $c_1$, $c_2$, and $c_4$, but not $c_3$. As in the previous two cases, $x$ is not counted on the left side of Eq. (*). On the right side of Eq. (*), $x$ is counted once in $N$, once in each of $N(c_1)$, $N(c_2)$, and $N(c_4)$, once in each of $N(c_1c_2)$, $N(c_1c_4)$, and $N(c_2c_4)$, and, finally, once in $N(c_1c_2c_4)$. So on the right side of Eq. (*), $x$ is counted $1 - [1 + 1 + 1] + [1 + 1 + 1] - 1 = 1 - \binom{3}{1} + \binom{3}{2} - \binom{3}{3} = 0$ times, in total.
4) Finally, if $x$ satisfies all four of the conditions $c_1$, $c_2$, $c_3$, $c_4$, then once again it is not counted on the left side of Eq. (*). On the right side of Eq. (*), $x$ is counted once for each of the 16 terms on the right side of this equation, for a total of $1 - [1 + 1 + 1 + 1] + [1 + 1 + 1 + 1 + 1 + 1] - [1 + 1 + 1 + 1] + 1 = 1 - \binom{4}{1} + \binom{4}{2} - \binom{4}{3} + \binom{4}{4} = 0$ times.

Consequently, from these preceding five cases we have shown that the two sides of Eq. (*) count the same elements from $S$, and this provides a combinatorial proof for the formula for $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$.
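The five cases above all come down to one alternating sum of binomial coefficients. As a small sketch (the function name `net_count` is an assumption made for this illustration), the net number of times the right side of Eq. (*) counts an element satisfying exactly $r$ of the four conditions can be tabulated directly:

```python
from math import comb


def net_count(r: int) -> int:
    """Net count, on the right side of Eq. (*), of an element of S that
    satisfies exactly r conditions: 1 - C(r,1) + C(r,2) - ... +/- C(r,r)."""
    return sum((-1) ** k * comb(r, k) for k in range(r + 1))


counts = [net_count(r) for r in range(5)]
print(counts)  # [1, 0, 0, 0, 0]
```

Only the case $r = 0$ contributes 1, which matches the single count on the left side for an element that satisfies none of the conditions.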
So now we shall reconsider the situation in Example 8.2 and introduce a fourth condition as follows:

$c_4$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Introduction to Design.

We already know that $N(c_1) = 35$, $N(c_2) = 30$, $N(c_3) = 30$, $N(c_1c_2) = 9$, $N(c_1c_3) = 11$, $N(c_2c_3) = 10$, and $N(c_1c_2c_3) = 5$. If $N(c_4) = 41$, $N(c_1c_4) = 13$, $N(c_2c_4) = 14$, $N(c_3c_4) = 10$, $N(c_1c_2c_4) = 6$, $N(c_1c_3c_4) = 6$, $N(c_2c_3c_4) = 6$, and $N(c_1c_2c_3c_4) = 4$, then, using the equation we derived above, it follows that $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = 100 - [35 + 30 + 30 + 41] + [9 + 11 + 13 + 10 + 14 + 10] - [5 + 6 + 6 + 6] + 4 = 100 - 136 + 67 - 23 + 4 = 12$. Thus, of the 100 students in the freshman engineering program at Central College, there are 12 who are not taking any of the four courses: Freshman Composition, Introduction to Economics, Fundamentals of Computer Programming, or Introduction to Design.

If we are interested in the number (from these 100 students) who are taking Freshman Composition, but none of the other three courses, then we should want to compute $N(c_1\bar{c}_2\bar{c}_3\bar{c}_4)$. To do so we start by observing that

$$N(\bar{c}_2\bar{c}_3\bar{c}_4) = N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) + N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4),$$

which can be established by an argument similar to the one above for $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$. This then leads us to

$$N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) = N(\bar{c}_2\bar{c}_3\bar{c}_4) - N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4).$$

Using the result in Example 8.2 we find that

$$
\begin{aligned}
N(\bar{c}_2\bar{c}_3\bar{c}_4) &= N - [N(c_2) + N(c_3) + N(c_4)] + [N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] - N(c_2c_3c_4) \\
&= 100 - [30 + 30 + 41] + [10 + 14 + 10] - 6 = 27, \text{ and} \\
N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) &= N(\bar{c}_2\bar{c}_3\bar{c}_4) - N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = 27 - 12 = 15.
\end{aligned}
$$

So there are 15 students in this set of 100 who are taking Freshman Composition, but none of the other courses: Introduction to Economics, Fundamentals of Computer Programming, or Introduction to Design.

Further, we also observe that

$$
\begin{aligned}
N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) &= N(\bar{c}_2\bar{c}_3\bar{c}_4) - N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) \\
&= \{N - [N(c_2) + N(c_3) + N(c_4)] + [N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] - N(c_2c_3c_4)\} \\
&\quad - \{N - [N(c_1) + N(c_2) + N(c_3) + N(c_4)] \\
&\quad + [N(c_1c_2) + N(c_1c_3) + N(c_1c_4) + N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] \\
&\quad - [N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4) + N(c_2c_3c_4)] + N(c_1c_2c_3c_4)\},
\end{aligned}
$$

or

$$
N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) = N(c_1) - [N(c_1c_2) + N(c_1c_3) + N(c_1c_4)] + [N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4)] - N(c_1c_2c_3c_4).
$$

So here $N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) = 35 - [9 + 11 + 13] + [5 + 6 + 6] - 4 = 35 - 33 + 17 - 4 = 15$, as we found above.
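The three results of this example (12, 27, and 15) can be recomputed from the table of given counts. The following is a minimal sketch: the dictionary layout and the helper `none_of` are assumptions introduced for illustration, and the optional `required` argument encodes the last formula above, where every term carries the factor $c_1$.

```python
from itertools import combinations

# Counts quoted in Examples 8.2 and 8.3, keyed by the indices of the
# conditions involved; the empty key holds N = |S|.
N = {
    (): 100,
    (1,): 35, (2,): 30, (3,): 30, (4,): 41,
    (1, 2): 9, (1, 3): 11, (1, 4): 13, (2, 3): 10, (2, 4): 14, (3, 4): 10,
    (1, 2, 3): 5, (1, 2, 4): 6, (1, 3, 4): 6, (2, 3, 4): 6,
    (1, 2, 3, 4): 4,
}


def none_of(excluded, required=()):
    """Number of elements that satisfy every condition in `required`
    and none of the conditions in `excluded` (alternating sum)."""
    total = 0
    for k in range(len(excluded) + 1):
        for subset in combinations(excluded, k):
            key = tuple(sorted(required + subset))
            total += (-1) ** k * N[key]
    return total


print(none_of((1, 2, 3, 4)))              # 12: none of the four courses
print(none_of((2, 3, 4)))                 # 27: none of c2, c3, c4
print(none_of((2, 3, 4), required=(1,)))  # 15: c1 but none of c2, c3, c4
```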

Having seen the results in Examples 8.1, 8.2, and 8.3, now it is time for us to generalize these results and establish the Principle of Inclusion and Exclusion. To do so we once again let $S$ be a set with $|S| = N$, and we let $c_1, c_2, \ldots, c_t$ be a collection of $t$ conditions or properties, each of which may be satisfied by some of the elements of $S$. Some elements of $S$ may satisfy more than one of the conditions, whereas others may not satisfy any of them. For all $1 \le i \le t$, $N(c_i)$ will denote the number of elements in $S$ that satisfy condition $c_i$. (Elements of $S$ are counted here when they satisfy only condition $c_i$, as well as when they satisfy $c_i$ and other conditions $c_j$, for $j \ne i$.) For all $i, j \in \{1, 2, 3, \ldots, t\}$ where $i \ne j$, $N(c_ic_j)$ will denote the number of elements in $S$ that satisfy both of the conditions $c_i$, $c_j$, and perhaps some others. [$N(c_ic_j)$ does not count only those elements of $S$ that satisfy just $c_i$, $c_j$.] Continuing, if $1 \le i, j, k \le t$ are three distinct integers, then $N(c_ic_jc_k)$ denotes the number of elements in $S$ satisfying, perhaps among others, each of the conditions $c_i$, $c_j$, and $c_k$.

For each $1 \le i \le t$, $N(\bar{c}_i) = N - N(c_i)$ denotes the number of elements in $S$ that do not satisfy condition $c_i$. If $1 \le i, j \le t$ with $i \ne j$, $N(\bar{c}_i\bar{c}_j)$ = the number of elements in $S$ that do not satisfy either of the conditions $c_i$ or $c_j$. [This is not the same as $N(\overline{c_ic_j})$, as we observed at the end of Example 8.1.]

With the necessary preliminaries now in hand we state the following theorem.

THEOREM 8.1   The Principle of Inclusion and Exclusion. Consider a set $S$, with $|S| = N$, and conditions $c_i$, $1 \le i \le t$, each of which may be satisfied by some of the elements of $S$. The number of elements of $S$ that satisfy none of the conditions $c_i$, $1 \le i \le t$, is denoted by $\overline{N} = N(\bar{c}_1\bar{c}_2\bar{c}_3 \cdots \bar{c}_t)$ where

$$
\begin{aligned}
\overline{N} = N &- [N(c_1) + N(c_2) + N(c_3) + \cdots + N(c_t)] \\
&+ [N(c_1c_2) + N(c_1c_3) + \cdots + N(c_1c_t) + N(c_2c_3) + \cdots + N(c_{t-1}c_t)] \\
&- [N(c_1c_2c_3) + N(c_1c_2c_4) + \cdots + N(c_1c_2c_t) + N(c_1c_3c_4) + \cdots \\
&\qquad + N(c_1c_3c_t) + \cdots + N(c_{t-2}c_{t-1}c_t)] + \cdots + (-1)^t N(c_1c_2c_3 \cdots c_t),
\end{aligned} \tag{1}
$$

or

$$
\overline{N} = N - \sum_{1 \le i \le t} N(c_i) + \sum_{1 \le i < j \le t} N(c_ic_j) - \sum_{1 \le i < j < k \le t} N(c_ic_jc_k) + \cdots + (-1)^t N(c_1c_2c_3 \cdots c_t). \tag{2}
$$
Proof: Although this result can be established by applying the Principle of Mathematical Induction to the number $t$ of conditions, we shall give a combinatorial proof. The argument will be reminiscent of the ideas we saw in Example 8.3 in establishing the formula for $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$.

For each $x \in S$ we show that $x$ contributes the same count, either 0 or 1, to each side of Eq. (2).

If $x$ satisfies none of the conditions, then $x$ is counted once in $\overline{N}$ and once in $N$, but not in any of the other terms in Eq. (2). Consequently, $x$ contributes a count of 1 to each side of the equation.

The other possibility is that $x$ satisfies exactly $r$ of the conditions where $1 \le r \le t$. In this case $x$ contributes nothing to $\overline{N}$. But on the right-hand side of Eq. (2), $x$ is counted

(1) One time in $N$.
(2) $r$ times in $\sum_{1 \le i \le t} N(c_i)$. (Once for each of the $r$ conditions.)
(3) $\binom{r}{2}$ times in $\sum_{1 \le i < j \le t} N(c_ic_j)$. (Once for each pair of conditions selected from the $r$ conditions it satisfies.)
(4) $\binom{r}{3}$ times in $\sum_{1 \le i < j < k \le t} N(c_ic_jc_k)$. (Why?)
    ...
($r$+1) $\binom{r}{r} = 1$ time in $\sum N(c_{i_1}c_{i_2} \cdots c_{i_r})$, where the summation is taken over all selections of size $r$ from the $t$ conditions.

Consequently, on the right-hand side of Eq. (2), $x$ is counted

$$
1 - r + \binom{r}{2} - \binom{r}{3} + \cdots + (-1)^r \binom{r}{r} = [1 + (-1)]^r = 0^r = 0 \text{ times},
$$

by the binomial theorem. Therefore, the two sides of Eq. (2) count the same elements from $S$, and the equality is verified.

An immediate corollary of this principle is given as follows:

COROLLARY 8.1   Under the hypotheses of Theorem 8.1, the number of elements in $S$ that satisfy at least one of the conditions $c_i$, where $1 \le i \le t$, is given by $N(c_1 \text{ or } c_2 \text{ or } \ldots \text{ or } c_t) = N - \overline{N}$.

Before solving some examples, we examine some further notation for simplifying the
                           statement of Theorem 8.1.

We write

$$
\begin{aligned}
S_0 &= N, \\
S_1 &= [N(c_1) + N(c_2) + \cdots + N(c_t)], \\
S_2 &= [N(c_1c_2) + N(c_1c_3) + \cdots + N(c_1c_t) + N(c_2c_3) + \cdots + N(c_{t-1}c_t)],
\end{aligned}
$$

and, in general,

$$S_k = \sum N(c_{i_1}c_{i_2} \cdots c_{i_k}), \quad 1 \le k \le t,$$

where the summation is taken over all selections of size $k$ from the collection of $t$ conditions. Hence $S_k$ has $\binom{t}{k}$ summands in it.

Using this notation we can rewrite the result in Eq. (2) as

$$\overline{N} = S_0 - S_1 + S_2 - S_3 + \cdots + (-1)^t S_t.$$
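The $S_k$ notation translates almost word for word into a short program. What follows is a sketch under stated assumptions: conditions are modeled as Python predicates, the helper names `s_values` and `count_none` are invented for this illustration, and the test data (the integers 1 through 60 with three divisibility conditions) are hypothetical and do not come from the text.

```python
from itertools import combinations


def s_values(universe, conditions):
    """Return [S_0, S_1, ..., S_t]: S_k sums N(c_i1 c_i2 ... c_ik) over
    all selections of k of the t conditions."""
    sets = [{x for x in universe if cond(x)} for cond in conditions]
    result = []
    for k in range(len(sets) + 1):
        s_k = 0
        for chosen in combinations(sets, k):
            common = set(universe)
            for group in chosen:
                common &= group      # elements satisfying all chosen conditions
            s_k += len(common)
        result.append(s_k)
    return result


def count_none(universe, conditions):
    """N-bar = S_0 - S_1 + S_2 - ... + (-1)^t S_t."""
    values = s_values(universe, conditions)
    return sum((-1) ** k * s_k for k, s_k in enumerate(values))


# Hypothetical data: 1..60 with conditions "divisible by 4", "by 6", "by 9".
universe = range(1, 61)
conditions = [lambda n: n % 4 == 0, lambda n: n % 6 == 0, lambda n: n % 9 == 0]

direct = sum(1 for n in universe if not any(c(n) for c in conditions))
print(s_values(universe, conditions))          # [60, 31, 9, 1]
print(count_none(universe, conditions), direct)  # 37 37
print(len(universe) - direct)                  # 23, the "at least one" count
```

Enumerating all $2^t$ selections is practical only for small $t$; the point of the sketch is the agreement between the alternating sum and the direct count, together with the value $N - \overline{N}$ for "at least one condition".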

Now let us look at how this principle is used to solve certain enumeration problems.

EXAMPLE 8.4
Determine the number of positive integers $n$ where $1 \le n \le 100$ and $n$ is not divisible by 2, 3, or 5.

Here $S = \{1, 2, 3, \ldots, 100\}$ and $N = 100$. For $n \in S$, $n$ satisfies

a) condition $c_1$ if $n$ is divisible by 2,
b) condition $c_2$ if $n$ is divisible by 3, and
c) condition $c_3$ if $n$ is divisible by 5.

Then the answer to this problem is $N(\bar{c}_1\bar{c}_2\bar{c}_3)$.

As in Section 5.2 we use the notation $\lfloor r \rfloor$ to denote the greatest integer less than or equal to $r$, for any real number $r$. This function proves to be helpful in this problem as we find that

$N(c_1) = \lfloor 100/2 \rfloor = 50$ [since the 50 ($= \lfloor 100/2 \rfloor$) positive integers 2, 4, 6, 8, ..., 96, 98 ($= 2 \cdot 49$), 100 ($= 2 \cdot 50$) are divisible by 2];
$N(c_2) = \lfloor 100/3 \rfloor = \lfloor 33\tfrac{1}{3} \rfloor = 33$ [since the 33 ($= \lfloor 100/3 \rfloor$) positive integers 3, 6, 9, 12, ..., 96 ($= 3 \cdot 32$), 99 ($= 3 \cdot 33$) are divisible by 3];
$N(c_3) = \lfloor 100/5 \rfloor = 20$;
$N(c_1c_2) = \lfloor 100/6 \rfloor = 16$ [since there are 16 ($= \lfloor 100/6 \rfloor$) elements in $S$ that are divisible by both 2 and 3, hence divisible by $\operatorname{lcm}(2, 3) = 2 \cdot 3 = 6$];
$N(c_1c_3) = \lfloor 100/10 \rfloor = 10$;
$N(c_2c_3) = \lfloor 100/15 \rfloor = 6$; and
$N(c_1c_2c_3) = \lfloor 100/30 \rfloor = 3$.

Applying the Principle of Inclusion and Exclusion, we find that

$$
\begin{aligned}
N(\bar{c}_1\bar{c}_2\bar{c}_3) &= S_0 - S_1 + S_2 - S_3 = N - [N(c_1) + N(c_2) + N(c_3)] \\
&\quad + [N(c_1c_2) + N(c_1c_3) + N(c_2c_3)] - N(c_1c_2c_3) \\
&= 100 - [50 + 33 + 20] + [16 + 10 + 6] - 3 = 26.
\end{aligned}
$$

(These 26 numbers are 1, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49, 53, 59, 61, 67, 71, 73, 77, 79, 83, 89, 91, and 97.)
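The floor-function counts in Example 8.4 can be reproduced mechanically and compared with a direct sieve of the integers 1 through 100. This is a minimal sketch; it assumes Python 3.9 or later for `math.lcm`, which returns 1 when called with no arguments (so the empty selection contributes $N$ itself).

```python
from itertools import combinations
from math import lcm  # available in Python 3.9+

N = 100
divisors = (2, 3, 5)

# Inclusion-exclusion: the integers in 1..N divisible by every chosen
# divisor number floor(N / lcm(chosen)).
by_formula = 0
for k in range(len(divisors) + 1):
    for chosen in combinations(divisors, k):
        by_formula += (-1) ** k * (N // lcm(*chosen))

# Direct check: keep the integers divisible by none of 2, 3, 5.
survivors = [n for n in range(1, N + 1) if all(n % d != 0 for d in divisors)]

print(by_formula, len(survivors))  # 26 26
print(survivors[:6])               # [1, 7, 11, 13, 17, 19]
```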

EXAMPLE 8.5
In Chapter 1 we found the number of nonnegative integer solutions to the equation $x_1 + x_2 + x_3 + x_4 = 18$. We now answer the same question with the extra restriction that $x_i \le 7$, for all $1 \le i \le 4$.

Here $S$ is the set of solutions of $x_1 + x_2 + x_3 + x_4 = 18$, with $0 \le x_i$ for all $1 \le i \le 4$. So $|S| = N = S_0 = \binom{4 + 18 - 1}{18} = \binom{21}{18}$.

We say that a solution $x_1, x_2, x_3, x_4$ satisfies condition $c_i$, where $1 \le i \le 4$, if $x_i > 7$ (or $x_i \ge 8$). The answer to the problem is then $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$.

Here by symmetry $N(c_1) = N(c_2) = N(c_3) = N(c_4)$. To compute $N(c_1)$, we consider the integer solutions for $x_1 + x_2 + x_3 + x_4 = 10$, with each $x_i \ge 0$ for all $1 \le i \le 4$. Then we add 8 to the value of $x_1$ and get the solutions of $x_1 + x_2 + x_3 + x_4 = 18$ that satisfy condition $c_1$. Hence $N(c_i) = \binom{4 + 10 - 1}{10} = \binom{13}{10}$, for each $1 \le i \le 4$, and $S_1 = \binom{4}{1}\binom{13}{10}$.

Likewise, $N(c_1c_2)$ is the number of integer solutions of $x_1 + x_2 + x_3 + x_4 = 2$, where $x_i \ge 0$ for all $1 \le i \le 4$. So $N(c_1c_2) = \binom{4 + 2 - 1}{2} = \binom{5}{2}$, and $S_2 = \binom{4}{2}\binom{5}{2}$.

Since $N(c_ic_jc_k) = 0$ for every selection of three conditions, and $N(c_1c_2c_3c_4) = 0$, we have

$$
N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = S_0 - S_1 + S_2 - S_3 + S_4 = \binom{21}{18} - \binom{4}{1}\binom{13}{10} + \binom{4}{2}\binom{5}{2} - 0 + 0 = 246.
$$

So of the 1330 nonnegative integer solutions of $x_1 + x_2 + x_3 + x_4 = 18$, only 246 of them satisfy $x_i \le 7$ for each $1 \le i \le 4$.
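Both the formula and the final count in Example 8.5 are small enough to confirm by exhaustive search. In this sketch the helper `solutions` (an assumed name) counts nonnegative integer solutions by the combinations-with-repetition formula, and the brute-force pass simply tries every 4-tuple with entries from 0 to 7.

```python
from itertools import product
from math import comb


def solutions(total: int, parts: int = 4) -> int:
    """Number of nonnegative integer solutions of x_1 + ... + x_parts = total
    (zero when the total is negative)."""
    return comb(parts + total - 1, total) if total >= 0 else 0


# S_k = C(4, k) * (number of solutions once k chosen variables are forced
# to be at least 8, that is, solutions of the equation with total 18 - 8k).
by_formula = sum((-1) ** k * comb(4, k) * solutions(18 - 8 * k) for k in range(5))

# Exhaustive check over all 8**4 = 4096 tuples with 0 <= x_i <= 7.
brute = sum(1 for xs in product(range(8), repeat=4) if sum(xs) == 18)

print(solutions(18))      # 1330
print(by_formula, brute)  # 246 246
```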

Our next example establishes the formula conjectured in Section 5.3 for counting onto
                             functions.

EXAMPLE 8.6
For finite sets $A$, $B$, where $|A| = m \ge n = |B|$, let $A = \{a_1, a_2, \ldots, a_m\}$, $B = \{b_1, b_2, \ldots, b_n\}$, and $S$ = the set of all functions $f\colon A \to B$. Then $N = S_0 = |S| = n^m$.
   For all $1 \le i \le n$, let $c_i$ denote the condition on $S$ where a function $f\colon A \to B$ satisfies
$c_i$ if $b_i$ is not in the range of $f$. (Note the difference between $c_i$ here and $c_i$ in Examples
8.4 and 8.5.) Then $N(\overline{c}_i)$ is the number of functions in $S$ that have $b_i$ in their range, and
$N(\overline{c}_1\overline{c}_2\cdots\overline{c}_n)$ counts the number of onto functions $f\colon A \to B$.
   For all $1 \le i \le n$, $N(c_i) = (n-1)^m$, because each element of $B$, except $b_i$, can be used
as the second component of an ordered pair for a function $f\colon A \to B$ whose range does not
include $b_i$. Likewise, for all $1 \le i < j \le n$, there are $(n-2)^m$ functions $f\colon A \to B$ whose
range contains neither $b_i$ nor $b_j$. From these observations we have
$S_1 = [N(c_1) + N(c_2) + \cdots + N(c_n)] = n(n-1)^m = \binom{n}{1}(n-1)^m$, and
$S_2 = [N(c_1c_2) + N(c_1c_3) + \cdots + N(c_1c_n) + N(c_2c_3) + \cdots + N(c_2c_n) + \cdots + N(c_{n-1}c_n)] = \binom{n}{2}(n-2)^m$.
In general, for each $1 \le k \le n$,

$$S_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} N(c_{i_1}c_{i_2}\cdots c_{i_k}) = \binom{n}{k}(n-k)^m.$$

It then follows by the Principle of Inclusion and Exclusion that the number of onto

functions from A to B is

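$$\begin{aligned}
N(\overline{c}_1\overline{c}_2\overline{c}_3\cdots\overline{c}_n) &= S_0 - S_1 + S_2 - S_3 + \cdots + (-1)^n S_n \\
&= n^m - \binom{n}{1}(n-1)^m + \binom{n}{2}(n-2)^m - \binom{n}{3}(n-3)^m + \cdots + (-1)^n (n-n)^m \\
&= \sum_{i=0}^{n} (-1)^i \binom{n}{i}(n-i)^m = \sum_{i=0}^{n} (-1)^i \binom{n}{n-i}(n-i)^m.
\end{aligned}$$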
   Before we finish discussing this example, let us note that

$$\sum_{i=0}^{n} (-1)^i \binom{n}{n-i}(n-i)^m$$

can also be evaluated even if $m < n$. Furthermore, for $m < n$, the expression
$N(\overline{c}_1\overline{c}_2\overline{c}_3\cdots\overline{c}_n)$
still counts the number of functions $f\colon A \to B$, where $|A| = m$, $|B| = n$, and each element
of $B$ is in the range of $f$. But now this number is 0.
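   For example, suppose that $m = 3 < 7 = n$. Then $N(\overline{c}_1\overline{c}_2\overline{c}_3\cdots\overline{c}_7)$ counts the number of
onto functions $f\colon A \to B$ for $|A| = 3$ and $|B| = 7$. We know this number is 0, and we also
find that

$$\begin{aligned}
\sum_{i=0}^{7} (-1)^i \binom{7}{7-i}(7-i)^3 &= \binom{7}{7}7^3 - \binom{7}{6}6^3 + \binom{7}{5}5^3 - \binom{7}{4}4^3 + \binom{7}{3}3^3 - \binom{7}{2}2^3 + \binom{7}{1}1^3 - \binom{7}{0}0^3 \\
&= 343 - 1512 + 2625 - 2240 + 945 - 168 + 7 - 0 = 0.
\end{aligned}$$

   Hence, for all $m, n \in \mathbf{Z}^+$, if $m < n$, then

$$\sum_{i=0}^{n} (-1)^i \binom{n}{n-i}(n-i)^m = 0.$$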
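
As an illustration, the onto-function formula can be evaluated and compared with a direct enumeration. This is a small Python sketch under our own naming (`onto_count`, `onto_brute` are not from the text); the enumeration represents a function $f\colon A \to B$ as a tuple of images and is feasible only for small $m$ and $n$.

```python
from itertools import product
from math import comb

def onto_count(m, n):
    """Inclusion-exclusion sum for the number of onto functions
    from a set of size m to a set of size n (m, n >= 1)."""
    return sum((-1) ** i * comb(n, n - i) * (n - i) ** m for i in range(n + 1))

def onto_brute(m, n):
    """Enumerate all n**m functions as tuples of images; keep the surjective ones."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)

print(onto_count(3, 7))                    # 0, since m < n
print(onto_count(4, 3), onto_brute(4, 3))  # 36 36
```

The first call reproduces the vanishing sum computed above for $m = 3$, $n = 7$.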

We now solve a problem similar to those in Chapter 3 that dealt with Venn diagrams.

EXAMPLE 8.7
In how many ways can the 26 letters of the alphabet be permuted so that none of the patterns
car, dog, pun, or byte occurs?
   Let $S$ denote the set of all permutations of the 26 letters. Then $|S| = 26!$. For each
$1 \le i \le 4$, a permutation in $S$ is said to satisfy condition $c_i$ if the permutation contains the
pattern car, dog, pun, or byte, respectively.
   In order to compute $N(c_1)$, for example, we count the number of ways the 24 symbols car,
b, d, e, f, ..., p, q, s, t, ..., x, y, z can be permuted. So $N(c_1) = 24!$, and in a similar
way we obtain

$$N(c_2) = N(c_3) = 24!, \quad \text{while } N(c_4) = 23!.$$

For $N(c_1c_2)$ we deal with the 22 symbols car, dog, b, e, f, h, i, ..., m, n, p, q, s, t, ...,
x, y, z, which can be permuted in 22! ways. Hence $N(c_1c_2) = 22!$, and comparable calculations give

$$N(c_1c_3) = N(c_2c_3) = 22!, \qquad N(c_ic_4) = 21!, \quad i \ne 4.$$

Furthermore,

$$N(c_1c_2c_3) = 20!, \qquad N(c_ic_jc_4) = 19!, \quad 1 \le i < j \le 3,$$

$$N(c_1c_2c_3c_4) = 17!.$$

So the number of permutations in $S$ that contain none of the given patterns is

$$N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4) = 26! - [3(24!) + 23!] + [3(22!) + 3(21!)] - [20! + 3(19!)] + 17!.$$
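
The gluing argument used here can be sketched in Python. The helper names `avoid_patterns` and `avoid_brute` are hypothetical, and the sketch assumes, as holds for car, dog, pun, and byte, that no two patterns share a letter, so each chosen pattern can be treated as a single symbol.

```python
import string
from itertools import combinations, permutations
from math import factorial

def avoid_patterns(alphabet, patterns):
    """Permutations of `alphabet` containing none of `patterns`.
    Assumption: the patterns use pairwise disjoint letters."""
    n = len(alphabet)
    total = 0
    for k in range(len(patterns) + 1):
        for chosen in combinations(patterns, k):
            # Each chosen pattern collapses to one symbol.
            symbols = n - sum(len(p) for p in chosen) + k
            total += (-1) ** k * factorial(symbols)
    return total

def avoid_brute(alphabet, patterns):
    """Direct enumeration; usable only for very short alphabets."""
    return sum(1 for perm in permutations(alphabet)
               if not any(p in "".join(perm) for p in patterns))

# Small sanity check on a 7-letter alphabet.
print(avoid_patterns("abcdefg", ["ab", "cde"]), avoid_brute("abcdefg", ["ab", "cde"]))

# The value of the expression obtained in the example.
print(avoid_patterns(string.ascii_lowercase, ["car", "dog", "pun", "byte"]))
```

The small case can be enumerated completely, which gives some confidence that the same counting rule is being applied to the 26-letter problem.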

Our next example deals with a number theory problem.

EXAMPLE 8.8
For $n \in \mathbf{Z}^+$, $n \ge 2$, let $\phi(n)$ be the number of positive integers $m$, where $1 \le m < n$ and
$\gcd(m, n) = 1$, that is, $m$, $n$ are relatively prime. This function is known as Euler's phi
function, and it arises in several situations in abstract algebra involving enumeration. We find
that $\phi(2) = 1$, $\phi(3) = 2$, $\phi(4) = 2$, $\phi(5) = 4$, and $\phi(6) = 2$. For each prime $p$, $\phi(p) = p - 1$.
We would like to derive a formula for $\phi(n)$ that is related to $n$ so that we need not
make a case-by-case comparison for each $m$, $1 \le m < n$, against the integer $n$.
   The derivation of our formula will use the Principle of Inclusion and Exclusion as in
Example 8.4. We proceed as follows: For $n \ge 2$, use the Fundamental Theorem of Arithmetic
to write $n = p_1^{e_1}p_2^{e_2}\cdots p_t^{e_t}$, where $p_1, p_2, \ldots, p_t$ are distinct primes and $e_i \ge 1$, for all
$1 \le i \le t$. We consider the case where $t = 4$. This will be enough to demonstrate the general
idea.
   With $S = \{1, 2, 3, \ldots, n\}$, we have $N = S_0 = |S| = n$, and for each $1 \le i \le 4$ we say
that $k \in S$ satisfies condition $c_i$ if $k$ is divisible by $p_i$. For $1 \le k < n$, $\gcd(k, n) = 1$ if $k$ is
not divisible by any of the primes $p_i$, where $1 \le i \le 4$. Hence $\phi(n) = N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$.
   For each $1 \le i \le 4$, we have $N(c_i) = n/p_i$; $N(c_ic_j) = n/(p_ip_j)$, for all $1 \le i < j \le 4$.
Also, $N(c_ic_jc_\ell) = n/(p_ip_jp_\ell)$, for all $1 \le i < j < \ell \le 4$, and $N(c_1c_2c_3c_4) = n/(p_1p_2p_3p_4)$. So
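$$\begin{aligned}
\phi(n) &= S_0 - S_1 + S_2 - S_3 + S_4 \\
&= n - \left[\frac{n}{p_1} + \cdots + \frac{n}{p_4}\right] + \left[\frac{n}{p_1p_2} + \frac{n}{p_1p_3} + \cdots + \frac{n}{p_3p_4}\right] - \left[\frac{n}{p_1p_2p_3} + \cdots + \frac{n}{p_2p_3p_4}\right] + \frac{n}{p_1p_2p_3p_4} \\
&= n\left[1 - \left(\frac{1}{p_1} + \cdots + \frac{1}{p_4}\right) + \left(\frac{1}{p_1p_2} + \frac{1}{p_1p_3} + \cdots + \frac{1}{p_3p_4}\right) - \left(\frac{1}{p_1p_2p_3} + \cdots + \frac{1}{p_2p_3p_4}\right) + \frac{1}{p_1p_2p_3p_4}\right] \\
&= \frac{n}{p_1p_2p_3p_4}\,[\,p_1p_2p_3p_4 - (p_2p_3p_4 + p_1p_3p_4 + p_1p_2p_4 + p_1p_2p_3) \\
&\qquad\qquad + (p_3p_4 + p_2p_4 + p_2p_3 + p_1p_4 + p_1p_3 + p_1p_2) - (p_4 + p_3 + p_2 + p_1) + 1\,] \\
&= \frac{n}{p_1p_2p_3p_4}\,[(p_1 - 1)(p_2 - 1)(p_3 - 1)(p_4 - 1)] \\
&= n\left[\frac{p_1 - 1}{p_1}\cdot\frac{p_2 - 1}{p_2}\cdot\frac{p_3 - 1}{p_3}\cdot\frac{p_4 - 1}{p_4}\right] = n\prod_{i=1}^{4}\left(1 - \frac{1}{p_i}\right).
\end{aligned}$$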

In general, $\phi(n) = n\prod_{p \mid n}(1 - (1/p))$, where the product is taken over all primes $p$
dividing $n$. When $n = p$, a prime, $\phi(n) = \phi(p) = p[1 - (1/p)] = p - 1$, as we observed
earlier. If $n = 23{,}100$, for example, we find that

$$\begin{aligned}
\phi(23{,}100) &= \phi(2^2 \cdot 3 \cdot 5^2 \cdot 7 \cdot 11) \\
&= (23{,}100)(1 - (1/2))(1 - (1/3))(1 - (1/5))(1 - (1/7))(1 - (1/11)) \\
&= 4800.
\end{aligned}$$
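
The product formula lends itself to a quick computational check. The sketch below is ours rather than the text's: `prime_divisors` uses plain trial division (an assumption made for simplicity, adequate for small $n$), `phi_formula` applies the product over primes dividing $n$, and `phi_count` counts relatively prime integers directly from the definition.

```python
from math import gcd

def prime_divisors(n):
    """Distinct primes dividing n, found by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)  # whatever remains is itself prime
    return primes

def phi_formula(n):
    """phi(n) = n * product of (1 - 1/p) over primes p dividing n, in integer arithmetic."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result

def phi_count(n):
    """Definition from the example: integers m with 1 <= m < n and gcd(m, n) = 1 (n >= 2)."""
    return sum(1 for m in range(1, n) if gcd(m, n) == 1)

print(phi_formula(23100), phi_count(23100))  # 4800 4800
```

Dividing by $p$ before multiplying by $p - 1$ keeps every intermediate value an integer, so no floating-point rounding enters the calculation.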

The Euler phi function has many interesting properties. We shall investigate some of
               them in the exercises for this section and in the Supplementary Exercises.
                  The next example provides another encounter with the circular arrangements introduced
               in Chapter 1.

EXAMPLE 8.9
Six married couples are to be seated at a circular table. In how many ways can they arrange
themselves so that no wife sits next to her husband? (Here, as in Example 1.16, two seating
arrangements are considered the same if one is a rotation of the other.)
   For $1 \le i \le 6$, we let $c_i$ denote the condition where a seating arrangement has couple $i$
seated next to each other.
   To determine $N(c_1)$, for instance, we consider arranging 11 distinct objects, namely,
couple 1 (considered as one object) and the other 10 people. Eleven distinct objects can be
arranged around a circular table in $(11 - 1)! = 10!$ ways. However, here $N(c_1) = 2(10!)$,
where the 2 takes into account whether the wife in couple 1 is seated to the left or right of
her husband. Similarly, $N(c_i) = 2(10!)$, for $2 \le i \le 6$, and $S_1 = \binom{6}{1}2(10!)$.
   Continuing, let us now compute $N(c_ic_j)$, for $1 \le i < j \le 6$. Here we are arranging 10
distinct objects: couple $i$ (considered as one object), couple $j$ (likewise considered as one
object), and the other eight people. Ten distinct objects can be arranged around a circular
table in $(10 - 1)! = 9!$ ways. So here $N(c_ic_j) = 2^2(9!)$ because there are two ways for the
wife in couple $i$ to be seated next to her husband, and two ways for the wife in couple $j$ to
be seated next to her husband. Consequently, $S_2 = \binom{6}{2}2^2(9!)$.
   Similar reasoning shows us that
   $N(c_1c_2c_3) = 2^3(8!)$, $S_3 = \binom{6}{3}2^3(8!)$;  $N(c_1c_2c_3c_4) = 2^4(7!)$, $S_4 = \binom{6}{4}2^4(7!)$;
   $N(c_1c_2c_3c_4c_5) = 2^5(6!)$, $S_5 = \binom{6}{5}2^5(6!)$;  $N(c_1c_2c_3c_4c_5c_6) = 2^6(5!)$, $S_6 = \binom{6}{6}2^6(5!)$.
With $S_0$ (the total number of arrangements of the 12 people) $= (12 - 1)! = 11!$, we find
that the number of arrangements where no couple is seated side by side is

$$\begin{aligned}
N(\overline{c}_1\overline{c}_2\cdots\overline{c}_6) &= \sum_{i=0}^{6} (-1)^i S_i = \sum_{i=0}^{6} (-1)^i \binom{6}{i} 2^i (11 - i)! \\
&= 39{,}916{,}800 - 43{,}545{,}600 + 21{,}772{,}800 - 6{,}451{,}200 + 1{,}209{,}600 - 138{,}240 + 7680 \\
&= 12{,}771{,}840.
\end{aligned}$$
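
The same sum can be written for $n$ couples and compared with a direct enumeration when $n$ is small. In this Python sketch the names `no_adjacent_couples` and `brute_force` are our own, persons $2k$ and $2k+1$ are taken to form couple $k$ (a labeling chosen only for the code), and the formula is assumed for $n \ge 2$, where gluing a couple into one object behaves as described in the example.

```python
from itertools import permutations
from math import comb, factorial

def no_adjacent_couples(n):
    """Circular seatings of n couples (n >= 2) with no couple side by side."""
    return sum((-1) ** i * comb(n, i) * 2 ** i * factorial(2 * n - 1 - i)
               for i in range(n + 1))

def brute_force(n):
    """Fix person 0 in seat 0 to factor out rotations, then test every seating."""
    people = 2 * n
    count = 0
    for rest in permutations(range(1, people)):
        seating = (0,) + rest
        # seat s and seat s+1 (mod people) are neighbors around the table
        if all(seating[s] // 2 != seating[(s + 1) % people] // 2 for s in range(people)):
            count += 1
    return count

print(no_adjacent_couples(6))                 # 12771840
print(no_adjacent_couples(3), brute_force(3)) # 32 32
```

Enumerating all $11!$ seatings for six couples would be slow, so the brute-force check is run on three couples only.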

Our final example recalls some of the graph theory we studied in Chapter 7.

EXAMPLE 8.10
In a certain area of the countryside are five villages. An engineer is to devise a system of
two-way roads so that after the system is completed, no village will be isolated. In how
many ways can he do this?

Calling the villages a, b, c, d, and e, we seek the number of loop-free undirected graphs
                                   on these vertices, where no vertex is isolated. Consequently, we want to count situations
                                   such as those illustrated in parts (a) and (b) of Fig. 8.3, but not situations such as those
                                   shown in parts (c) and (d).

Figure 8.3 (parts (a) through (d))

Let $S$ be the set of loop-free undirected graphs $G$ on $V = \{a, b, c, d, e\}$. Then $N = S_0 = |S| = 2^{10}$
because there are $\binom{5}{2} = 10$ possible two-way roads for these five villages,
and each road can be either included or excluded.
   For each $1 \le i \le 5$, let $c_i$ be the condition that a system of these roads isolates village
$a$, $b$, $c$, $d$, and $e$, respectively.
   For condition $c_1$ village $a$ is isolated, so we consider the six edges (roads) $\{b, c\}$, $\{b, d\}$,
$\{b, e\}$, $\{c, d\}$, $\{c, e\}$, $\{d, e\}$. With two choices for each edge, namely, put the edge in the
graph or leave the edge out, we find that $N(c_1) = 2^6$. Then by symmetry $N(c_i) = 2^6$ for
all $2 \le i \le 5$, so $S_1 = \binom{5}{1}2^6$.
   When villages $a$ and $b$ are to be isolated, each of the edges $\{c, d\}$, $\{d, e\}$, $\{c, e\}$ may be put
in or left out of our graph. This results in $2^3$ possibilities, so $N(c_1c_2) = 2^3$, and $S_2 = \binom{5}{2}2^3$.
   Similar arguments tell us that $N(c_1c_2c_3) = 2^1$ and $S_3 = \binom{5}{3}2^1$; $N(c_1c_2c_3c_4) = 2^0$ and
$S_4 = \binom{5}{4}2^0$; and $N(c_1c_2c_3c_4c_5) = 2^0$ and $S_5 = \binom{5}{5}2^0$.
   Consequently,

$$N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4\overline{c}_5) = 2^{10} - \binom{5}{1}2^6 + \binom{5}{2}2^3 - \binom{5}{3}2^1 + \binom{5}{4}2^0 - \binom{5}{5}2^0 = 768.$$
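
Because there are only $2^{10} = 1024$ road systems, this answer can be confirmed by listing them all. The following Python sketch is an illustration under our own naming (`no_isolated_formula`, `no_isolated_brute`); villages are represented by the integers 0 through 4 and each road system by a bitmask over the 10 possible roads.

```python
from itertools import combinations
from math import comb

def no_isolated_formula(n):
    """Inclusion-exclusion count of loop-free undirected graphs on n labeled
    vertices in which no vertex is isolated."""
    return sum((-1) ** k * comb(n, k) * 2 ** comb(n - k, 2) for k in range(n + 1))

def no_isolated_brute(n):
    """Try every subset of the possible edges and keep graphs touching all vertices."""
    edges = list(combinations(range(n), 2))
    count = 0
    for mask in range(2 ** len(edges)):
        touched = set()
        for bit, (u, v) in enumerate(edges):
            if mask >> bit & 1:       # edge number `bit` is included
                touched.update((u, v))
        if len(touched) == n:
            count += 1
    return count

print(no_isolated_formula(5), no_isolated_brute(5))  # 768 768
```

Here the term for $k$ isolated villages is $\binom{n}{k}2^{\binom{n-k}{2}}$, which matches the values $S_0, \ldots, S_5$ found in the example.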

EXERCISES 8.1

1. Let $S$ be a finite set with $|S| = N$ and let $c_1, c_2, c_3, c_4$ be
four conditions, each of which may be satisfied by one or more
of the elements of $S$. Prove that $N(\overline{c}_2\overline{c}_3\overline{c}_4) = N(c_1\overline{c}_2\overline{c}_3\overline{c}_4) + N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$.

2. Establish the Principle of Inclusion and Exclusion by applying
the Principle of Mathematical Induction to the number $t$
of conditions.

3. Of the 100 students in Example 8.3, how many are taking
(a) Fundamentals of Computer Programming but none of the
other three courses; (b) Fundamentals of Computer Programming
and Introduction to Economics but neither of the other
two courses?

4. Annually, the 65 members of the maintenance staff sponsor
a "Christmas in July" picnic for the 400 summer employees
at their company. For these 65 people, 21 bring hot dogs, 35
bring fried chicken, 28 bring salads, 32 bring desserts, 13 bring
hot dogs and fried chicken, 10 bring hot dogs and salads, 9
bring hot dogs and desserts, 12 bring fried chicken and salads,
17 bring fried chicken and desserts, 14 bring salads and
desserts, 4 bring hot dogs, fried chicken, and salads, 6 bring hot
dogs, fried chicken, and desserts, 5 bring hot dogs, salads, and
desserts, 7 bring fried chicken, salads, and desserts, and 2 bring
all four food items. Those (of the 65) who do not bring any of
these four food items are responsible for setting up and cleaning
up for the picnic. How many of the 65 maintenance staff will
(a) help to set up and clean up for the picnic? (b) bring only hot
dogs? (c) bring exactly one food item?

5. Determine the number of positive integers $n$, $1 \le n \le 2000$,
that are
    a) not divisible by 2, 3, or 5
    b) not divisible by 2, 3, 5, or 7
    c) not divisible by 2, 3, or 5, but are divisible by 7

6. Determine how many integer solutions there are to
$x_1 + x_2 + x_3 + x_4 = 19$, if
    a) $0 \le x_i$ for all $1 \le i \le 4$
    b) $0 \le x_i < 8$ for all $1 \le i \le 4$
    c) $0 \le x_1 \le 5$, $0 \le x_2 \le 6$, $3 \le x_3 \le 7$, $3 \le x_4 \le 8$

7. In how many ways can one arrange all of the letters in the
word INFORMATION so that no pair of consecutive letters occurs
more than once? [Here we want to count arrangements such
as IINNOOFRMTA and FORTMAIINON but not INFORINMOTA
(where "IN" occurs twice) or NORTFNOIAMI (where
"NO" occurs twice).]

8. Determine the number of integer solutions to $x_1 + x_2 + x_3 + x_4 = 19$
where $-5 \le x_i \le 10$ for all $1 \le i \le 4$.

9. Determine the number of positive integers $x$ where $x \le 9{,}999{,}999$
and the sum of the digits in $x$ equals 31.

10. Professor Bailey has just completed writing the final examination
for his course in advanced engineering mathematics.
This examination has 12 questions, whose total value is to be
200 points. In how many ways can Professor Bailey assign the
200 points if each question must count for at least 10, but not
more than 25, points and the point value for each question is to
be a multiple of 5?

11. At Flo's Flower Shop, Flo wants to arrange 15 different
plants on five shelves for a window display. In how many ways
can she arrange them so that each shelf has at least one, but no
more than four, plants?

12. In how many ways can Troy select nine marbles from a bag
of twelve (identical except for color), where three are red, three
blue, three white, and three green?

13. Find the number of permutations of a, b, c, ..., x, y, z, in
which none of the patterns spin, game, path, or net occurs.

14. Answer the question in Example 8.10 for the case of six
villages.

15. If eight distinct dice are rolled, what is the probability that
all six numbers appear?

16. How many social security numbers (nine-digit sequences)
have each of the digits 1, 3, and 7 appearing at least once?

17. In how many ways can three x's, three y's, and three z's be
arranged so that no consecutive triple of the same letter appears?

18. Frostburg township sponsors four Boy Scout troops, each
with 20 boys. If the head scoutmaster selects 50 of these boys to
represent this township at the state jamboree, what is the probability
that his selection will include at least one boy from each
of the four troops?

19. If Zachary rolls a fair die five times, what is the probability
that the sum of his five rolls is 20?

20. At a 12-week conference in mathematics, Sharon met seven
of her friends from college. During the conference she met each
friend at lunch 35 times, every pair of them 16 times, every trio
eight times, every foursome four times, each set of five twice,
and each set of six once, but never all seven at once. If she had
lunch every day during the 84 days of the conference, did she
ever have lunch alone?

21. Compute $\phi(n)$ for $n$ equal to (a) 51; (b) 420; (c) 12300.

22. Compute $\phi(n)$ for $n$ equal to (a) 5186; (b) 5187; (c) 5188.

23. Let $n \in \mathbf{Z}^+$. (a) Determine $\phi(2^n)$. (b) Determine $\phi(2^n p)$,
where $p$ is an odd prime.

24. For which $n \in \mathbf{Z}^+$ is $\phi(n)$ odd?

25. How many positive integers $n$ less than 6000 (a) satisfy
$\gcd(n, 6000) = 1$? (b) share a common prime divisor with
6000?

26. If $m, n \in \mathbf{Z}^+$, prove that $\phi(n^m) = n^{m-1}\phi(n)$.

27. Find three values for $n \in \mathbf{Z}^+$ where $\phi(n) = 16$.

28. For which positive integers $n$ is $\phi(n)$ a power of 2?

29. For which positive integers $n$ does 4 divide $\phi(n)$?

30. At an upcoming family reunion, five families, each consisting
of a husband, wife, and one child, are to be seated
around a circular table. In how many ways can these 15 people
be arranged around the table so that no family is seated all
together? (Here, as in Example 8.9, two seating arrangements
are considered the same if one is a rotation of the other.)

8.2 Generalizations of the Principle
Consider a set $S$ with $|S| = N$, and conditions $c_1, c_2, \ldots, c_t$ satisfied by some of the
elements of $S$. In Section 8.1 we saw how the Principle of Inclusion and Exclusion provides
a way to determine $N(\overline{c}_1\overline{c}_2\cdots\overline{c}_t)$, the number of elements in $S$ that satisfy none of the $t$
conditions. If $m \in \mathbf{Z}^+$ and $1 \le m \le t$, we now want to determine $E_m$, which denotes the

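number of elements in $S$ that satisfy exactly $m$ of the $t$ conditions. (At present we can obtain
$E_0$.)
   We can write formulas such as

$$E_1 = N(c_1\overline{c}_2\overline{c}_3\cdots\overline{c}_t) + N(\overline{c}_1c_2\overline{c}_3\cdots\overline{c}_t) + \cdots + N(\overline{c}_1\overline{c}_2\overline{c}_3\cdots\overline{c}_{t-1}c_t),$$

and

$$E_2 = N(c_1c_2\overline{c}_3\cdots\overline{c}_t) + N(c_1\overline{c}_2c_3\cdots\overline{c}_t) + \cdots + N(\overline{c}_1\overline{c}_2\overline{c}_3\cdots\overline{c}_{t-2}c_{t-1}c_t),$$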

and although these results do not assist us as much as we should like, they will be a useful
                        starting place as we examine the Venn diagrams for the cases where t = 3 and 4.
                            For Fig. 8.4, where t = 3, we place a numbered condition beside the circle representing
                        those elements of S satisfying that particular condition and we also number each of the
individual regions shown. Then $E_1$ equals the number of elements in regions 2, 3, and 4.
But we can also write

$$E_1 = N(c_1) + N(c_2) + N(c_3) - 2[N(c_1c_2) + N(c_1c_3) + N(c_2c_3)] + 3N(c_1c_2c_3).$$

In $N(c_1) + N(c_2) + N(c_3)$ we count the elements in regions 5, 6, and 7 twice and those in
region 8 three times. In the next term, the elements in regions 5, 6, and 7 are deleted twice.
We remove the elements in region 8 six times in $2[N(c_1c_2) + N(c_1c_3) + N(c_2c_3)]$, so we
then add on the term $3N(c_1c_2c_3)$ and end up not counting the elements in region 8 at all.
Hence we have $E_1 = S_1 - 2S_2 + 3S_3 = S_1 - \binom{2}{1}S_2 + \binom{3}{2}S_3$.

Figure 8.4

When we turn to $E_2$, our earlier formula indicates that we want to count the elements of
$S$ in regions 5, 6, and 7. From the Venn diagram,

$$E_2 = N(c_1c_2) + N(c_1c_3) + N(c_2c_3) - 3N(c_1c_2c_3) = S_2 - 3S_3 = S_2 - \binom{3}{1}S_3,$$

and

$$E_3 = N(c_1c_2c_3) = S_3.$$
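
To make the three formulas for $t = 3$ concrete, here is a small Python sketch on a set of our own choosing (it does not appear in the text): $S = \{1, 2, \ldots, 30\}$ with $c_1$, $c_2$, $c_3$ meaning divisibility by 2, 3, and 5. The helper `s_values` and the function `exactly` are hypothetical names.

```python
from itertools import combinations

def s_values(universe, conditions):
    """S_k: sum, over all k-subsets of the conditions, of the number of
    elements of the universe that satisfy every condition in the subset."""
    t = len(conditions)
    return [sum(len(set(universe).intersection(*chosen))
                for chosen in combinations(conditions, k))
            for k in range(t + 1)]

universe = range(1, 31)
conditions = [
    {x for x in universe if x % 2 == 0},  # c1: divisible by 2
    {x for x in universe if x % 3 == 0},  # c2: divisible by 3
    {x for x in universe if x % 5 == 0},  # c3: divisible by 5
]

S = s_values(universe, conditions)
E1 = S[1] - 2 * S[2] + 3 * S[3]
E2 = S[2] - 3 * S[3]
E3 = S[3]

def exactly(m):
    """Direct count of the elements satisfying exactly m of the conditions."""
    return sum(1 for x in universe if sum(x in c for c in conditions) == m)

print(S)                                   # [30, 31, 10, 1]
print(E1, E2, E3)                          # 14 7 1
print(exactly(1), exactly(2), exactly(3))  # 14 7 1
```

The direct counts agree with the values obtained from $S_1$, $S_2$, and $S_3$, as the Venn diagram argument predicts.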
In Fig. 8.5, the conditions $c_1$, $c_2$, $c_3$ are associated with circular subsets of $S$, whereas $c_4$ is
paired with the rather irregularly shaped area made up of regions 4, 8, 9, 11, 12, 13, 14, and
16. For each $1 \le i \le 4$, $E_i$ is determined as follows:
$E_1$ [regions 2, 3, 4, 5]:

$$\begin{aligned}
E_1 &= [N(c_1) + N(c_2) + N(c_3) + N(c_4)] \\
&\quad - 2[N(c_1c_2) + N(c_1c_3) + N(c_1c_4) + N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] \\
&\quad + 3[N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4) + N(c_2c_3c_4)] \\
&\quad - 4N(c_1c_2c_3c_4) \\
&= S_1 - 2S_2 + 3S_3 - 4S_4 = S_1 - \binom{2}{1}S_2 + \binom{3}{2}S_3 - \binom{4}{3}S_4.
\end{aligned}$$

Note: Taking an element in region 3, we find that it is counted once in $E_1$ and once in $S_1$
[in $N(c_3)$]. Taking an element in region 6, we find that it is not counted in $E_1$; it is counted
twice in $S_1$ [in both $N(c_2)$ and $N(c_3)$] but removed twice in $2S_2$ [for it is counted once in $S_2$
in $N(c_2c_3)$], so overall it is not counted. The reader should now consider an element from
region 12 and one from region 16 and show that each contributes a count of 0 to both sides
of the formula for $E_1$.

Figure 8.5 (t = 4)

$E_2$ [regions 6-11]:
   From Fig. 8.5, $E_2 = S_2 - 3S_3 + 6S_4 = S_2 - \binom{3}{1}S_3 + \binom{4}{2}S_4$. For details on this formula
we examine the results in Table 8.1, where next to each summand of $S_2$, $S_3$, and $S_4$ we
list the regions whose elements are counted in determining that particular summand. In
calculating $S_2 - 3S_3 + 6S_4$ we find the elements in regions 6-11, which are precisely those
that are to be counted for $E_2$.

Table 8.1

  S_2                          S_3                       S_4
  N(c1 c2): 7, 13, 15, 16      N(c1 c2 c3): 15, 16       N(c1 c2 c3 c4): 16
  N(c1 c3): 10, 14, 15, 16     N(c1 c2 c4): 13, 16
  N(c1 c4): 11, 13, 14, 16     N(c1 c3 c4): 14, 16
  N(c2 c3): 6, 12, 15, 16      N(c2 c3 c4): 12, 16
  N(c2 c4): 8, 12, 13, 16
  N(c3 c4): 9, 12, 14, 16

Finally, the elements for $E_3$ are found in regions 12-15, and $E_3 = S_3 - 4S_4 = S_3 - \binom{4}{1}S_4$;
the elements for $E_4$ are those in region 16, and $E_4 = S_4$.
   These results suggest the following theorem.

THEOREM 8.2    Under the hypotheses of Theorem 8.1, for each $1 \le m \le t$, the number of elements in $S$ that
satisfy exactly $m$ of the conditions $c_1, c_2, \ldots, c_t$ is given by

$$E_m = S_m - \binom{m+1}{1}S_{m+1} + \binom{m+2}{2}S_{m+2} - \cdots + (-1)^{t-m}\binom{t}{t-m}S_t. \qquad (1)$$

(If $m = 0$, we obtain Theorem 8.1.)
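Proof: Arguing as in Theorem 8.1, let $x \in S$ and consider the following three cases.
   a) When $x$ satisfies fewer than $m$ conditions, it contributes a count of 0 to each of the
      terms $E_m, S_m, S_{m+1}, \ldots, S_t$, so it is not counted on either side of the equation.
   b) When $x$ satisfies exactly $m$ of the conditions, it is counted once in $E_m$ and once in $S_m$,
      but not in $S_{m+1}, \ldots, S_t$. Consequently, it is included once in the count for either side
      of the equation.
   c) Suppose $x$ satisfies $r$ of the conditions, where $m < r \le t$. Then $x$ contributes nothing
      to $E_m$. Yet it is counted $\binom{r}{m}$ times in $S_m$, $\binom{r}{m+1}$ times in $S_{m+1}$, ..., and $\binom{r}{r}$ times in
      $S_r$, but 0 times for any term beyond $S_r$. So on the right-hand side of the equation, $x$ is
      counted $\binom{r}{m} - \binom{m+1}{1}\binom{r}{m+1} + \binom{m+2}{2}\binom{r}{m+2} - \cdots + (-1)^{r-m}\binom{r}{r-m}\binom{r}{r}$ times.
      For $0 \le k \le r - m$,

$$\begin{aligned}
\binom{m+k}{k}\binom{r}{m+k} &= \frac{(m+k)!}{k!\,m!}\cdot\frac{r!}{(m+k)!\,(r-m-k)!} \\
&= \frac{r!}{m!}\cdot\frac{1}{k!\,(r-m-k)!} = \frac{r!}{m!\,(r-m)!}\cdot\frac{(r-m)!}{k!\,(r-m-k)!} \\
&= \binom{r}{m}\binom{r-m}{k}.
\end{aligned}$$

Consequently, on the right-hand side of Eq. (1), $x$ is counted

$$\begin{aligned}
&\binom{r}{m}\binom{r-m}{0} - \binom{r}{m}\binom{r-m}{1} + \binom{r}{m}\binom{r-m}{2} - \cdots + (-1)^{r-m}\binom{r}{m}\binom{r-m}{r-m} \\
&\quad = \binom{r}{m}\left[\binom{r-m}{0} - \binom{r-m}{1} + \binom{r-m}{2} - \cdots + (-1)^{r-m}\binom{r-m}{r-m}\right] \\
&\quad = \binom{r}{m}[1 - 1]^{r-m} = \binom{r}{m}\cdot 0 = 0 \text{ times},
\end{aligned}$$

and the formula is verified.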
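
The three cases of the proof can also be checked numerically. In the Python sketch below (the function name `rhs_count` is ours), an element satisfying exactly $r$ conditions is counted $\binom{r}{j}$ times in $S_j$, so its total contribution to the right-hand side of Eq. (1) is an alternating sum that should equal 1 when $r = m$ and 0 otherwise.

```python
from math import comb

def rhs_count(r, m, t):
    """Number of times an element satisfying exactly r of the t conditions
    is counted on the right-hand side of Eq. (1)."""
    # The coefficient of S_j in Eq. (1) is (-1)^(j-m) * C(j, j-m);
    # the element appears C(r, j) times in S_j (0 times when j > r).
    return sum((-1) ** (j - m) * comb(j, j - m) * comb(r, j)
               for j in range(m, t + 1))

t = 6
for m in range(1, t + 1):
    print(m, [rhs_count(r, m, t) for r in range(t + 1)])
```

Each printed row has a single 1, in position $r = m$, and zeros elsewhere, which is exactly what cases (a), (b), and (c) assert.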

Based on this result, if $L_m$ denotes the number of elements of $S$ (under the hypotheses of
Theorem 8.1) that satisfy at least $m$ of the $t$ conditions, then we have the following formula.

COROLLARY 8.2    $L_m = S_m - \binom{m}{m-1}S_{m+1} + \binom{m+1}{m-1}S_{m+2} - \cdots + (-1)^{t-m}\binom{t-1}{m-1}S_t.$

Proof: A proof is outlined in the exercises at the end of this section.

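When $m = 1$, the result in Corollary 8.2 becomes

$$\begin{aligned}
L_1 &= S_1 - \binom{1}{0}S_2 + \binom{2}{0}S_3 - \cdots + (-1)^{t-1}\binom{t-1}{0}S_t \\
&= S_1 - S_2 + S_3 - \cdots + (-1)^{t-1}S_t.
\end{aligned}$$

Comparing this with the result in Theorem 8.1, we find that

$$L_1 = N - \overline{N} = |S| - \overline{N}.$$

This result is not much of a surprise, because an element $x$ of $S$ is counted in $L_1$ if it satisfies
at least one of the conditions $c_1, c_2, c_3, \ldots, c_t$, that is, if $x \in S$ and $x$ is not counted in
$\overline{N} = N(\overline{c}_1\overline{c}_2\overline{c}_3\cdots\overline{c}_t)$.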

EXAMPLE 8.11
Looking back to Example 8.10, we shall find the numbers of systems of two-way roads so that exactly ($E_2$) and at least ($L_2$) two of the villages remain isolated.
The previously calculated results for this example show

$$E_2 = S_2 - \binom{3}{1}S_3 + \binom{4}{2}S_4 - \binom{5}{3}S_5 = 80 - 3(20) + 6(5) - 10(1) = 40,$$
$$L_2 = S_2 - \binom{2}{1}S_3 + \binom{3}{1}S_4 - \binom{4}{1}S_5 = 80 - 2(20) + 3(5) - 4(1) = 51.$$
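The arithmetic in Example 8.11 can be checked mechanically. The following is a minimal sketch, not part of the text: the function names `exactly` and `at_least` are our own, and the dictionary `S` holds only the values $S_2$ through $S_5$ quoted in the example (t = 5 conditions).

```python
from math import comb

def exactly(m, S, t):
    """E_m: number of elements satisfying exactly m of the t conditions."""
    return sum((-1) ** k * comb(m + k, k) * S[m + k] for k in range(t - m + 1))

def at_least(m, S, t):
    """L_m (Corollary 8.2, m >= 1): elements satisfying at least m conditions."""
    return sum((-1) ** k * comb(m + k - 1, m - 1) * S[m + k]
               for k in range(t - m + 1))

# S_2 .. S_5 as quoted in Example 8.11 (hypothetical container; keys are indices).
S = {2: 80, 3: 20, 4: 5, 5: 1}
print(exactly(2, S, 5))   # 40
print(at_least(2, S, 5))  # 51
```

Only the $S_j$ with $j \ge m$ enter either sum, which is why $S_0$ and $S_1$ are not needed for m = 2.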

EXERCISES 8.2

1. For the situation in Examples 8.10 and 8.11 compute $E_i$ for $0 \le i \le 5$ and show that $\sum_{i=0}^{5} E_i = N = |S|$.

2. a) In how many ways can the letters in ARRANGEMENT be arranged so that there are exactly two pairs of consecutive identical letters? at least two pairs of consecutive identical letters?
   b) Answer part (a), replacing two with three.

3. In how many ways can one arrange the letters in CORRESPONDENTS so that (a) there is no pair of consecutive identical letters? (b) there are exactly two pairs of consecutive identical letters? (c) there are at least three pairs of consecutive identical letters?

4. Let A = {1, 2, 3, ..., 10}, and B = {1, 2, 3, ..., 7}. How many functions $f: A \to B$ satisfy $|f(A)| = 4$? How many have $|f(A)| \le 4$?

5. In how many ways can one distribute ten distinct prizes among four students with exactly two students getting nothing? How many ways have at least two students getting nothing?

6. Zelma is having a luncheon for herself and nine of the women in her tennis league. On the morning of the luncheon she places name cards at the ten places at her table and then leaves to run a last-minute errand. Her husband, Herbert, comes home from his morning tennis match and unfortunately leaves the back door open. A gust of wind scatters the ten name cards. In how many ways can Herbert replace the ten cards at the places at the table so that exactly four of the ten women will be seated where Zelma had wanted them? In how many ways will at least four of them be seated where they were supposed to be?

7. If 13 cards are dealt from a standard deck of 52, what is the probability that these 13 cards include (a) at least one card from each suit? (b) exactly one void (for example, no clubs)? (c) exactly two voids?

8. The following provides an outline for proving Corollary 8.2. Fill in the needed details.
   a) First note that $E_t = L_t = S_t$.
   b) What is $E_{t-1}$, and how are $L_t$ and $L_{t-1}$ related?
   c) Show that $L_{t-1} = S_{t-1} - \binom{t-1}{t-2}S_t$.
   d) For all $1 \le m \le t - 1$, how are $L_m$, $L_{m+1}$, and $E_m$ related?
   e) Using the results in steps (a) through (d), establish the corollary by a backward type of induction.

8.3
Derangements: Nothing Is in Its Right Place
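In elementary calculus the Maclaurin series for the exponential function is given by

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty}\frac{x^n}{n!},$$

so

$$e^{-1} = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}.$$

To five places, $e^{-1} = 0.36788$ and $1 - 1 + (1/2!) - (1/3!) + \cdots - (1/7!) = 0.36786$. Consequently, for all $k \in \mathbf{Z}^+$, if $k \ge 7$, then $\sum_{n=0}^{k}((-1)^n)/n!$ is a very good approximation to $e^{-1}$.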
                                   We find these ideas helpful in working some of the following examples.
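As a quick numerical check of the two five-place values quoted above, one might compute the partial sums directly. This is only an illustrative sketch; the helper name `partial_sum` is our own.

```python
from math import exp, factorial

def partial_sum(k):
    """Sum of (-1)^n / n! for n = 0, 1, ..., k."""
    return sum((-1) ** n / factorial(n) for n in range(k + 1))

for k in (7, 10):
    # Compare the k-th partial sum with e^(-1), both rounded to five places.
    print(k, round(partial_sum(k), 5), round(exp(-1), 5))
```

Because the series alternates with decreasing terms, the error after the term in $1/k!$ is smaller than $1/(k+1)!$, which explains why the agreement is already close at k = 7.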

EXAMPLE 8.12
While at the racetrack, Ralph bets on each of the ten horses in a race to come in according to how they are favored. In how many ways can they reach the finish line so that he loses all of his bets?
Removing the words horses and racetrack from the problem, we really want to know in how many ways we can arrange the numbers 1, 2, 3, ..., 10 so that 1 is not in first place (its natural position), 2 is not in second place (its natural position), ..., and 10 is not in tenth place (its natural position). These arrangements are called the derangements of 1, 2, 3, ..., 10.
The Principle of Inclusion and Exclusion provides the key to calculating the number of derangements. For each $1 \le i \le 10$, an arrangement of 1, 2, 3, ..., 10 is said to satisfy condition $c_i$ if integer i is in the ith place. We obtain the number of derangements, denoted by $d_{10}$, as follows:

$$d_{10} = N(\bar{c}_1\bar{c}_2\bar{c}_3\cdots\bar{c}_{10}) = 10! - \binom{10}{1}9! + \binom{10}{2}8! - \binom{10}{3}7! + \cdots + \binom{10}{10}0!$$
$$= 10!\left[1 - \binom{10}{1}\frac{9!}{10!} + \binom{10}{2}\frac{8!}{10!} - \binom{10}{3}\frac{7!}{10!} + \cdots + \binom{10}{10}\frac{0!}{10!}\right]$$
$$= 10!\left[1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{1}{10!}\right] \approx (10!)(e^{-1}).$$

The sample space here consists of the 10! ways the horses can finish. So the probability that Ralph will lose every bet is approximately $(10!)(e^{-1})/(10!) = e^{-1}$. This probability remains (more or less) the same if the number of horses in the race is 11, 12, .... On the other hand, for n horses, where $n \ge 10$, the probability that our gambler wins at least one of his bets is approximately $1 - e^{-1} \approx 0.63212$.
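The inclusion-exclusion sum for $d_{10}$ is easy to evaluate exactly and to compare with the approximation $(10!)(e^{-1})$. The sketch below assumes nothing beyond the formula in this example; the function name `derangements` is our own.

```python
from math import comb, exp, factorial

def derangements(n):
    """d_n = sum over k of (-1)^k C(n, k) (n - k)!, by inclusion-exclusion."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))

d10 = derangements(10)
print(d10)                                 # exact number of derangements
print(round(factorial(10) * exp(-1)))      # nearest integer to 10!/e
print(round(1 - d10 / factorial(10), 5))   # probability of winning at least one bet
```

For n = 10 the exact count and the rounded value of $10!/e$ coincide, which is the sense in which the approximation in the example is "very good."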

EXAMPLE 8.13
The number of derangements of 1, 2, 3, 4 is

$$d_4 = 4!\left[1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!}\right] = 4!\left[\frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!}\right] = (4)(3) - 4 + 1 = 9.$$

These nine derangements are

2143    3142    4123
2341    3412    4312
2413    3421    4321.

Among the 24 − 9 = 15 permutations of 1, 2, 3, 4 that are not derangements one finds 1234, 2314, 3241, 1342, and 2431.
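For a set this small the derangements can also be listed by brute force, which gives an independent check on the count of nine. The following sketch simply filters all 24 permutations; the name `list_derangements` is ours.

```python
from itertools import permutations

def list_derangements(n):
    """All arrangements of 1..n with no integer in its natural position."""
    return [p for p in permutations(range(1, n + 1))
            if all(value != place for place, value in enumerate(p, start=1))]

found = ["".join(map(str, p)) for p in list_derangements(4)]
print(len(found), found)
```

Enumeration of this kind grows like n!, so it is useful only as a check for small n; the inclusion-exclusion formula is what scales.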

EXAMPLE 8.14
Peggy has seven books to review for the C-H Company, so she hires seven people to review them. She wants two reviews per book, so the first week she gives each person one book to read and then redistributes the books at the start of the second week. In how many ways can she make these two distributions so that she gets two reviews (by different people) of each book?
She can distribute the books in 7! ways the first week. Numbering both the books and the reviewers (for the first week) as 1, 2, ..., 7, for the second distribution she must arrange these numbers so that none of them is in its natural position. This she can do in $d_7$ ways. By the rule of product, she can make the two distributions in $(7!)d_7 \approx (7!)^2(e^{-1})$ ways.
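To see how close the approximation in Example 8.14 is, one can compute $(7!)d_7$ exactly. This is a small sketch under the same inclusion-exclusion formula used for $d_{10}$; the variable names are our own.

```python
from math import exp, factorial

def derangements(n):
    """d_n = n! * sum of (-1)^k / k!, kept in exact integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

first_week = factorial(7)          # ways to hand out the books in week one
second_week = derangements(7)      # redistributions with no repeated reviewer
exact = first_week * second_week
approx = factorial(7) ** 2 * exp(-1)
print(second_week, exact, round(approx))
```

The exact product and the estimate differ by a few hundred out of roughly nine million, a relative error well under one part in a thousand.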

EXERCISES 8.3

1. In how many ways can the integers 1, 2, 3, ..., 10 be arranged in a line so that no even integer is in its natural position?

2. a) List all the derangements of 1, 2, 3, 4, 5 where the first three numbers are 1, 2, and 3, in some order.
   b) List all the derangements of 1, 2, 3, 4, 5, 6 where the first three numbers are 1, 2, and 3, in some order.

3. How many derangements are there for 1, 2, 3, 4, 5?

4. How many permutations of 1, 2, 3, 4, 5, 6, 7 are not derangements?

5. a) Let A = {1, 2, 3, ..., 7}. A function $f: A \to A$ is said to have a fixed point if for some $x \in A$, $f(x) = x$. How many one-to-one functions $f: A \to A$ have at least one fixed point?
   b) In how many ways can we devise a secret code by assigning to each letter of the alphabet a different letter to represent it?

6. How many derangements of 1, 2, 3, 4, 5, 6, 7, 8 start with (a) 1, 2, 3, and 4, in some order? (b) 5, 6, 7, and 8, in some order?

7. For the positive integers 1, 2, 3, ..., n − 1, n, there are 11,660 derangements where 1, 2, 3, 4, and 5 appear in the first five positions. What is the value of n?

8. Four applicants for a job are to be interviewed for 30 minutes each: 15 minutes with each of supervisors Nancy and Yolanda. (The interviews are in separate rooms, and interviewing starts at 9:00 A.M.) (a) In how many ways can these interviews be scheduled during a one-hour period? (b) One applicant, named Josephine, arrives at 9:00 A.M. What is the probability that she will have her two interviews one after the other? (c) Regina, another applicant, arrives at 9:00 A.M. and hopes to be finished in time to leave by 9:50 A.M. for another appointment. What is the probability that Regina will be able to leave on time?

9. In how many ways can Mrs. Ford distribute ten distinct books to her ten children (one book to each child) and then collect and redistribute the books so that each child has the opportunity to peruse two different books?

10. a) When n balls, numbered 1, 2, 3, ..., n are taken in succession from a container, a rencontre occurs if the mth ball withdrawn is numbered m, for some $1 \le m \le n$. Find the probability of getting (i) no rencontres; (ii) (exactly) one rencontre; (iii) at least one rencontre; and (iv) r rencontres, where $1 \le r \le n$.
    b) Approximate the answers to the questions in part (a).

11. Ten women attend a business luncheon. Each woman checks her coat and attaché case. Upon leaving, each woman is given a coat and case at random. (a) In how many ways can the coats and cases be distributed so that no woman gets either of her possessions? (b) In how many ways can they be distributed so that no woman gets back both of her possessions?

12. Ms. Pezzulo teaches geometry and then biology to a class of 12 advanced students in a classroom that has only 12 desks. In how many ways can she assign the students to these desks so that (a) no student is seated at the same desk for both classes? (b) there are exactly six students each of whom occupies the same desk for both classes?

13. Give a combinatorial argument to verify that for all $n \in \mathbf{Z}^+$,

$$n! = \binom{n}{0}d_0 + \binom{n}{1}d_1 + \binom{n}{2}d_2 + \cdots + \binom{n}{n}d_n = \sum_{k=0}^{n}\binom{n}{k}d_k.$$

(For each $1 \le k \le n$, $d_k$ = the number of derangements of 1, 2, 3, ..., k; $d_0 = 1$.)

14. a) In how many ways can the integers 1, 2, 3, ..., n be arranged in a line so that none of the patterns 12, 23, 34, ..., (n − 1)n occurs?
    b) Show that the result in part (a) equals $d_{n-1} + d_n$. ($d_n$ = the number of derangements of 1, 2, 3, ..., n.)

15. Answer part (a) of Exercise 14 if the numbers are arranged in a circle, and, as we count clockwise about the circle, none of the patterns 12, 23, 34, ..., (n − 1)n, n1 occurs.

16. What is the probability that the gambler in Example 8.12 wins (a) (exactly) five of his bets? (b) at least five of his bets?

8.4
Rook Polynomials
Consider the six-square “chessboard” shown in Fig. 8.6. (Note: The shaded squares are not part of the chessboard.) In chess a piece called a rook or castle is allowed at one turn to be moved horizontally or vertically over as many unoccupied spaces as one wishes. Here a rook in square 3 of the figure could be moved in one turn to squares 1, 2, or 4. A rook at square 5 could be moved to square 6 or square 2 (even though there is no square between squares 5 and 2).

Figure 8.6

For $k \in \mathbf{Z}^+$ we want to determine the number of ways in which k rooks can be placed on the unshaded squares of this chessboard so that no two of them can take each other; that is, no two of them are in the same row or column of the chessboard. This number is denoted by $r_k$, or by $r_k(C)$ if we wish to stress that we are working on a particular chessboard C.
For any chessboard, $r_1$ is the number of squares on the board. Here $r_1 = 6$. Two nontaking rooks can be placed at the following pairs of positions: {1, 4}, {1, 5}, {2, 4}, {2, 6}, {3, 5}, {3, 6}, {4, 5}, and {4, 6}, so $r_2 = 8$. Continuing, we find that $r_3 = 2$, using the locations {1, 4, 5} and {2, 4, 6}; $r_k = 0$, for $k \ge 4$.
With $r_0 = 1$, the rook polynomial, r(C, x), for the chessboard in Fig. 8.6 is defined as $r(C, x) = 1 + 6x + 8x^2 + 2x^3$. For each $k \ge 0$, the coefficient of $x^k$ is the number of ways we can place k nontaking rooks on chessboard C.
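The case-by-case count of $r_1$, $r_2$, $r_3$ can be reproduced by brute force. The sketch below is an illustration only: the (row, column) coordinates assigned to squares 1 through 6 are an assumed layout, chosen so that the taking relationships match those described in the text (for example, square 3 shares a line with 1, 2, and 4); the actual drawing in Fig. 8.6 may be oriented differently. The function name `rook_numbers` is ours.

```python
from itertools import combinations

def rook_numbers(cells):
    """Return [r_0, r_1, ...] for a board given as (row, col) cells."""
    cells = list(cells)
    counts = [1]  # r_0 = 1 by convention
    for k in range(1, len(cells) + 1):
        n = sum(1 for combo in combinations(cells, k)
                if len({r for r, _ in combo}) == k      # all rows distinct
                and len({c for _, c in combo}) == k)    # all columns distinct
        if n == 0:   # once r_k = 0, every later r_j is 0 as well
            break
        counts.append(n)
    return counts

# Assumed coordinates for squares 1..6 (hypothetical layout, see lead-in).
fig_8_6 = {1: (0, 0), 2: (0, 1), 3: (0, 2), 4: (1, 2), 5: (2, 1), 6: (2, 0)}
print(rook_numbers(fig_8_6.values()))  # coefficients of 1 + 6x + 8x^2 + 2x^3
```

The early exit is safe because removing a rook from a nontaking placement leaves a nontaking placement, so $r_k = 0$ forces $r_{k+1} = 0$.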
What we have done here (using a case-by-case analysis) soon proves tedious. As the size of the board increases, we have to consider cases wherein numbers such as $r_4$ and $r_5$ are nonzero. Consequently, we shall now make some observations that will allow us to make use of small boards and somehow break up a large board into smaller subboards.
The chessboard C in Fig. 8.7 is made up of 11 unshaded squares. We note that C consists of a 2 × 2 subboard $C_1$ located in the upper left corner and a seven-square subboard $C_2$ located in the lower right corner. These subboards are disjoint because they have no squares in the same row or column of C.

Figure 8.7

Calculating as we did for our first chessboard, here we find

$$r(C_1, x) = 1 + 4x + 2x^2, \qquad r(C_2, x) = 1 + 7x + 10x^2 + 2x^3,$$
$$r(C, x) = 1 + 11x + 40x^2 + 56x^3 + 28x^4 + 4x^5 = r(C_1, x)\cdot r(C_2, x).$$

Hence $r(C, x) = r(C_1, x)\cdot r(C_2, x)$. But did this occur by luck or is something happening here that we should examine more closely? For example, to obtain $r_3$ for C, we need to know in how many ways three nontaking rooks can be placed on board C. These fall into three cases:

a) All three rooks are on subboard $C_2$ (and none is on $C_1$): (2)(1) = 2 ways.
b) Two rooks are on subboard $C_2$ and one is on $C_1$: (10)(4) = 40 ways.
c) One rook is on subboard $C_2$ and two are on $C_1$: (7)(2) = 14 ways.

Consequently, three nontaking rooks can be placed on board C in (2)(1) + (10)(4) + (7)(2) = 56 ways. Here we see that 56 arises just as the coefficient of $x^3$ does in the product $r(C_1, x)\cdot r(C_2, x)$.

In general, if C is a chessboard made up of pairwise disjoint subboards $C_1, C_2, \ldots, C_n$, then $r(C, x) = r(C_1, x)\,r(C_2, x)\cdots r(C_n, x)$.
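The product rule amounts to multiplying coefficient lists. As a minimal sketch, with the two subboard polynomials for Fig. 8.7 taken as given from the text and the helper name `poly_mul` our own, the product and the three-case count for $x^3$ can be checked as follows.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

r_c1 = [1, 4, 2]        # r(C_1, x) = 1 + 4x + 2x^2
r_c2 = [1, 7, 10, 2]    # r(C_2, x) = 1 + 7x + 10x^2 + 2x^3
r_c = poly_mul(r_c1, r_c2)
print(r_c)

# Coefficient of x^3 as the three cases (a), (b), (c) of the discussion above.
cases = r_c2[3] * r_c1[0] + r_c2[2] * r_c1[1] + r_c2[1] * r_c1[2]
print(cases)
```

Each term of the convolution corresponds to one way of splitting the k rooks between the two disjoint subboards, which is exactly the case analysis carried out for $r_3$.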
    The last result for this section demonstrates the type of principle we have seen in other
results in combinatorial and discrete mathematics: Given a large chessboard, break it into
smaller subboards whose rook polynomials can be determined by inspection.

Figure 8.8: panels (a), (b), and (c)

Consider chessboard C in Fig. 8.8(a). For $k \ge 1$, suppose we wish to place k nontaking rooks on C. For each square of C, such as the one designated by (*), there are two possibilities to examine.

a) Place one rook on the designated square. Then we remove, as possible locations for the other k − 1 rooks, all other squares of C in the same row or column as the designated square. We use $C_s$ to denote the remaining smaller subboard [seen in Fig. 8.8(b)].
b) We do not use the designated square at all. The k rooks are placed on the subboard $C_e$ [C with the one designated square eliminated, as shown in Fig. 8.8(c)].

Since these two cases are all-inclusive and mutually disjoint,

$$r_k(C) = r_{k-1}(C_s) + r_k(C_e).$$

From this we see that

$$r_k(C)x^k = r_{k-1}(C_s)x^k + r_k(C_e)x^k. \tag{1}$$

If n is the number of squares in the chessboard (here n is 8), then Eq. (1) is valid for all $1 \le k \le n$, and we write

$$\sum_{k=1}^{n} r_k(C)x^k = \sum_{k=1}^{n} r_{k-1}(C_s)x^k + \sum_{k=1}^{n} r_k(C_e)x^k. \tag{2}$$

For Eq. (2) we realize that the summations may stop before k = n. We have seen cases, as in Fig. 8.6, where $r_n$ and some prior $r_k$'s are 0. The summations start at k = 1, for otherwise we could find ourselves with the term $r_{-1}(C_s)x^0$ in the first summand on the right-hand side of Eq. (2).

Equation (2) may be rewritten as

$$\sum_{k=1}^{n} r_k(C)x^k = x\sum_{k=1}^{n} r_{k-1}(C_s)x^{k-1} + \sum_{k=1}^{n} r_k(C_e)x^k \tag{3}$$

or

$$1 + \sum_{k=1}^{n} r_k(C)x^k = x\cdot r(C_s, x) + \sum_{k=1}^{n} r_k(C_e)x^k + 1,$$

from which it follows that

$$r(C, x) = x\cdot r(C_s, x) + r(C_e, x). \tag{4}$$

We now use this final equation to determine the rook polynomial for the chessboard shown in part (a) of Fig. 8.8. Each time the idea in Eq. (4) is used, we mark the special square we are using with (*). Parentheses are placed about each chessboard to denote the rook polynomial of the board.

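(Chessboard diagrams for the successive applications of Eq. (4) are not reproduced here.)

$$r(C, x) = x^2(1 + 2x) + 2x(1 + 4x + 2x^2) + x(1 + 3x + x^2) + (\text{one remaining board})$$
$$= 3x + 12x^2 + 7x^3 + x(1 + 2x) + (1 + 4x + 2x^2) = 1 + 8x + 16x^2 + 7x^3.$$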
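Equation (4) translates directly into a recursive procedure. The following sketch is our own illustration, not the text's method of marking squares by hand: it designates an arbitrary square, builds $C_s$ and $C_e$ as sets of (row, column) cells, and adds $x\cdot r(C_s, x)$ to $r(C_e, x)$. It is tested on a full 2 × 2 board, whose polynomial $1 + 4x + 2x^2$ is given in the text, and on the same assumed coordinate layout for the six-square board of Fig. 8.6 (a hypothetical placement consistent with the taking relationships described there).

```python
def rook_poly(cells):
    """Coefficients of r(C, x), constant term first, via r(C) = x r(C_s) + r(C_e)."""
    cells = frozenset(cells)
    if not cells:
        return [1]                       # empty board: only r_0 = 1
    square = min(cells)                  # any square may be the designated one
    row, col = square
    c_e = cells - {square}               # designated square eliminated
    c_s = frozenset((r, c) for (r, c) in cells if r != row and c != col)
    shifted = [0] + rook_poly(c_s)       # multiply r(C_s, x) by x
    rest = rook_poly(c_e)
    size = max(len(shifted), len(rest))
    shifted += [0] * (size - len(shifted))
    rest += [0] * (size - len(rest))
    return [a + b for a, b in zip(shifted, rest)]

two_by_two = [(0, 0), (0, 1), (1, 0), (1, 1)]
six_square = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 1), (2, 0)]  # assumed layout
print(rook_poly(two_by_two))
print(rook_poly(six_square))
```

The recursion branches twice per square, so it is practical only for small boards; combining it with the product rule for disjoint subboards keeps the pieces small.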
8.5
Arrangements with Forbidden Positions
                              The rook polynomials of the previous section seem interesting on their own. Now we shall
                              find them useful in solving the following problems.

EXAMPLE 8.15
In making seating arrangements for their son’s wedding reception, Grace and Nick are down to four relatives, denoted $R_i$, for $1 \le i \le 4$, who do not get along with one another. There is a single open seat at each of the five tables $T_j$, where $1 \le j \le 5$. Because of family differences,
a) $R_1$ will not sit at $T_1$ or $T_2$.        b) $R_2$ will not sit at $T_2$.
c) $R_3$ will not sit at $T_3$ or $T_4$.        d) $R_4$ will not sit at $T_4$ or $T_5$.

This situation is represented in Fig. 8.9. The number of ways we can seat these four
people at four different tables, and satisfy conditions (a) through (d), is the number of ways
four nontaking rooks can be placed on the chessboard made up of the unshaded squares.
However, since there are only seven shaded squares, as opposed to thirteen unshaded ones,
it would be easier to work with the shaded chessboard.

Figure 8.9: chessboard with columns $T_1$ through $T_5$; the shaded squares are the forbidden positions in (a) through (d).

We start with the conditions that are required for us to apply the Principle of Inclusion and Exclusion: For each $1 \le i \le 4$, let $c_i$ be the condition where a seating assignment of these four people (at different tables) is made with relative $R_i$ in a forbidden (shaded) position. As usual, |S| denotes the total number of ways we can place the four relatives, one to a table. Then $|S| = N = S_0 = 5!$.
To determine $S_1$ we consider each of the following:

• $N(c_1) = 4! + 4!$, for there are 4! ways to seat $R_2$, $R_3$, and $R_4$ if $R_1$ is in forbidden position $T_1$ and another 4! ways if $R_1$ is at table $T_2$, his or her other forbidden position.
• $N(c_2) = 4!$, for after placing $R_2$ at forbidden table $T_2$, we must place $R_1$, $R_3$, and $R_4$ at $T_1$, $T_3$, $T_4$, and $T_5$, one person to a table.
• $N(c_3) = 4! + 4!$, one summand for $R_3$ being in forbidden position $T_3$, and the other summand for $R_3$ being in the forbidden position $T_4$.
• $N(c_4) = 4! + 4!$, each of the two summands arising when $R_4$ is placed at each of the two forbidden positions $T_4$ and $T_5$.

Hence $S_1 = 7(4!)$.
Turning to $S_2$ we have these considerations:

• $N(c_1c_2) = 3!$, because after we place $R_1$ at $T_1$ and $R_2$ at $T_2$, there are three tables ($T_3$, $T_4$, and $T_5$) where $R_3$ and $R_4$ can be seated.
• $N(c_1c_3) = 3! + 3! + 3! + 3!$, because there are four cases where $R_1$ and $R_3$ are located at forbidden positions:
    i) $R_1$ at $T_1$; $R_3$ at $T_3$        ii) $R_1$ at $T_2$; $R_3$ at $T_3$
    iii) $R_1$ at $T_1$; $R_3$ at $T_4$      iv) $R_1$ at $T_2$; $R_3$ at $T_4$.

In a similar manner we find that $N(c_1c_4) = 4(3!)$, $N(c_2c_3) = 2(3!)$, $N(c_2c_4) = 2(3!)$, and $N(c_3c_4) = 3(3!)$. Consequently, $S_2 = 16(3!)$.
Before continuing, we make a few observations about $S_1$ and $S_2$. For $S_1$ we have $7(4!) = 7(5 - 1)!$, where 7 is the number of shaded squares in Fig. 8.9. Also, $S_2 = 16(3!) = 16(5 - 2)!$, where 16 is the number of ways two nontaking rooks can be placed on the shaded chessboard.
In general, for all $0 \le i \le 4$, $S_i = r_i(5 - i)!$, where $r_i$ is the number of ways in which it is possible to place i nontaking rooks on the shaded chessboard shown in Fig. 8.9.

Consequently, to expedite the solution of this problem, we turn to r(C, x), the rook polynomial of this shaded chessboard. Using the decomposition of C into the disjoint subboards in the upper left and lower right corners, we find that

$$r(C, x) = (1 + 3x + x^2)(1 + 4x + 3x^2) = 1 + 7x + 16x^2 + 13x^3 + 3x^4,$$

so

$$N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = S_0 - S_1 + S_2 - S_3 + S_4 = 5! - 7(4!) + 16(3!) - 13(2!) + 3(1!) = \sum_{i=0}^{4}(-1)^i r_i\,(5 - i)! = 25.$$

Grace and Nick can breathe a sigh of relief. There are 25 ways in which they can seat these last four relatives at the reception and avoid any squabbling.
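The count of 25 can be confirmed two ways: from the rook numbers, and by listing every seating directly. The sketch below is only a check; the dictionary `forbidden` encodes conditions (a) through (d), and the variable names are our own.

```python
from itertools import permutations
from math import factorial

# Rook numbers r_0..r_4 of the shaded board, read from r(C, x).
rook = [1, 7, 16, 13, 3]
by_formula = sum((-1) ** i * r * factorial(5 - i) for i, r in enumerate(rook))

# forbidden[i] = tables that relative R_i refuses, from conditions (a)-(d).
forbidden = {1: {1, 2}, 2: {2}, 3: {3, 4}, 4: {4, 5}}
by_enumeration = sum(
    1 for tables in permutations(range(1, 6), 4)   # R_1..R_4 at distinct tables
    if all(t not in forbidden[r] for r, t in zip(range(1, 5), tables))
)
print(by_formula, by_enumeration)
```

With only 5!/1! = 120 seatings to examine, enumeration is trivial here; the rook-polynomial route is what remains workable as the numbers of people and tables grow.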

The next example demonstrates how a bit of rearranging of our chessboard can help in
                             our calculations.

EXAMPLE 8.16
We have a pair of dice; one is red, the other green. We roll these dice six times. What is the probability that we obtain all six values on both the red die and the green die if we know that the ordered pairs (1, 2), (2, 1), (2, 5), (3, 4), (4, 1), (4, 5), and (6, 6) did not occur? [Here an ordered pair (a, b) indicates a on the red die and b on the green.]
Recognizing this problem as one dealing with permutations and forbidden positions, we construct the chessboard shown in Fig. 8.10(a), where the row labels represent the outcome on the red die, the column labels the outcome on the green die, and the shaded squares constitute the forbidden positions. In this figure the shaded squares are scattered. Relabeling the rows and columns, we can redraw the chessboard as shown in Fig. 8.10(b), where we have taken shaded squares in the same row (or column) of the board shown in part (a) and made them adjacent. In Fig. 8.10(b), the chessboard C (of seven shaded squares) is the union of four pairwise disjoint subboards, and so

$$r(C, x) = (1 + 4x + 2x^2)(1 + x)^3 = 1 + 7x + 17x^2 + 19x^3 + 10x^4 + 2x^5.$$

Figure 8.10: (a) rows 1, 2, 3, 4, 5, 6 and columns 1, 2, 3, 4, 5, 6; (b) rows reordered as 1, 2, 4, 3, 5, 6 and columns reordered as 1, 5, 3, 4, 2, 6.

For each $1 \le i \le 6$, define $c_i$ as the condition where, having rolled the dice six times, we find that all six values occur on both the red die and the green die, but i on the red die is paired with one of the forbidden numbers on the green die. [Note that $N(c_5) = 0$.] Then the number of (ordered) sequences of the six rolls of the dice for the event we are interested in is

$$(6!)\,N(\bar{c}_1\bar{c}_2\cdots\bar{c}_6) = (6!)\sum_{i=0}^{6}(-1)^i S_i = (6!)\sum_{i=0}^{6}(-1)^i r_i\,(6 - i)!$$
$$= 6![6! - 7(5!) + 17(4!) - 19(3!) + 10(2!) - 2(1!) + 0(0!)]$$
$$= 6![192] = 138{,}240.$$

Since the sample space consists of all sequences of six ordered pairs selected with repetition from the 29 unshaded squares of the chessboard, the probability of this event is $138{,}240/(29)^6 \approx 0.00023$.
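The value 192 and the final probability can be checked by matching red values to green values directly. This is a sketch for verification only; `forbidden` lists the seven excluded pairs from the statement of the example, and the other names are ours.

```python
from itertools import permutations
from math import factorial

forbidden = {(1, 2), (2, 1), (2, 5), (3, 4), (4, 1), (4, 5), (6, 6)}
rook = [1, 7, 17, 19, 10, 2, 0]      # r_0..r_6 from r(C, x)

by_formula = sum((-1) ** i * r * factorial(6 - i) for i, r in enumerate(rook))

# Each permutation pairs red value i with the green value in position i.
by_enumeration = sum(
    1 for greens in permutations(range(1, 7))
    if all((red, green) not in forbidden
           for red, green in enumerate(greens, start=1))
)

sequences = factorial(6) * by_formula   # order the six rolls in 6! ways
probability = sequences / 29 ** 6
print(by_formula, by_enumeration, sequences, round(probability, 5))
```

The factor 6! appears because each admissible pairing of red and green values can occur as the six rolls in any order.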

Our last example provides a unifying idea for what we have done in this section.

EXAMPLE 8.17
Let A = {1, 2, 3, 4} and B = {u, v, w, x, y, z}. How many one-to-one functions $f: A \to B$ satisfy none of the following conditions:

$c_1$: f(1) = u or v        $c_2$: f(2) = w        $c_3$: f(3) = w or x        $c_4$: f(4) = x, y, or z

As in our two prior examples, we construct a chessboard, as shown in Fig. 8.11. Here we are really interested in the chessboard C made up of the eight shaded squares (which comprise two disjoint subboards). Now

$$r(C, x) = (1 + 2x)(1 + 6x + 9x^2 + 2x^3) = 1 + 8x + 21x^2 + 20x^3 + 4x^4.$$

So

$$N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = S_0 - S_1 + S_2 - S_3 + S_4$$
$$= (6!/2!) - 8(5!/2!) + 21(4!/2!) - 20(3!/2!) + 4(2!/2!)$$
$$= \sum_{i=0}^{4}(-1)^i r_i\,(6 - i)!/2! = 76$$

and there are 76 one-to-one functions $f: A \to B$ where none of the conditions $c_1$, $c_2$, $c_3$, $c_4$ is satisfied.
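Since there are only 6!/2! = 360 one-to-one functions from A to B, the answer 76 can be confirmed by listing them. In the sketch below the dictionary `forbidden` records the values excluded by $c_1$ through $c_4$; the names are our own.

```python
from itertools import permutations

# forbidden[a] = values f(a) may not take, from conditions c_1..c_4.
forbidden = {1: "uv", 2: "w", 3: "wx", 4: "xyz"}

# Each 4-permutation of B gives the images (f(1), f(2), f(3), f(4)).
count = sum(
    1 for images in permutations("uvwxyz", 4)
    if all(image not in forbidden[a]
           for a, image in enumerate(images, start=1))
)
print(count)
```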

Figure 8.11: chessboard with rows 1, 2, 3, 4; the shaded squares are the values excluded by $c_1$ through $c_4$.

Even more so, look back at $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$ in Example 8.15. Disregarding the vocabulary of the “relatives” and “tables,” we realize that we are counting the number of one-to-one functions $g: \{R_1, R_2, R_3, R_4\} \to \{T_1, T_2, T_3, T_4, T_5\}$ where none of the conditions $c_1$, $c_2$, $c_3$, $c_4$ is satisfied.

Finally, for $A = \{1, 2, 3, 4, 5, 6, 7, 8\}$, suppose we want to count the number of one-to-one functions $h: A \to A$ where $h(i) \neq i$ for all $i \in A$. Here the rook polynomial would be
\[ r(C, x) = (1 + x)^8 = \sum_{k=0}^{8} \binom{8}{k} x^k, \]
and we find that the number of such one-to-one functions $h$ is
\[ \binom{8}{0}8! - \binom{8}{1}7! + \binom{8}{2}6! - \cdots + \binom{8}{8}0! = 8!\left[1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{1}{8!}\right] = d_8, \]
the number of derangements of $1, 2, 3, \ldots, 8$.
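As a quick check of this last computation, here is a small Python sketch (the function names are ours, introduced only for illustration). It evaluates the alternating sum with rook coefficients $r_k = \binom{8}{k}$ and compares it with a direct count of the permutations that fix no element.

```python
from itertools import permutations
from math import comb, factorial


def derangements_by_rook_polynomial(n):
    """Alternating sum of C(n, k) * (n - k)!, using r_k = C(n, k) from (1 + x)^n."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def derangements_by_brute_force(n):
    """Count the permutations h of 1..n with h(i) != i for every i."""
    return sum(
        all(value != index for index, value in enumerate(perm, start=1))
        for perm in permutations(range(1, n + 1))
    )


print(derangements_by_rook_polynomial(8))  # 14833
print(derangements_by_brute_force(8))      # 14833
```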

EXERCISES 8.4 AND 8.5
1. Verify directly the rook polynomials for (a) the unshaded chessboards in Figs. 8.7 and 8.8(a), and (b) the shaded chessboards in Figs. 8.9 and 8.10(b).

2. Construct or describe a smallest (least number of squares) chessboard for which $r_{10} \neq 0$.

3. a) Find the rook polynomial for the standard 8 × 8 chessboard.
   b) Answer part (a) with 8 replaced by $n$, for $n \in \mathbf{Z}^+$.

4. Find the rook polynomials for the shaded chessboards in Fig. 8.12.

[Figure 8.12]

5. a) Find the rook polynomials for the shaded chessboards in Fig. 8.13.
   b) Generalize the chessboard (and rook polynomial) for Fig. 8.13().

[Figure 8.13: panels (i), (ii), (iii), (iv)]

6. a) Let $C$ be a chessboard that has $m$ rows and $n$ columns, with $m \le n$ (for a total of $mn$ squares). For $0 \le k \le m$, in how many ways can we arrange $k$ (identical) nontaking rooks on $C$?
   b) For the chessboard $C$ in part (a), determine the rook polynomial $r(C, x)$.

7. Professor Ruth has five graders to correct programs in her courses in Java, C++, SQL, Perl, and VHDL. Graders Jeanne and Charles both dislike SQL, Sandra wants to avoid C++ and VHDL, Paul detests Java and C++, and Todd refuses to work in SQL and Perl. In how many ways can Professor Ruth assign each grader to correct programs in one language, cover all five languages, and keep everyone content?

8. Why do we have 6! in the term $(6!)N(\bar{c}_1\bar{c}_2 \cdots \bar{c}_6)$ for the solution of Example 8.16?

9. Five professors named Al, Violet, Lynn, Jack, and Mary Lou are to be assigned to teach one class each from among calculus I, calculus II, calculus III, statistics, and combinatorics. Al will not teach calculus II or combinatorics, Lynn cannot stand statistics, Violet and Mary Lou both refuse to teach calculus I or calculus III, and Jack detests calculus II.
   a) In how many ways can the head of the mathematics department assign each of these professors one of these five courses and still keep peace in the department?
   b) For the assignments in part (a), what is the probability that Violet will get to teach combinatorics?

10. A pair of dice, one red and the other green, is rolled six times. We know that the ordered pairs (1, 1), (1, 5), (2, 4), (3, 6), (4, 2), (4, 4), (5, 1), and (5, 5) did not come up. What is the probability that every value came up on both the red die and the green one?

11. A computer dating service wants to match each of four women with one of six men. According to the information these applicants provided when they joined the service, we can draw the following conclusions.
• Woman 1 would not be compatible with man 1, 3, or 6.
• Woman 2 would not be compatible with man 2 or 4.
• Woman 3 would not be compatible with man 3 or 6.
• Woman 4 would not be compatible with man 4 or 5.
In how many ways can the service successfully match each of the four women with a compatible partner?

12. For $A = \{1, 2, 3, 4, 5\}$ and $B = \{u, v, w, x, y, z\}$, determine the number of one-to-one functions $f: A \to B$ where $f(1) \neq v, w$; $f(2) \neq u, w$; $f(3) \neq x$; and $f(4) \neq v, x, y$.

8.6
       Summary and Historical Review
                                In the first and third chapters of this text we were concerned with enumeration problems
                                in which we had to be careful of situations wherein arrangements or selections were over-
                                counted. This situation became even more involved in Chapter 5 when we tried to count
                                the number of onto functions for two finite sets.
                                    With Venn diagrams to lead the way, in this chapter we obtained a pattern called the
                                Principle of Inclusion and Exclusion. Using this principle, we restated each problem in terms
                                of conditions and subsets. Using enumeration formulas on permutations and combinations
                                that were developed earlier, we solved some simpler subproblems and let the principle
                                manage our concern about overcounting. As a result, we were able to solve a variety of
                                problems, some dealing with number theory and one with graph theory. We also proved the
                                formula conjectured earlier in Section 5.3 for the number of onto functions for two finite
                                sets.

This principle has an interesting history, being found in different manuscripts under such
                                names as the “Sieve Method” or the “Principle of Cross Classification.” A set-theoretic
                                version of the principle, which concerned itself with set unions and intersections, is found
                                in Doctrine of Chances (1718), a text on probability theory by Abraham DeMoivre (1667-1754).
                                Somewhat earlier, in 1708, Pierre Rémond de Montmort (1678-1719) used the idea
                                behind the principle in his solution of the problem generally known as le problème des
                                rencontres (matches). (In this old French card game the 52 cards in a first deck are arranged
                                face up in a row, perhaps on a table. Then the 52 cards of a second deck are dealt, with
                                one new card being placed on each of the 52 cards previously arranged on the table top.
                                The score for the game is determined by counting the resulting matches, where both the
                                suit and the face value for each of the two cards must match.)
                                    Credit for the way we developed and dealt with the Principle of Inclusion and Exclusion
                               belongs to James Joseph Sylvester (1814-1897). (This colorful English-born mathemati-
                               cian also made major contributions in the theory of equations; the theory of matrices and
                               determinants; and invariant theory, which he founded with Arthur Cayley (1821-1895).
                               In addition Sylvester founded the American Journal of Mathematics, the first American
                               journal established for mathematical research.) The importance of the inclusion-exclusion
                               technique was not generally appreciated, however, until somewhat later, when the publica-
                                tion Choice and Chance by W. A. Whitworth [10] made mathematicians more aware of its
                                potential and use.

James Joseph Sylvester (1814-1897)

For more on the application of this principle, examine Chapter 4 of C. L. Liu [4], Chapter
                        2 of H. J. Ryser [8], or Chapter 8 of A. Tucker [9]. More number-theoretic results related
                        to the principle, including the Möbius inversion formula, can be found in Chapter 2 of
                        M. Hall [1], Chapter X of C. L. Liu [5], and Chapter 16 of G. H. Hardy and E. M. Wright
                        [3]. An extension of this formula is given in the article by G. C. Rota [7].
                            The article by D. Hanson, K. Seyffarth, and J. H. Weston [2] provides an interesting
                        generalization of the derangement problem discussed in Section 8.3. The ideas behind
                        the rook polynomials and their applications were developed in the late 1930s and during
                        the 1940s and 1950s. Additional material on this topic is found in Chapters 7 and 8 of
                        J. Riordan [6].

REFERENCES

1. Hall, Marshall, Jr. Combinatorial Theory. Waltham, Mass.: Blaisdell, 1967.
2. Hanson, Denis, Seyffarth, Karen, and Weston, J. Harley. "Matchings, Derangements, Rencontres." Mathematics Magazine 56, no. 4 (September 1983): pp. 224-229.
3. Hardy, Godfrey Harold, and Wright, Edward Maitland. An Introduction to the Theory of Numbers, 5th ed. Oxford: Oxford University Press, 1979.
4. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
5. Liu, C. L. Topics in Combinatorial Mathematics. Mathematical Association of America, 1972.
6. Riordan, John. An Introduction to Combinatorial Analysis. Princeton, N.J.: Princeton University Press, 1980. (Originally published in 1958 by John Wiley & Sons.)
7. Rota, Gian Carlo. "On the Foundations of Combinatorial Theory, I. Theory of Möbius Functions." Zeitschrift für Wahrscheinlichkeitstheorie 2 (1964): pp. 340-368.
8. Ryser, Herbert J. Combinatorial Mathematics. Carus Mathematical Monograph, No. 14. Published by the Mathematical Association of America, distributed by John Wiley & Sons, New York, 1963.
9. Tucker, Alan. Applied Combinatorics, 4th ed. New York: Wiley, 2002.
10. Whitworth, William Allen. Choice and Chance. Originally published at Cambridge in 1867. Reprint of the 5th ed. (1901), Hafner, New York, 1965.

SUPPLEMENTARY EXERCISES

1. Determine how many $n \in \mathbf{Z}^+$ satisfy $n \le 500$ and are not divisible by 2, 3, 5, 6, 8, or 10.

2. How many integers $n$ are such that $0 \le n < 1{,}000{,}000$ and the sum of the digits in $n$ is less than or equal to 37?

3. At next week's church bazaar, Joseph and his cousin Jeffrey must arrange six baseballs, six footballs, six soccer balls, and six volleyballs on the four shelves in the sports booth sponsored by their Boy Scout troop. In how many ways can they do this so that there are at least two, but no more than seven, balls on each shelf? (Here all six balls for any one of the four sports are identical in appearance.)

4. Find the number of positive integers $n$ where $1 \le n \le 1000$ and $n$ is not a perfect square, cube, or fourth power.

5. In how many ways can we arrange the integers 1, 2, 3, ..., 8 in a line so that there are no occurrences of the patterns 12, 23, ..., 78, 81?

6. a) If we have $k$ different colors available, in how many ways can we paint the walls of a pentagonal room if adjacent walls are to be painted with different colors?
   b) What is the smallest value of $k$ for which such a coloring is possible?

7. Ten students take a physics test in a certain room. When the test is over the students take a break and then return to the room to discuss their answers to the test questions. If there are 14 chairs in this room, in how many ways can the students seat themselves after the break so that no one is in the same chair he, or she, occupied during the test?

8. Using the result of Theorem 8.2, prove that the number of ways we can place $s$ different objects in $n$ distinct containers with $m$ containers each containing exactly $r$ of the objects is
\[ \frac{(-1)^m\, n!\, s!}{m!} \sum_{i=m}^{n} \frac{(-1)^i (n - i)^{s - ir}}{(i - m)!\,(n - i)!\,(s - ir)!\,(r!)^i}. \]

9. If an arrangement of the letters in SURREPTITIOUS is selected at random, what is the probability that it contains (a) (exactly) three pairs of consecutive identical letters? (b) at most three pairs of consecutive identical letters?

10. In how many ways can four w's, four x's, four y's, and four z's be arranged so that there is no consecutive quadruple of the same letter?

11. a) Given $n$ distinct objects, in how many ways can we select $r$ of these objects so that each selection includes some particular $m$ of the $n$ objects? (Here $m \le r \le n$.)
    b) Using the Principle of Inclusion and Exclusion, prove that for $m \le r \le n$,
\[ \binom{n - m}{n - r} = \sum_{i=0}^{m} (-1)^i \binom{m}{i} \binom{n - i}{r}. \]

12. a) Let $\lambda \in \mathbf{Z}^+$. If we have $\lambda$ different colors available, in how many ways can we color the vertices of the graph shown in Fig. 8.14(a) so that no adjacent vertices share the same color? This result in $\lambda$ is called the chromatic polynomial of the graph, and the smallest value of $\lambda$ for which the value of this polynomial is positive is called the chromatic number of the graph. What is the chromatic number of this graph? (We shall pursue this idea further in Chapter 11.)
    b) If there are six colors available, in how many ways can the rooms $R_i$, $1 \le i \le 5$, shown in Fig. 8.14(b) be painted so that rooms with a common doorway, $D_j$, $1 \le j \le 5$, are painted with different colors?

13. Find the number of ways to arrange the letters in LAPTOP so that none of the letters L, A, T, O is in its original position and the letter P is not in the third or sixth position.

14. For $n \in \mathbf{Z}^+$ prove that if $\phi(n) = n - 1$ then $n$ is prime.

15. Let $D_{18}$ denote the set of positive divisors of 18. For $d \in D_{18}$ let $S_d = \{n \mid 0 < n \le 18 \text{ and } \gcd(n, 18) = d\}$. (a) Show that the collection $S_d$, $d \in D_{18}$, provides a partition of $\{1, 2, 3, 4, \ldots, 17, 18\}$. (b) Note that $|S_1| = 6 = \phi(18)$ and $|S_2| = 6 = \phi(9)$. For each $d \in D_{18}$, express $|S_d|$ in terms of Euler's phi function.

[Figure 8.14: panels (a), (b)]

16. For $m \in \mathbf{Z}^+$ let $D_m = \{d \in \mathbf{Z}^+ \mid d \text{ divides } m\}$. For $d \in D_m$ let $S_d = \{n \mid 0 < n \le m \text{ and } \gcd(n, m) = d\}$. (a) Show that the collection $S_d$, $d \in D_m$, provides a partition of $\{1, 2, 3, 4, \ldots, m - 1, m\}$. (b) Determine $|S_d|$ for each $d \in D_m$.

17. If $n \in \mathbf{Z}^+$, prove that (a) $\phi(2n) = 2\phi(n)$ when $n$ is even; and (b) $\phi(2n) = \phi(n)$ when $n$ is odd.

18. Let $a, b, c \in \mathbf{Z}^+$ with $c = \gcd(a, b)$. Prove that
\[ \phi(ab)\phi(c) = \phi(a)\phi(b)c. \]

19. Caitlyn has 48 different books: 12 each in mathematics, chemistry, physics, and computer science. These books are arranged on four shelves in her office with all books on any one subject on its own shelf. When her office is cleaned, the 48 books are taken down and then replaced on the shelves, once again with all 12 books on any one subject on its own shelf. In how many ways can this be done so that (a) no subject is on its original shelf? (b) one subject is on its original shelf? (c) no subject is on its original shelf and no book is in its original position? [For example, the book originally in the third (from the left) position on the first shelf must not be replaced on the first shelf and must not be in the third (from the left) position on the shelf where it is placed.]
     Generating
        Functions

In this chapter and the next, we continue our study of enumeration, introducing at this time the important concept of the generating function.
   The problem of making selections, with repetitions allowed, was studied in Chapter 1. There we sought, for example, the number of integer solutions to the equation $c_1 + c_2 + c_3 + c_4 = 25$ where $c_i \ge 0$ for all $1 \le i \le 4$. With the Principle of Inclusion and Exclusion, in Chapter 8, we were able to solve a more restricted version of the problem, such as $c_1 + c_2 + c_3 + c_4 = 25$ with $0 \le c_i \le 10$ for all $1 \le i \le 4$. If, in addition, we wanted $c_2$ to be even and $c_3$ to be a multiple of 3, we could apply the results of Chapters 1 and 8 to several subcases.
                    The power of the generating function rests upon its ability not only to solve the kinds of
                 problems we have considered so far but also to aid us in new situations where additional
                 restrictions may be involved.

9.1
      Introductory Examples
                 Instead of defining a generating function at this point, we shall examine some examples
                 that motivate the idea.

EXAMPLE 9.1
While shopping one Saturday, Mildred buys 12 oranges for her children, Grace, Mary, and Frank. In how many ways can she distribute the oranges so that Grace gets at least four, and Mary and Frank get at least two, but Frank gets no more than five? Table 9.1 lists all the possible distributions. We see that we have all the integer solutions to the equation $c_1 + c_2 + c_3 = 12$ where $4 \le c_1$, $2 \le c_2$, and $2 \le c_3 \le 5$.

Table 9.1

G   M   F          G   M   F
4   3   5          6   2   4
4   4   4          6   3   3
4   5   3          6   4   2
4   6   2          7   2   3
5   2   5          7   3   2
5   3   4          8   2   2
5   4   3
5   5   2

Considering the first two cases in this table, we find the solutions 4 + 3 + 5 = 12 and 4 + 4 + 4 = 12. Now where in our prior algebraic experiences did anything like this happen? When multiplying polynomials we add the powers of the variable, and here, when we multiply the three polynomials,
\[ (x^4 + x^5 + x^6 + x^7 + x^8)(x^2 + x^3 + x^4 + x^5 + x^6)(x^2 + x^3 + x^4 + x^5), \]
two of the ways to obtain $x^{12}$ are as follows:

1) From the product $x^4 x^3 x^5$, where $x^4$ is taken from $(x^4 + x^5 + x^6 + x^7 + x^8)$, $x^3$ from $(x^2 + x^3 + x^4 + x^5 + x^6)$, and $x^5$ from $(x^2 + x^3 + x^4 + x^5)$.
2) From the product $x^4 x^4 x^4$, where the first $x^4$ is found in the first polynomial, the second $x^4$ in the second polynomial, and the third $x^4$ in the third polynomial.

Examining the product
\[ (x^4 + x^5 + x^6 + x^7 + x^8)(x^2 + x^3 + x^4 + x^5 + x^6)(x^2 + x^3 + x^4 + x^5) \]
more closely, we realize that we obtain the product $x^i x^j x^k$ for every triple $(i, j, k)$ that appears in Table 9.1. Consequently, the coefficient of $x^{12}$ in
\[ f(x) = (x^4 + x^5 + x^6 + x^7 + x^8)(x^2 + x^3 + x^4 + x^5 + x^6)(x^2 + x^3 + x^4 + x^5) \]
counts the number of distributions, namely 14, that we seek. The function $f(x)$ is called a generating function for the distributions.
                                But where did the factors in this product come from?
                                The factor $x^4 + x^5 + x^6 + x^7 + x^8$, for example, indicates that we can give Grace 4 or 5 or 6 or 7 or 8 of the oranges. Once again we make use of the interplay between the exclusive or and ordinary addition. The coefficient of each power of $x$ is 1 because, considering the oranges as identical objects, there is only one way to give Grace four oranges, one way to give her five oranges, and so on. Since Mary and Frank must each receive at least two oranges, the other terms $(x^2 + x^3 + x^4 + x^5 + x^6)$ and $(x^2 + x^3 + x^4 + x^5)$ start with $x^2$, and for Frank we stop at $x^5$ so that he doesn't receive more than five oranges. (Why does the term for Mary stop at $x^6$?)
                                Most of us are reasonably convinced now that the coefficient of $x^{12}$ in $f(x)$ yields the answer. Some, however, may be a bit skeptical about this new idea. It seems that we could list the cases in Table 9.1 faster than we could multiply out the three factors in $f(x)$ or calculate the coefficient of $x^{12}$ in $f(x)$. At present that may seem true. But as we progress to
                            problems with more unknowns and larger quantities to distribute, the generating function
                            will more than demonstrate its worth. (The reader may realize that the rook polynomials of
                            Chapter 8 are examples of generating functions.) For now we consider two more examples.
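For the skeptical reader, the multiplication in Example 9.1 is easy to carry out by machine. The Python sketch below is an illustration only: the helper names `multiply` and `factor` are ours, and a polynomial is represented (by assumption) as a list whose entry at index $e$ is the coefficient of $x^e$. It expands $f(x)$ and also lists the triples of Table 9.1 directly.

```python
from itertools import product


def multiply(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def factor(low, high):
    """Coefficient list of x^low + x^(low+1) + ... + x^high."""
    return [0] * low + [1] * (high - low + 1)


grace = factor(4, 8)   # Grace receives 4 to 8 oranges
mary = factor(2, 6)    # Mary receives 2 to 6 oranges
frank = factor(2, 5)   # Frank receives 2 to 5 oranges
f = multiply(multiply(grace, mary), frank)

# Direct listing of the triples (G, M, F) with G + M + F = 12.
triples = [
    (g, m, k)
    for g, m, k in product(range(4, 9), range(2, 7), range(2, 6))
    if g + m + k == 12
]

print(f[12])         # 14, the coefficient of x^12
print(len(triples))  # 14, the number of rows in Table 9.1
```

Here the two approaches cost about the same. The polynomial route pays off when there are more factors, since each multiplication reuses the work already done.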

EXAMPLE 9.2
If there is an unlimited number (or at least 24 of each color) of red, green, white, and black jelly beans, in how many ways can Douglas select 24 of these candies so that he has an even number of white beans and at least six black ones?
    The polynomials associated with the jelly bean colors are as follows:
• red (green): $1 + x + x^2 + \cdots + x^{24}$, where the leading 1 is for $1x^0$, because one possibility for the red (and green) jelly beans is that none of that color is selected
• white: $(1 + x^2 + x^4 + x^6 + \cdots + x^{24})$
• black: $(x^6 + x^7 + x^8 + \cdots + x^{24})$

So the answer to the problem is the coefficient of $x^{24}$ in the generating function
\[ f(x) = (1 + x + x^2 + \cdots + x^{24})^2 (1 + x^2 + x^4 + \cdots + x^{24})(x^6 + x^7 + \cdots + x^{24}). \]
One such selection is five red, three green, eight white, and eight black jelly beans. This arises from $x^5$ in the first factor, $x^3$ in the second factor, and $x^8$ in the last two factors.
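The text stops at identifying the coefficient to be found. As a hedged illustration, the Python sketch below computes that coefficient of $x^{24}$ numerically. The helper `multiply` and the list-of-coefficients representation are our own assumptions, and terms above $x^{24}$ are discarded because they cannot contribute.

```python
def multiply(p, q, max_degree):
    """Multiply coefficient lists, discarding terms above max_degree."""
    result = [0] * (max_degree + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if a and b and i + j <= max_degree:
                result[i + j] += a * b
    return result


N = 24
red = green = [1] * (N + 1)                             # 1 + x + ... + x^24
white = [1 if e % 2 == 0 else 0 for e in range(N + 1)]  # even exponents only
black = [1 if e >= 6 else 0 for e in range(N + 1)]      # x^6 + ... + x^24

f = [1]
for poly in (red, green, white, black):
    f = multiply(f, poly, N)

# Independent check: enumerate the counts (red, green, white) directly,
# with the black count forced to be whatever remains.
brute = sum(
    1
    for r in range(N + 1)
    for g in range(N + 1 - r)
    for w in range(0, N + 1 - r - g, 2)
    if N - r - g - w >= 6
)

print(f[N])   # 715
print(brute)  # 715
```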

One more example before closing this section!

EXAMPLE 9.3
How many integer solutions are there for the equation $c_1 + c_2 + c_3 + c_4 = 25$ if $0 \le c_i$ for all $1 \le i \le 4$?
    We can alternatively ask in how many ways 25 (identical) pennies can be distributed among four children.
    For each child the possibilities can be described by the polynomial $1 + x + x^2 + x^3 + \cdots + x^{25}$. Then the answer to this problem is the coefficient of $x^{25}$ in the generating function
\[ f(x) = (1 + x + x^2 + \cdots + x^{25})^4. \]
    The answer can also be obtained as the coefficient of $x^{25}$ in the generating function
\[ g(x) = (1 + x + x^2 + x^3 + \cdots + x^{25} + x^{26} + \cdots)^4, \]
if we rephrase the question in terms of distributing, from a large (or unlimited) number of pennies, 25 pennies among four children. [Whereas $f(x)$ is a polynomial, $g(x)$ is a power series in $x$.] Note that the terms $x^k$, for all $k \ge 26$, are never used. So why bother with them? Because there will be times when it is easier to compute with a power series than with a polynomial.
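The remark that the terms $x^k$ with $k \ge 26$ are never used can be seen numerically. In the Python sketch below (the function name `power_coefficient` is ours), a factor with 26 terms plays the role of $f(x)$, and a factor with 1000 terms stands in, by assumption, for the power series $g(x)$. Both give the same coefficient of $x^{25}$, which also agrees with the count $\binom{4 + 25 - 1}{25}$ for selections with repetition from Chapter 1.

```python
from math import comb


def power_coefficient(terms, exponent, target):
    """Coefficient of x^target in (1 + x + ... + x^(terms - 1))^exponent."""
    poly = [1]
    for _ in range(exponent):
        new = [0] * (target + 1)
        for i, a in enumerate(poly):
            # Only products with total exponent i + j <= target are kept.
            for j in range(min(terms, target - i + 1)):
                new[i + j] += a
        poly = new
    return poly[target]


from_polynomial = power_coefficient(26, 4, 25)       # f(x): stops at x^25
from_longer_series = power_coefficient(1000, 4, 25)  # stand-in for g(x)

print(from_polynomial)      # 3276
print(from_longer_series)   # 3276
print(comb(4 + 25 - 1, 25)) # 3276
```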

EXERCISES 9.1

1. For each of the following, determine a generating function and indicate the coefficient in the function that is needed to solve the problem. (Give both the polynomial and power series forms of the generating function, wherever appropriate.)
    Find the number of integer solutions for the following equations:
   a) $c_1 + c_2 + c_3 + c_4 = 20$, $0 \le c_i \le 7$ for all $1 \le i \le 4$
   b) $c_1 + c_2 + c_3 + c_4 = 20$, $0 \le c_i$ for all $1 \le i \le 4$, with $c_2$ and $c_3$ even
   c) $c_1 + c_2 + c_3 + c_4 + c_5 = 30$, $2 \le c_1 \le 4$ and $3 \le c_i \le 8$ for all $2 \le i \le 5$
   d) $c_1 + c_2 + c_3 + c_4 + c_5 = 30$, $0 \le c_i$ for all $1 \le i \le 5$, with $c_2$ even and $c_3$ odd

2. Determine the generating function for the number of ways to distribute 35 pennies (from an unlimited supply) among five children if (a) there are no restrictions; (b) each child gets at least 1¢; (c) each child gets at least 2¢; (d) the oldest child gets at least 10¢; and (e) the two youngest children must each get at least 10¢.

3. a) Find the generating function for the number of ways to select 10 candy bars from large supplies of six different kinds.
   b) Find the generating function for the number of ways to select, with repetitions allowed, $r$ objects from a collection of $n$ distinct objects.

4. a) Explain why the generating function for the number of ways to have $n$ cents in pennies and nickels is
\[ (1 + x + x^2 + x^3 + \cdots)(1 + x^5 + x^{10} + \cdots). \]
   b) Find the generating function for the number of ways to have $n$ cents in pennies, nickels, and dimes.

5. Find the generating function for the number of integer solutions to the equation $c_1 + c_2 + c_3 + c_4 = 20$ where $-3 \le c_1$, $-3 \le c_2$, $-5 \le c_3 \le 5$, and $0 \le c_4$.

6. For $S = \{a, b, c\}$, consider the function
\[ f(x) = (1 + ax)(1 + bx)(1 + cx) = 1 + ax + bx + cx + abx^2 + acx^2 + bcx^2 + abcx^3. \]
Here, in $f(x)$,
• The coefficient of $x^0$ is 1, for the subset $\emptyset$ of $S$.
• The coefficient of $x^1$ is $a + b + c$, for the subsets $\{a\}$, $\{b\}$, and $\{c\}$ of $S$.
• The coefficient of $x^2$ is $ab + ac + bc$, for the subsets $\{a, b\}$, $\{a, c\}$, and $\{b, c\}$ of $S$.
• The coefficient of $x^3$ is $abc$, for the subset $\{a, b, c\} = S$.
Consequently, $f(x)$ is the generating function for the subsets of $S$. For when we calculate $f(1)$, we obtain a sum wherein each of the eight summands corresponds with a subset of $S$; the summand 1 corresponds with $\emptyset$. [If we go one step further and set $a = b = c = 1$ in $f(x)$, then $f(1) = 8$, the number of subsets of $S$.]
   a) Give the generating function for the subsets of $S = \{a, b, c, \ldots, r, s, t\}$.
   b) Answer part (a) for selections wherein each element can be rejected or selected as many as three times.

9.2
                     Definition and Examples:
                     Calculational Techniques
                                          In this section we shall examine a number of formulas and examples dealing with power
                                          series. These will be used to obtain the coefficients of particular terms in a generating
                                          function.
                                              We start with the following concept.

Definition 9.1
Let $a_0, a_1, a_2, \ldots$ be a sequence of real numbers. The function
\[ f(x) = a_0 + a_1 x + a_2 x^2 + \cdots = \sum_{i=0}^{\infty} a_i x^i \]
is called the generating function for the given sequence.

Where could this idea have come from?

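EXAMPLE 9.4
For any $n \in \mathbf{Z}^+$,
\[ (1 + x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n, \]
so $(1 + x)^n$ is the generating function for the sequence
\[ \binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}, 0, 0, 0, \ldots. \]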
EXAMPLE 9.5
a) For $n \in \mathbf{Z}^+$,
\[ (1 - x^{n+1}) = (1 - x)(1 + x + x^2 + x^3 + \cdots + x^n). \]
   So
\[ \frac{1 - x^{n+1}}{1 - x} = 1 + x + x^2 + \cdots + x^n, \]
   and $(1 - x^{n+1})/(1 - x)$ is the generating function for the sequence 1, 1, 1, ..., 1, 0, 0, 0, ..., where the first $n + 1$ terms are 1.

b) Extending the idea in part (a), we find that

$$1 = (1 - x)(1 + x + x^2 + x^3 + x^4 + \cdots),$$

so

$$\frac{1}{1-x}$$

   is the generating function for the sequence 1, 1, 1, 1, .... [Note that $1/(1-x) = 1 + x + x^2 + x^3 + \cdots$ is valid for all real x where $|x| < 1$; it is for this set of values that the geometric series $1 + x + x^2 + x^3 + \cdots$ converges. In our work with generating functions we shall be primarily concerned with the coefficients of the powers of x. However, later in Example 9.18, we shall use this and two other related series to evaluate infinite sums for values within the set of values for which each such infinite series converges.]
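c) With

$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots = \sum_{i=0}^{\infty} x^i,$$

   taking the derivative yields

$$\frac{d}{dx}\,\frac{1}{1-x} = \frac{d}{dx}(1-x)^{-1} = (-1)(1-x)^{-2}(-1) = \frac{1}{(1-x)^2}$$
$$= \frac{d}{dx}(1 + x + x^2 + x^3 + \cdots) = 1 + 2x + 3x^2 + 4x^3 + \cdots.$$

   Consequently,

$$\frac{1}{(1-x)^2}$$

   is the generating function for the sequence 1, 2, 3, 4, ..., while

$$\frac{x}{(1-x)^2} = 0 + 1x + 2x^2 + 3x^3 + \cdots$$

   is the generating function for the sequence 0, 1, 2, 3, ....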
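d) Continuing from part (c),

$$\frac{d}{dx}\,\frac{x}{(1-x)^2} = \frac{d}{dx}(0 + x + 2x^2 + 3x^3 + \cdots),$$

   or

$$\frac{x+1}{(1-x)^3} = 1 + 2^2 x + 3^2 x^2 + 4^2 x^3 + \cdots.$$

   Hence,

$$\frac{x+1}{(1-x)^3}$$

   generates $1^2, 2^2, 3^2, \ldots$, and

$$\frac{x(x+1)}{(1-x)^3}$$

   generates $0^2, 1^2, 2^2, 3^2, \ldots$.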

e) Now let us take one more look at the results in parts (b), (c), (d), along with some extensions. But this time we have a change in the notation:

$$f_0(x) = \frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$

$$f_1(x) = x\,\frac{d}{dx} f_0(x) = \frac{x}{(1-x)^2} = 0 + x + 2x^2 + 3x^3 + \cdots$$

$$f_2(x) = x\,\frac{d}{dx} f_1(x) = \frac{x^2 + x}{(1-x)^3} = 0^2 + 1^2 x + 2^2 x^2 + 3^2 x^3 + \cdots$$

$$f_3(x) = x\,\frac{d}{dx} f_2(x) = \frac{x^3 + 4x^2 + x}{(1-x)^4} = 0^3 + 1^3 x + 2^3 x^2 + 3^3 x^3 + \cdots$$

$$f_4(x) = x\,\frac{d}{dx} f_3(x) = \frac{x^4 + 11x^3 + 11x^2 + x}{(1-x)^5} = 0^4 + 1^4 x + 2^4 x^2 + 3^4 x^3 + \cdots$$
Now look at the output for the Maple code in Fig. 9.1. Here we find the numerators for $f_0(x), f_1(x), \ldots, f_4(x)$, along with those for $f_5(x)$ and $f_6(x)$ [where the denominators are $(1-x)^6$ and $(1-x)^7$, respectively]. The coefficients for these numerators are exactly the Eulerian numbers we introduced in Example 4.21. We choose not to pursue this here, but the interested reader, who wants to examine this connection further, should look into reference [4].

    > f||0(x) := 1/(1-x);

                              f0(x) := 1/(1 - x)

    > for i from 1 to 6 do
    >    f||i(x) := simplify(x*diff(f||(i-1)(x), x)):
    >    print(sort(expand((-1)^(i+1)*numer(f||i(x))))):
    > od:

                                      x
                                   x^2 + x
                               x^3 + 4x^2 + x
                          x^4 + 11x^3 + 11x^2 + x
                     x^5 + 26x^4 + 66x^3 + 26x^2 + x
                x^6 + 57x^5 + 302x^4 + 302x^3 + 57x^2 + x

Figure 9.1
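For readers without Maple, the numerators in Fig. 9.1 can also be reproduced with a short Python sketch. It is not the author's code. It assumes we write $f_k(x) = N_k(x)/(1-x)^{k+1}$, so that applying $x\,\frac{d}{dx}$ and the quotient rule gives the recurrence $N_{k+1}(x) = x[(1-x)N_k'(x) + (k+1)N_k(x)]$ with $N_0(x) = 1$. The function names are hypothetical, and polynomials are coefficient lists indexed by the power of x.

```python
from math import comb

def next_numerator(num, k):
    """Given N_k, return N_{k+1} = x * ((1 - x) * N_k' + (k + 1) * N_k)."""
    deriv = [i * c for i, c in enumerate(num)][1:]   # coefficients of N_k'
    inner = [0] * (len(num) + 1)
    for i, c in enumerate(deriv):
        inner[i] += c          # contribution of N_k'
        inner[i + 1] -= c      # contribution of -x * N_k'
    for i, c in enumerate(num):
        inner[i] += (k + 1) * c
    while len(inner) > 1 and inner[-1] == 0:         # drop trailing zeros
        inner.pop()
    return [0] + inner                               # multiply by x

def numerators(up_to):
    """Return [N_0, N_1, ..., N_up_to]."""
    nums = [[1]]
    for k in range(up_to):
        nums.append(next_numerator(nums[-1], k))
    return nums

def series_coefficient(num, k, n):
    """Coefficient of x^n in num(x) / (1 - x)^(k + 1)."""
    return sum(c * comb(n - j + k, k) for j, c in enumerate(num) if j <= n)

for k, num in enumerate(numerators(6)):
    print(k, num)
```

The helper `series_coefficient` uses $1/(1-x)^{k+1} = \sum_m \binom{m+k}{k} x^m$ to confirm that $N_k(x)/(1-x)^{k+1}$ really has $n^k$ as the coefficient of $x^n$.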

EXAMPLE 9.6
a) Rewriting the result in part (b) of Example 9.5, we have

$$\frac{1}{1-y} = 1 + y + y^2 + y^3 + \cdots.$$

   Upon substituting 2x for y, we then learn that

$$\frac{1}{1-2x} = 1 + (2x) + (2x)^2 + (2x)^3 + \cdots = 1 + 2x + 2^2x^2 + 2^3x^3 + \cdots,$$

   so $1/(1-2x)$ is the generating function for the sequence $1\ (= 2^0),\ 2\ (= 2^1),\ 2^2,\ 2^3, \ldots$. In fact, for each $a \in \mathbf{R}$, it follows that $1/(1-ax) = 1 + (ax) + (ax)^2 + (ax)^3 + \cdots = 1 + ax + a^2x^2 + a^3x^3 + \cdots$, so $1/(1-ax)$ is the generating function for the sequence $1\ (= a^0),\ a\ (= a^1),\ a^2,\ a^3, \ldots$. [Here we want $0^0 = 1$ for the case where a = 0.]
b) Again, from part (b) of Example 9.5, we know that the generating function for the sequence 1, 1, 1, 1, ... is $f(x) = 1/(1-x)$. Therefore the function

$$g(x) = f(x) - x^2 = \frac{1}{1-x} - x^2$$

   is the generating function for the sequence 1, 1, 0, 1, 1, 1, ..., while the function

$$h(x) = f(x) + 2x^3 = \frac{1}{1-x} + 2x^3$$

   generates the sequence 1, 1, 1, 3, 1, 1, ....
c) Finally, can we use the results of Example 9.5 to find a generating function for the sequence 0, 2, 6, 12, 20, 30, 42, ...?
   Here we observe that

$$a_0 = 0 = 0^2 + 0, \quad a_1 = 2 = 1^2 + 1, \quad a_2 = 6 = 2^2 + 2, \quad a_3 = 12 = 3^2 + 3, \quad a_4 = 20 = 4^2 + 4, \ldots.$$

   In general, we have $a_n = n^2 + n$, for each $n \ge 0$.
   Using the results from parts (c) and (d) of Example 9.5, we now find that

$$\frac{x(x+1)}{(1-x)^3} + \frac{x}{(1-x)^2} = \frac{x(x+1) + x(1-x)}{(1-x)^3} = \frac{2x}{(1-x)^3}$$

   is the generating function for the given sequence. (The solution here depends upon our ability to recognize each $a_n$ as the sum of $n^2$ and n. If we do not see this, we may be unable to answer the given question. Consequently, in Example 10.6 of the next chapter, we shall examine another technique to help us recognize the formula for $a_n$.)
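The result of part (c) can be checked numerically. The sketch below is an illustration of ours, not taken from the text: it assumes the standard fact that multiplying a power series by $1/(1-x)$ replaces its coefficients with their running sums, so $1/(1-x)^3$ is obtained from the constant series 1 by three rounds of running sums. The function names are hypothetical.

```python
def geometric_power(m, terms):
    """First `terms` coefficients of 1/(1 - x)^m, via m rounds of running sums."""
    coeffs = [1] + [0] * (terms - 1)      # the constant series 1
    for _ in range(m):
        running = 0
        for i in range(terms):
            running += coeffs[i]
            coeffs[i] = running           # multiplying by 1/(1 - x) takes prefix sums
    return coeffs

def sequence_from_2x_over_cube(terms):
    """First `terms` coefficients of 2x / (1 - x)^3."""
    base = geometric_power(3, terms)
    return [0] + [2 * c for c in base[:terms - 1]]   # multiply by 2x: shift, then double

print(sequence_from_2x_over_cube(7))   # [0, 2, 6, 12, 20, 30, 42]
```

The printed list agrees with $a_n = n^2 + n$ for n = 0, 1, ..., 6.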

For each $n \in \mathbf{Z}^+$, the binomial theorem tells us that $(1+x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$. We want to extend this idea to cases where (a) n < 0 and (b) n is not necessarily an integer.

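With $n, r \in \mathbf{Z}^+$ and $n \ge r > 0$, we have

$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}.$$

If $n \in \mathbf{R}$, we use

$$\frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}$$

as the definition of $\binom{n}{r}$.
    Then, for example, if $n \in \mathbf{Z}^+$, we have

$$\binom{-n}{r} = \frac{(-n)(-n-1)(-n-2)\cdots(-n-r+1)}{r!} = \frac{(-1)^r (n)(n+1)(n+2)\cdots(n+r-1)}{r!}$$
$$= \frac{(-1)^r (n+r-1)!}{(n-1)!\,r!} = (-1)^r \binom{n+r-1}{r}.$$

Finally, for each real number n, we define $\binom{n}{0} = 1$.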
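The extended definition is easy to experiment with. The following Python sketch is ours and is only illustrative: the name `gen_binom` is hypothetical, and exact rational arithmetic (`fractions.Fraction`) is assumed so that non-integer values of n cause no rounding error. It evaluates $n(n-1)\cdots(n-r+1)/r!$ directly and compares $\binom{-n}{r}$ with $(-1)^r\binom{n+r-1}{r}$.

```python
from fractions import Fraction
from math import comb

def gen_binom(n, r):
    """n(n-1)...(n-r+1)/r! for rational n and integer r >= 0 (equals 1 when r = 0)."""
    value = Fraction(1)
    for j in range(r):
        value *= Fraction(n) - j   # next factor of the falling product
        value /= j + 1             # build up r! in the denominator
    return value

print(gen_binom(5, 2))                 # 10
print(gen_binom(-3, 2))                # 6
print(gen_binom(Fraction(1, 2), 2))    # -1/8
print(gen_binom(-3, 2) == (-1) ** 2 * comb(3 + 2 - 1, 2))   # True
```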

EXAMPLE 9.7
For $n \in \mathbf{Z}^+$, the Maclaurin series expansion for $(1+x)^{-n}$ is given by

$$(1+x)^{-n} = 1 + (-n)x + (-n)(-n-1)x^2/2! + (-n)(-n-1)(-n-2)x^3/3! + \cdots$$
$$= 1 + \sum_{r=1}^{\infty} \frac{(-n)(-n-1)(-n-2)\cdots(-n-r+1)}{r!}\, x^r$$
$$= \sum_{r=0}^{\infty} (-1)^r \binom{n+r-1}{r} x^r.$$

Hence $(1+x)^{-n} = \binom{-n}{0} + \binom{-n}{1}x + \binom{-n}{2}x^2 + \cdots = \sum_{r=0}^{\infty} \binom{-n}{r} x^r$. This generalizes the binomial theorem of Chapter 1 and shows us that $(1+x)^{-n}$ is the generating function for the sequence $\binom{-n}{0}, \binom{-n}{1}, \binom{-n}{2}, \binom{-n}{3}, \ldots$.

EXAMPLE 9.8
Find the coefficient of $x^5$ in $(1-2x)^{-7}$.
    With y = −2x, use the result in Example 9.7 to write $(1-2x)^{-7} = (1+y)^{-7} = \sum_{r=0}^{\infty} \binom{-7}{r} y^r = \sum_{r=0}^{\infty} \binom{-7}{r} (-2x)^r$. Consequently, the coefficient of $x^5$ is $\binom{-7}{5}(-2)^5 = (-1)^5 \binom{7+5-1}{5}(-32) = (32)\binom{11}{5} = 14{,}784$.

EXAMPLE 9.9
For each real number n, the Maclaurin series expansion for $(1+x)^n$ is

$$1 + nx + n(n-1)x^2/2! + n(n-1)(n-2)x^3/3! + \cdots = 1 + \sum_{r=1}^{\infty} \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}\, x^r.$$

Therefore,

$$(1+3x)^{-1/3} = 1 + \sum_{r=1}^{\infty} \frac{(-1/3)(-4/3)(-7/3)\cdots((-3r+2)/3)}{r!}\,(3x)^r$$
$$= 1 + \sum_{r=1}^{\infty} \frac{(-1)(-4)(-7)\cdots(-3r+2)}{r!}\, x^r,$$

and $(1+3x)^{-1/3}$ generates the sequence $1,\ -1,\ (-1)(-4)/2!,\ (-1)(-4)(-7)/3!,\ \ldots,\ (-1)(-4)(-7)\cdots(-3r+2)/r!,\ \ldots$.
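The coefficients in Examples 9.8 and 9.9 can be generated term by term. The sketch below is our own illustration (the name `binomial_series` is hypothetical). It assumes the observation that consecutive terms of the Maclaurin series of $(1+ax)^n$ satisfy $c_r = c_{r-1}\cdot\frac{n-r+1}{r}\cdot a$ with $c_0 = 1$, which follows from the product formula for the coefficients, and it uses exact fractions so that n may be rational.

```python
from fractions import Fraction

def binomial_series(n, a, terms):
    """First `terms` Maclaurin coefficients of (1 + a*x)^n for rational n and a."""
    n, a = Fraction(n), Fraction(a)
    coeffs = [Fraction(1)]
    for r in range(1, terms):
        # c_r = c_{r-1} * (n - r + 1) / r * a
        coeffs.append(coeffs[-1] * (n - r + 1) / r * a)
    return coeffs

# Example 9.9: (1 + 3x)^(-1/3) starts 1, -1, (-1)(-4)/2!, (-1)(-4)(-7)/3!, ...
print(binomial_series(Fraction(-1, 3), 3, 4))

# Example 9.8: coefficient of x^5 in (1 - 2x)^(-7)
print(binomial_series(-7, -2, 6)[5])   # 14784
```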

EXAMPLE 9.10
Determine the coefficient of $x^{15}$ in $f(x) = (x^2 + x^3 + x^4 + \cdots)^4$.
    Since $(x^2 + x^3 + x^4 + \cdots) = x^2(1 + x + x^2 + \cdots) = x^2/(1-x)$, the coefficient of $x^{15}$ in f(x) is the coefficient of $x^{15}$ in $(x^2/(1-x))^4 = x^8/(1-x)^4$. Hence the coefficient sought is that of $x^7$ in $(1-x)^{-4}$, namely, $\binom{-4}{7}(-1)^7 = (-1)^7\binom{4+7-1}{7}(-1)^7 = \binom{10}{7} = 120$.
    In general, for $n \in \mathbf{Z}^+$, the coefficient of $x^n$ in f(x) is 0, when $0 \le n \le 7$. For all $n \ge 8$, the coefficient of $x^n$ in f(x) is the coefficient of $x^{n-8}$ in $(1-x)^{-4}$, which is $\binom{-4}{n-8}(-1)^{n-8} = \binom{n-5}{n-8}$.
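One way to confirm Example 9.10 is to multiply the series out, keeping only the powers of x that matter. The sketch below is ours (the names `truncated_power` and `DEGREE` are hypothetical). It relies on the fact that terms of degree above 20 in any factor cannot affect coefficients of degree 20 or less, so each factor $x^2 + x^3 + \cdots$ may be cut off at $x^{20}$.

```python
from math import comb

def truncated_power(base, exponent, degree):
    """Coefficients up to x^degree of base(x)**exponent (base is a coefficient list)."""
    result = [1] + [0] * degree
    for _ in range(exponent):
        nxt = [0] * (degree + 1)
        for i, a in enumerate(result):
            if a == 0:
                continue
            for j, b in enumerate(base):
                if i + j > degree:      # discard powers beyond the cutoff
                    break
                nxt[i + j] += a * b
        result = nxt
    return result

DEGREE = 20
base = [0, 0] + [1] * (DEGREE - 1)      # x^2 + x^3 + ... + x^20
f = truncated_power(base, 4, DEGREE)

print(f[15])                            # 120
print(all(f[n] == comb(n - 5, n - 8) for n in range(8, DEGREE + 1)))   # True
```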
                  Before continuing, we collect the identities shown in Table 9.2 (on page 424) for future
               reference.

The next two examples show how generating functions can be applied to derive some
               of our earlier results.

EXAMPLE 9.11
In how many ways can we select, with repetitions allowed, r objects from n distinct objects?
    For each of the n distinct objects, the geometric series $1 + x + x^2 + x^3 + \cdots$ represents the possible choices for that object (namely none, one, two, ...). Considering all of the n distinct objects, the generating function is

$$f(x) = (1 + x + x^2 + x^3 + \cdots)^n,$$

and the required answer is the coefficient of $x^r$ in f(x). Now from identities 5 and 8 in Table 9.2 we have

$$(1 + x + x^2 + x^3 + \cdots)^n = \left(\frac{1}{1-x}\right)^n = \frac{1}{(1-x)^n} = \sum_{i=0}^{\infty} \binom{n+i-1}{i} x^i,$$

so the coefficient of $x^r$ is

$$\binom{n+r-1}{r},$$

the result we found in Chapter 1.
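For small values the count in Example 9.11 can be confirmed by listing the selections outright. This Python sketch is ours, with the hypothetical name `count_selections`; it uses the standard-library function `itertools.combinations_with_replacement`, which produces each unordered selection with repetition exactly once.

```python
from itertools import combinations_with_replacement
from math import comb

def count_selections(n, r):
    """Number of ways to select r objects, repetitions allowed, from n distinct objects."""
    return sum(1 for _ in combinations_with_replacement(range(n), r))

print(count_selections(4, 3), comb(4 + 3 - 1, 3))   # 20 20
```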

EXAMPLE 9.12
Once again we consider the problem of counting the compositions of a positive integer n, this time using generating functions.
    Start with

$$\frac{x}{1-x} = x + x^2 + x^3 + x^4 + \cdots,$$

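Table 9.2

For all $m, n \in \mathbf{Z}^+$, $a \in \mathbf{R}$,

1) $(1+x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$
2) $(1+ax)^n = \binom{n}{0} + \binom{n}{1}ax + \binom{n}{2}a^2x^2 + \cdots + \binom{n}{n}a^nx^n$
3) $(1+x^m)^n = \binom{n}{0} + \binom{n}{1}x^m + \binom{n}{2}x^{2m} + \cdots + \binom{n}{n}x^{nm}$
4) $(1-x^{n+1})/(1-x) = 1 + x + x^2 + \cdots + x^n$
5) $1/(1-x) = 1 + x + x^2 + x^3 + \cdots = \sum_{i=0}^{\infty} x^i$
6) $1/(1-ax) = 1 + (ax) + (ax)^2 + (ax)^3 + \cdots = \sum_{i=0}^{\infty} (ax)^i = \sum_{i=0}^{\infty} a^i x^i = 1 + ax + a^2x^2 + a^3x^3 + \cdots$
7) $1/(1+x)^n = \binom{-n}{0} + \binom{-n}{1}x + \binom{-n}{2}x^2 + \cdots = \sum_{i=0}^{\infty} \binom{-n}{i} x^i$
   $= 1 + (-1)\binom{n+1-1}{1}x + (-1)^2\binom{n+2-1}{2}x^2 + \cdots = \sum_{i=0}^{\infty} (-1)^i \binom{n+i-1}{i} x^i$
8) $1/(1-x)^n = \binom{-n}{0} + \binom{-n}{1}(-x) + \binom{-n}{2}(-x)^2 + \cdots = \sum_{i=0}^{\infty} \binom{-n}{i} (-x)^i$
   $= 1 + (-1)\binom{n+1-1}{1}(-x) + (-1)^2\binom{n+2-1}{2}(-x)^2 + \cdots = \sum_{i=0}^{\infty} \binom{n+i-1}{i} x^i$

If $f(x) = \sum_{i=0}^{\infty} a_i x^i$, $g(x) = \sum_{i=0}^{\infty} b_i x^i$, and $h(x) = f(x)g(x)$, then $h(x) = \sum_{i=0}^{\infty} c_i x^i$, where for all $k \ge 0$,

$$c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_{k-1} b_1 + a_k b_0 = \sum_{j=0}^{k} a_j b_{k-j}.$$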

where, for example, the coefficient of $x^4$ is 1, for the one-summand composition of 4, namely, 4. To obtain the number of compositions of n where there are two summands, we need the coefficient of $x^n$ in $(x + x^2 + x^3 + x^4 + \cdots)^2 = [x/(1-x)]^2 = x^2/(1-x)^2$. Here, for instance, we obtain $x^4$ in $(x + x^2 + x^3 + x^4 + \cdots)^2$ from the products $x^1 \cdot x^3$, $x^2 \cdot x^2$, and $x^3 \cdot x^1$. So the coefficient of $x^4$ in $x^2/(1-x)^2$ is 3, for the three two-summand compositions 1 + 3, 2 + 2, and 3 + 1 (of 4). Continuing with the three-summand compositions we now examine $(x + x^2 + x^3 + x^4 + \cdots)^3 = [x/(1-x)]^3 = x^3/(1-x)^3$. Once again we look at the ways $x^4$ comes about, namely, from the products $x^1 \cdot x^1 \cdot x^2$, $x^1 \cdot x^2 \cdot x^1$, and $x^2 \cdot x^1 \cdot x^1$. So here the coefficient of $x^4$ is 3, which accounts for the compositions 1 + 1 + 2, 1 + 2 + 1, and 2 + 1 + 1 (of 4). Finally, the coefficient of $x^4$ in $(x + x^2 + x^3 + x^4 + \cdots)^4 = [x/(1-x)]^4 = x^4/(1-x)^4$ is 1, for the one four-summand composition 1 + 1 + 1 + 1 (of 4).
    The results in the previous paragraph tell us that the coefficient of $x^4$ in $\sum_{i=1}^{4} [x/(1-x)]^i$ is 1 + 3 + 3 + 1 = 8 (= $2^3$), the number of compositions of 4. In fact, this is also the coefficient of $x^4$ in $\sum_{i=1}^{\infty} [x/(1-x)]^i$. Generalizing the situation we find that the number of compositions of a positive integer n is the coefficient of $x^n$ in the generating function $f(x) = \sum_{i=1}^{\infty} [x/(1-x)]^i$. But if we set $y = x/(1-x)$, it then follows that

$$f(x) = \sum_{i=1}^{\infty} y^i = y \sum_{i=0}^{\infty} y^i = y\left(\frac{1}{1-y}\right) = \left(\frac{x}{1-x}\right)\left(\frac{1}{1 - \frac{x}{1-x}}\right) = \left(\frac{x}{1-x}\right)\left(\frac{1-x}{1-2x}\right)$$
$$= x/(1-2x) = x[1 + (2x) + (2x)^2 + (2x)^3 + \cdots]$$
$$= 2^0 x + 2^1 x^2 + 2^2 x^3 + 2^3 x^4 + \cdots.$$

So the number of compositions of a positive integer n is the coefficient of $x^n$ in f(x), and this is $2^{n-1}$ (as we found earlier in Examples 1.37, 3.11, and 4.12).
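The counts in Example 9.12 can be verified by enumeration. The following Python sketch is an illustration of ours (the names `compositions` and `count_by_parts` are hypothetical): it generates every composition of n recursively by choosing the first summand, then tallies the compositions by their number of summands.

```python
def compositions(n):
    """Yield every composition of n as a tuple of positive summands (order matters)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def count_by_parts(n):
    """Map each number of summands to how many compositions of n have that many."""
    counts = {}
    for c in compositions(n):
        counts[len(c)] = counts.get(len(c), 0) + 1
    return counts

print(sorted(count_by_parts(4).items()))      # [(1, 1), (2, 3), (3, 3), (4, 1)]
print(sum(1 for _ in compositions(4)))        # 8
```

The tallies 1, 3, 3, 1 are the coefficients of $x^4$ in $[x/(1-x)]^i$ for i = 1, 2, 3, 4, and their total is $2^{4-1} = 8$.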

EXAMPLE 9.13
Before we look at any specific compositions, let us start by examining identity 4 in Table 9.2. When x is replaced by 2 in this identity, the result tells us that for all $n \in \mathbf{Z}^+$, $1 + 2 + 2^2 + \cdots + 2^n = (1 - 2^{n+1})/(1 - 2) = 2^{n+1} - 1$. [This result was also established by the Principle of Mathematical Induction, in part (a) of Exercise 2 for Section 4.1.] All well and good, but where would one ever use such a formula? In Table 9.3 we find the special compositions of 6 and 7 that read the same left to right as right to left. These are the palindromes of 6 and 7. We find that for 7 there are $1 + (1 + 2 + 4) = 1 + (1 + 2^1 + 2^2) = 1 + (2^3 - 1) = 2^3$ palindromes. There is one palindrome with one summand, namely, 7. There is also one palindrome where the center summand is 5 and where we place the one composition of 1 on either side of this summand.

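Table 9.3

Palindromes of 6                         Palindromes of 7
1) 6                          (1)        1) 7                              (1)
2) 1 + 4 + 1                  (1)        2) 1 + 5 + 1                      (1)
3) 2 + 2 + 2                  (2)        3) 2 + 3 + 2                      (2)
4) 1 + 1 + 2 + 1 + 1                     4) 1 + 1 + 3 + 1 + 1
5) 3 + 3                      (4)        5) 3 + 1 + 3                      (4)
6) 1 + 2 + 2 + 1                         6) 1 + 2 + 1 + 2 + 1
7) 2 + 1 + 1 + 2                         7) 2 + 1 + 1 + 1 + 2
8) 1 + 1 + 1 + 1 + 1 + 1                 8) 1 + 1 + 1 + 1 + 1 + 1 + 1

(The number in parentheses counts the palindromes in each group: rows 3 and 4 form a group of 2, and rows 5 through 8 form a group of 4.)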

For the center summand 3 we place one of the two compositions of 2 on the right (of 3) and then match it on the left, with the same composition, in reverse order. This procedure provides the third and fourth palindromes of 7 in the table. Finally, when the center summand is 1, we put a given composition of 3 on the right of this 1 and match it on the left with the same composition, in reverse order. There are $2^{3-1} = 4$ compositions of 3, so this procedure results in the last four palindromes of 7 in the table.
    The situation is similar for the palindromes of 6 except for the case where, instead of 0 as the center summand, a plus sign appears in the center. Here we obtain the last $2^{3-1} = 4$ palindromes of 6 in the table, one for each composition of 3. Summarizing for n = 6 we have
      i) Center summand 6                1 palindrome
     ii) Center summand 4                1 (= $2^{1-1}$) palindrome
    iii) Center summand 2                2 (= $2^{2-1}$) palindromes
     iv) Plus sign at the center         4 (= $2^{3-1}$) palindromes

So there are $1 + (1 + 2^1 + 2^2) = 1 + (2^3 - 1) = 2^3$ palindromes for 6.
    Now we look at the general situation. For n = 1 there is one palindrome. If n = 2k + 1, for $k \in \mathbf{Z}^+$, then there is one palindrome with center summand n. For $1 \le t \le k$, there are $2^{t-1}$ palindromes of n with center summand n − 2t. (One palindrome for each of the $2^{t-1}$ compositions of t.) Hence the total number of palindromes of n is $1 + (1 + 2^1 + 2^2 + \cdots + 2^{k-1}) = 1 + (2^k - 1) = 2^k = 2^{(n-1)/2}$. Now consider n even, say n = 2k, for $k \in \mathbf{Z}^+$. Here there is also one palindrome with center summand n and, for $1 \le s \le k - 1$, there are $2^{s-1}$ palindromes of n with center summand n − 2s. (One palindrome for each of the $2^{s-1}$ compositions of s.) In addition, there are $2^{k-1}$ palindromes where a plus sign is at the center. (One palindrome for each of the $2^{k-1}$ compositions of k.) In total, n has $1 + (1 + 2^1 + 2^2 + \cdots + 2^{k-2} + 2^{k-1}) = 1 + (2^k - 1) = 2^k = 2^{n/2}$ palindromes.
    The preceding results can be simplified. Observe that for $n \in \mathbf{Z}^+$, n has $2^{\lfloor n/2 \rfloor}$ palindromes.
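The formula $2^{\lfloor n/2 \rfloor}$ can be tested by brute force for small n. The Python sketch below is ours and rests on one assumption worth stating: each composition of n corresponds to a choice of which of the n − 1 gaps between n unit blocks receive a cut, so a bitmask over the gaps enumerates all $2^{n-1}$ compositions. The names `compositions` and `count_palindromes` are hypothetical.

```python
def compositions(n):
    """All compositions of n >= 1, one per subset of the n - 1 gaps between unit blocks."""
    result = []
    for mask in range(1 << (n - 1)):
        parts, current = [], 1
        for gap in range(n - 1):
            if (mask >> gap) & 1:     # cut here: close the current summand
                parts.append(current)
                current = 1
            else:                     # no cut: the summand grows by one
                current += 1
        parts.append(current)
        result.append(tuple(parts))
    return result

def count_palindromes(n):
    """Number of compositions of n that read the same in both directions."""
    return sum(1 for c in compositions(n) if c == c[::-1])

print(count_palindromes(6), count_palindromes(7))   # 8 8
print(all(count_palindromes(n) == 2 ** (n // 2) for n in range(1, 13)))   # True
```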

Having dealt with compositions (once again) and palindromes, we continue at this point
                             with some additional examples dealing with generating functions.

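EXAMPLE 9.14
In how many ways can a police captain distribute 24 rifle shells to four police officers so that each officer gets at least three shells, but not more than eight?
    The choices for the number of shells each officer receives are given by $x^3 + x^4 + \cdots + x^8$. There are four officers, so the resulting generating function is

$$f(x) = (x^3 + x^4 + \cdots + x^8)^4.$$

    We seek the coefficient of $x^{24}$ in f(x). With $(x^3 + x^4 + \cdots + x^8)^4 = x^{12}(1 + x + x^2 + \cdots + x^5)^4 = x^{12}((1 - x^6)/(1 - x))^4$, the answer is the coefficient of $x^{12}$ in

$$(1 - x^6)^4 \cdot (1 - x)^{-4} = \left[1 - \binom{4}{1}x^6 + \binom{4}{2}x^{12} - \binom{4}{3}x^{18} + x^{24}\right]\left[\binom{-4}{0} + \binom{-4}{1}(-x) + \binom{-4}{2}(-x)^2 + \cdots\right],$$

which is $\left[\binom{-4}{12}(-1)^{12} - \binom{4}{1}\binom{-4}{6}(-1)^6 + \binom{4}{2}\binom{-4}{0}\right] = \left[\binom{15}{12} - \binom{4}{1}\binom{9}{6} + \binom{4}{2}\right] = 125$.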
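Since only $6^4 = 1296$ allocations are possible, the answer of Example 9.14 can be confirmed by exhaustive search. The sketch below is ours (the function name `count_distributions` is hypothetical); it tries every way of giving each officer between 3 and 8 shells and keeps those totalling 24, then compares the count with the binomial expression obtained above.

```python
from itertools import product
from math import comb

def count_distributions(total, officers, low, high):
    """Count tuples (one entry per officer) with low <= entry <= high summing to total."""
    return sum(
        1
        for shares in product(range(low, high + 1), repeat=officers)
        if sum(shares) == total
    )

brute_force = count_distributions(24, 4, 3, 8)
formula = comb(15, 12) - comb(4, 1) * comb(9, 6) + comb(4, 2)
print(brute_force, formula)   # 125 125
```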

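EXAMPLE 9.15
Verify that for all $n \in \mathbf{Z}^+$, $\binom{2n}{n} = \sum_{i=0}^{n} \binom{n}{i}^2$.
    Since $(1+x)^{2n} = [(1+x)^n]^2$, by comparison of coefficients (of like powers of x), the coefficient of $x^n$ in $(1+x)^{2n}$, which is $\binom{2n}{n}$, must equal the coefficient of $x^n$ in $\left[\binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n\right]^2$, and this is $\binom{n}{0}\binom{n}{n} + \binom{n}{1}\binom{n}{n-1} + \binom{n}{2}\binom{n}{n-2} + \cdots + \binom{n}{n}\binom{n}{0}$. With $\binom{n}{r} = \binom{n}{n-r}$, for all $0 \le r \le n$, the result follows.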

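EXAMPLE 9.16
Determine the coefficient of $x^8$ in

$$\frac{1}{(x-3)(x-2)^2}.$$

    Since $1/(x-a) = (-1/a)(1/(1 - (x/a))) = (-1/a)[1 + (x/a) + (x/a)^2 + \cdots]$ for any $a \ne 0$, we could solve this problem by finding the coefficient of $x^8$ in $1/[(x-3)(x-2)^2]$ expressed as $(-1/3)[1 + (x/3) + (x/3)^2 + \cdots](1/4)\left[\binom{-2}{0} + \binom{-2}{1}(-x/2) + \binom{-2}{2}(-x/2)^2 + \cdots\right]$.
    An alternative technique uses the partial fraction decomposition:

$$\frac{1}{(x-3)(x-2)^2} = \frac{A}{x-3} + \frac{B}{x-2} + \frac{C}{(x-2)^2}.$$

This decomposition implies that

$$1 = A(x-2)^2 + B(x-2)(x-3) + C(x-3),$$

or

$$0 \cdot x^2 + 0 \cdot x + 1 = 1 = (A + B)x^2 + (-4A - 5B + C)x + (4A + 6B - 3C).$$

    By comparing coefficients (for $x^2$, x, and 1, respectively), we find that A + B = 0, −4A − 5B + C = 0, and 4A + 6B − 3C = 1. Solving these equations yields A = 1, B = −1, and C = −1. Hence

$$\frac{1}{(x-3)(x-2)^2} = \frac{1}{x-3} - \frac{1}{x-2} - \frac{1}{(x-2)^2}$$
$$= \left(\frac{-1}{3}\right)\frac{1}{1 - (x/3)} + \left(\frac{1}{2}\right)\frac{1}{1 - (x/2)} + \left(\frac{-1}{4}\right)\frac{1}{(1 - (x/2))^2}$$
$$= \left(\frac{-1}{3}\right)\sum_{i=0}^{\infty}\left(\frac{x}{3}\right)^i + \left(\frac{1}{2}\right)\sum_{i=0}^{\infty}\left(\frac{x}{2}\right)^i + \left(\frac{-1}{4}\right)\left[\binom{-2}{0} + \binom{-2}{1}\left(\frac{-x}{2}\right) + \binom{-2}{2}\left(\frac{-x}{2}\right)^2 + \cdots\right].$$

    The coefficient of $x^8$ is $(-1/3)(1/3)^8 + (1/2)(1/2)^8 + (-1/4)\binom{-2}{8}(-1/2)^8 = -[(1/3)^9 + 7(1/2)^{10}]$.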
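The value found in Example 9.16 can be cross-checked without partial fractions. The sketch below is ours and uses a different technique from the text: it expands the denominator as $(x-3)(x-2)^2 = x^3 - 7x^2 + 16x - 12$ and computes the reciprocal power series directly from the requirement that (denominator) × (series) = 1, using exact fractions. The name `reciprocal_series` is hypothetical.

```python
from fractions import Fraction

def reciprocal_series(den, terms):
    """First `terms` power-series coefficients of 1/den(x); requires den[0] != 0."""
    inv0 = Fraction(1, den[0])
    out = [inv0]
    for n in range(1, terms):
        # The coefficient of x^n in den(x) * series must vanish for n >= 1.
        acc = sum(den[k] * out[n - k] for k in range(1, min(n, len(den) - 1) + 1))
        out.append(-acc * inv0)
    return out

# (x - 3)(x - 2)^2 = -12 + 16x - 7x^2 + x^3, stored lowest power first
coeffs = reciprocal_series([-12, 16, -7, 1], 9)
closed_form = -(Fraction(1, 3) ** 9 + 7 * Fraction(1, 2) ** 10)

print(coeffs[8] == closed_form)   # True
```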

EXAMPLE 9.17
Use generating functions to determine how many four-element subsets of S = {1, 2, 3, ..., 15} contain no consecutive integers.

a) Consider one such subset (say {1, 3, 7, 10}), and write $1 \le 1 < 3 < 7 < 10 \le 15$. We see that this set of inequalities determines the differences 1 − 1 = 0, 3 − 1 = 2, 7 − 3 = 4, 10 − 7 = 3, and 15 − 10 = 5, and these differences sum to 14. Considering another such subset, say {2, 5, 11, 15}, we write $1 \le 2 < 5 < 11 < 15 \le 15$; these inequalities yield the differences 1, 3, 6, 4, and 0, which also sum to 14.
   Turning things around, we find that the nonnegative integers 0, 2, 3, 2, and 7 sum to 14 and they are the differences that arise from the inequalities $1 \le 1 < 3 < 6 < 8 \le 15$ (for the subset {1, 3, 6, 8}).
   These examples suggest a one-to-one correspondence between the four-element subsets to be counted and the integer solutions to $c_1 + c_2 + c_3 + c_4 + c_5 = 14$ where $0 \le c_1, c_5$, and $2 \le c_2, c_3, c_4$. (Note: Here $c_2, c_3, c_4 \ge 2$ guarantee that there are no consecutive integers in the subset.) The answer is the coefficient of $x^{14}$ in

$$f(x) = (1 + x + x^2 + x^3 + \cdots)(x^2 + x^3 + x^4 + \cdots)^3(1 + x + x^2 + x^3 + \cdots) = x^6(1-x)^{-5}.$$

   This then is the coefficient of $x^8$ in $(1-x)^{-5}$, which is $\binom{-5}{8}(-1)^8 = \binom{5+8-1}{8} = \binom{12}{8} = 495$.
                    b) Another way to look at the problem is as follows.
                          For the subset {1, 3, 7, 10}, we examine the strict inequalities O0< 1<3<7<
                       10 < 16 and consider how many integers there are strictly between each successive
                       pair of these numbers. Here we get 0, 1, 3, 2, and 5: 0 because there is no integer
                       between 0 and 1, 1 for the integer 2 between 1 and 3, 3 for the integers 4, 5, 6 between
                       3 and 7, and so on. These five integers sum to 11. When we do the same thing for the
                       subset {2, 5, 11, 15}, the strict inequalities 0 < 2 <5 < 11 < 15 < 16 yield the results
                       1, 2, 5, 3, and 0, which also sum to 11.
428         Chapter 9 Generating Functions

On the other hand, we find that the nonnegative integers 0, 1, 2, 1, and 7 add up to 11 and they arise as the numbers of distinct integers between the integers in the five successive strict inequalities 0 < 1 < 3 < 6 < 8 < 16. These correspond to the subset {1, 3, 6, 8}.

These results suggest a one-to-one correspondence between the desired subsets and the integer solutions to $b_1 + b_2 + b_3 + b_4 + b_5 = 11$, where $0 \le b_1, b_5$ and $1 \le b_2, b_3, b_4$. (Note: In this case, $b_2, b_3, b_4 \ge 1$ guarantee that there are no consecutive integers in the subset.) The number of these solutions is the coefficient of $x^{11}$ in

$$g(x) = (1 + x + x^2 + \cdots)(x + x^2 + x^3 + \cdots)^3(1 + x + x^2 + \cdots) = x^3(1 - x)^{-5}.$$

The answer is $\binom{-5}{8}(-1)^8 = 495$, as above. (The reader may now wish to look back at Supplementary Exercise 15 in Chapter 3.)
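The count of 495 and the gap sequences used in part (b) can be checked by direct enumeration. The following is a minimal Python sketch, not part of the text's method; the function names `count_nonconsecutive_subsets` and `gaps` are our own, chosen for illustration.

```python
from itertools import combinations
from math import comb


def count_nonconsecutive_subsets(n, k):
    """Count k-element subsets of {1, ..., n} with no two consecutive integers."""
    return sum(
        1
        for subset in combinations(range(1, n + 1), k)
        if all(b - a >= 2 for a, b in zip(subset, subset[1:]))
    )


def gaps(subset, n):
    """Numbers of integers strictly between successive entries of 0 < s1 < ... < sk < n + 1."""
    bounds = (0, *subset, n + 1)
    return [b - a - 1 for a, b in zip(bounds, bounds[1:])]


print(count_nonconsecutive_subsets(15, 4), comb(12, 8))  # 495 495
print(gaps((1, 3, 7, 10), 15))                           # [0, 1, 3, 2, 5]
```

Each gap list has five nonnegative entries that sum to 11, with the three middle entries at least 1, which is the condition imposed on $b_1, \ldots, b_5$.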

Our next example takes us back to the optional material in Chapter 3 where we first encountered the idea of the sample space. But now that we know about generating functions we will be able to deal with a sample space that is discrete but not finite, that is, a countably infinite† sample space.

EXAMPLE 9.18*
a) Suppose that Brianna takes an actuarial examination until she passes it. Further, suppose the probability that Brianna passes the examination on any given attempt is 0.8 and that the result of each attempt, after the first, is independent of any previous attempt. If we let P denote "pass" and F denote "fail," for any given attempt, then here our sample space may be expressed as $\mathcal{S}$ = {P, FP, FFP, FFFP, ...}, where, for example, Pr(FFP), the probability Brianna fails the exam twice before she passes it, is given by $(0.2)^2(0.8)$. In addition, the sum of the probabilities for the outcomes in $\mathcal{S}$ is

$$(0.8) + (0.2)(0.8) + (0.2)^2(0.8) + (0.2)^3(0.8) + \cdots = \sum_{i=0}^{\infty}(0.2)^i(0.8) = (0.8)\sum_{i=0}^{\infty}(0.2)^i = (0.8)\frac{1}{1 - 0.2} = (0.8)\frac{1}{0.8} = 1,$$

as it should be, for according to the second axiom of probability (in Section 3.5) we expect $\Pr(\mathcal{S}) = 1$. [Note that $\sum_{i=0}^{\infty}(0.2)^i = \frac{1}{1 - 0.2}$ follows from the result in part (b) of Example 9.5. The given geometric series converges to $\frac{1}{1 - 0.2}$ because $|0.2| < 1$.]
b) Now suppose we want to know the probability Brianna passes the exam on an even-numbered attempt. That is, we want Pr(A) where A is the event {FP, FFFP, ...}.
At this point let us introduce the discrete random variable Y where Y counts the number of attempts up to and including the one where Brianna passes the exam. Then the probability distribution for Y is given by $\Pr(Y = y) = (0.2)^{y-1}(0.8)$, $y \ge 1$. So Pr(A) can be determined as follows:

$$\Pr(A) = \sum_{i=1}^{\infty}\Pr(Y = 2i) = \sum_{i=1}^{\infty}(0.2)^{2i-1}(0.8) = (0.8)\sum_{i=1}^{\infty}(0.2)^{2i-1} = 0.8[(0.2) + (0.2)^3 + (0.2)^5 + \cdots]$$
$$= (0.8)(0.2)[1 + (0.2)^2 + (0.2)^4 + \cdots] = (0.8)(0.2)\frac{1}{1 - (0.2)^2} = \frac{0.16}{0.96} = \frac{1}{6}.$$

And once again we have used the result in part (b) of Example 9.5, this time with $x = (0.2)^2$, where $|(0.2)^2| = |0.04| < 1$.

†The reader can learn more about countably infinite sets from the material in Appendix 3.
*This example uses material from the optional sections of Chapter 3. It may be skipped without any loss of continuity.
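The value of Pr(A) in part (b) can be confirmed with exact rational arithmetic. This is a small Python sketch under our own naming (`pr_y`, `closed_form`, and `partial_sum` are illustrative, not notation from the text): it evaluates the closed form and compares it with a partial sum over the even-numbered attempts.

```python
from fractions import Fraction

p = Fraction(4, 5)   # probability of passing on any attempt (0.8)
q = 1 - p            # probability of failing (0.2)


def pr_y(y):
    """Pr(Y = y) = q^(y-1) * p for y >= 1."""
    return q ** (y - 1) * p


# Closed form from the geometric series with ratio q^2.
closed_form = p * q / (1 - q ** 2)

# Partial sum over the even-numbered attempts 2, 4, ..., 40.
partial_sum = sum(pr_y(2 * i) for i in range(1, 21))

print(closed_form)                        # 1/6
print(float(closed_form - partial_sum))   # remaining tail, positive and tiny
```

Because the ratio of successive terms is $(0.2)^2 = 0.04$, twenty terms already agree with the closed form to far more digits than a float can display.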

c) Continuing with Y, now we'd like to find E(Y), the number of times Brianna expects to take the actuarial exam before she passes it. To determine E(Y) we'll start with the formula $1/(1 - t) = 1 + t + t^2 + t^3 + \cdots$ and go one step further. Taking the derivative of both sides, we find [as in Example 9.5(c)] that

$$\frac{1}{(1 - t)^2} = (-1)(1 - t)^{-2}(-1) = \frac{d}{dt}\left[\frac{1}{1 - t}\right] = 1 + 2t + 3t^2 + 4t^3 + \cdots,$$

where this series likewise converges† for $|t| < 1$. Therefore,

$$E(Y) = \sum_{y=1}^{\infty} y\Pr(Y = y) = \sum_{y=1}^{\infty} y(0.2)^{y-1}(0.8) = (0.8)\sum_{y=1}^{\infty} y(0.2)^{y-1}$$
$$= (0.8)[1 + 2(0.2) + 3(0.2)^2 + 4(0.2)^3 + \cdots] = (0.8)\frac{1}{(1 - 0.2)^2} = \frac{1}{0.8} = \frac{5}{4} = 1.25.$$

So Brianna expects to take the exam 1.25 times before she passes it.
d) Finally, to determine Var(Y) we first want to find $E(Y^2)$. To do so we first multiply the result in part (c) by t and find [as in Example 9.5(c)] that

$$\frac{t}{(1 - t)^2} = t + 2t^2 + 3t^3 + 4t^4 + \cdots.$$

Differentiating both sides of this equation now gives us

$$\frac{(1 - t)^2(1) - t(2)(1 - t)(-1)}{(1 - t)^4} = \frac{1 + t}{(1 - t)^3} = \frac{d}{dt}\left[\frac{t}{(1 - t)^2}\right] = 1^2 + 2^2t + 3^2t^2 + 4^2t^3 + \cdots,$$

and this series is also convergent* for $|t| < 1$. So now we have

$$E(Y^2) = \sum_{y=1}^{\infty} y^2\Pr(Y = y) = \sum_{y=1}^{\infty} y^2(0.2)^{y-1}(0.8) = (0.8)\sum_{y=1}^{\infty} y^2(0.2)^{y-1}$$
$$= (0.8)[1^2 + 2^2(0.2) + 3^2(0.2)^2 + 4^2(0.2)^3 + \cdots] = (0.8)\left[\frac{1 + 0.2}{(1 - 0.2)^3}\right] = \frac{1.2}{(0.8)^2} = \frac{15}{8}.$$

†Using the Ratio Test from calculus, one finds that

$$\lim_{n\to\infty}\left|\frac{(n + 1)t^{n}}{nt^{n-1}}\right| = |t|\lim_{n\to\infty}\frac{n + 1}{n} = |t|\lim_{n\to\infty}\left(1 + \frac{1}{n}\right) = |t|(1) = |t|.$$

When $t = \pm 1$, $\lim_{n\to\infty} nt^{n-1} \ne 0$, so the series does not converge for $t = \pm 1$. Consequently, this infinite series converges for $|t| < 1$.

*Once again we use the Ratio Test from calculus. Here

$$\lim_{n\to\infty}\left|\frac{(n + 1)^2t^{n}}{n^2t^{n-1}}\right| = |t|\lim_{n\to\infty}\left(\frac{n + 1}{n}\right)^2 = |t|\lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^2 = |t|(1)^2 = |t|.$$

When $t = \pm 1$, $\lim_{n\to\infty} n^2t^{n-1} \ne 0$, so the series does not converge for $t = \pm 1$. Consequently, this infinite series converges for $|t| < 1$.

Consequently,

$$\operatorname{Var}(Y) = E(Y^2) - [E(Y)]^2 = \frac{15}{8} - \left(\frac{5}{4}\right)^2 = \frac{5}{16}.$$
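The closed forms from parts (c) and (d) can be checked against truncated sums. The sketch below is only a numerical sanity check in Python, with variable names of our own choosing. It evaluates $p/(1-q)^2$ and $p(1+q)/(1-q)^3$ exactly for $p = 0.8$, $q = 0.2$, and compares them with the first 200 terms of each series.

```python
from fractions import Fraction

p = Fraction(4, 5)   # probability of success (0.8)
q = 1 - p            # probability of failure (0.2)

# Closed forms obtained by differentiating the geometric series.
e_y = p / (1 - q) ** 2               # (0.8) * 1/(1 - 0.2)^2
e_y2 = p * (1 + q) / (1 - q) ** 3    # (0.8) * (1 + 0.2)/(1 - 0.2)^3
var_y = e_y2 - e_y ** 2

# Truncated series, in floating point, for comparison.
terms = range(1, 201)
approx_e_y = sum(y * float(q) ** (y - 1) * float(p) for y in terms)
approx_e_y2 = sum(y * y * float(q) ** (y - 1) * float(p) for y in terms)

print(e_y, e_y2, var_y)   # 5/4 15/8 5/16
print(approx_e_y, approx_e_y2)
```

The variance agrees with $q/p^2 = 0.2/0.64$, the general expression for a geometric random variable.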

The preceding example introduced us to a new discrete random variable — namely, the
                             geometric random variable. In this situation we perform a Bernoulli trial until we are
                             successful (for the first time). As with the binomial random variable the outcome of each
                             trial, after the first, is independent of the outcome for any previous trial. Further, the proba-
                             bility of success for each Bernoulli trial is p, and the probability of failure is g = 1 — p.
                                  If we let the random variable Y count the number of trials until we are finally successful,
                             then Y is a discrete random variable with probability distribution given by

$$\Pr(Y = y) = q^{y-1}p, \qquad y = 1, 2, 3, \ldots.$$

In addition, we find that

$$E(Y) = \frac{1}{p} \qquad\text{and}\qquad \operatorname{Var}(Y) = \frac{q}{p^2}.$$
                             The following example uses the last identity in Table 9.2. (This identity was used earlier in
                             Examples 9.14 and 9.15 — but rather implicitly.)

EXAMPLE 9.19
Let $f(x) = x/(1 - x)^2$. This is the generating function for the sequence $a_0, a_1, a_2, \ldots$, where $a_k = k$ for all $k \in \mathbf{N}$. The function $g(x) = x(x + 1)/(1 - x)^3$ generates the sequence $b_0, b_1, b_2, \ldots$, for $b_k = k^2$, $k \in \mathbf{N}$.
The function $h(x) = f(x)g(x)$ consequently gives us $a_0b_0 + (a_0b_1 + a_1b_0)x + (a_0b_2 + a_1b_1 + a_2b_0)x^2 + \cdots$, so $h(x)$ is the generating function for the sequence $c_0, c_1, c_2, \ldots$, where for each $k \in \mathbf{N}$,

$$c_k = a_0b_k + a_1b_{k-1} + a_2b_{k-2} + \cdots + a_{k-2}b_2 + a_{k-1}b_1 + a_kb_0.$$

Here, for example, we find that

$$c_0 = 0 \cdot 0^2 = 0$$
$$c_1 = 0 \cdot 1^2 + 1 \cdot 0^2 = 0$$
$$c_2 = 0 \cdot 2^2 + 1 \cdot 1^2 + 2 \cdot 0^2 = 1$$
$$c_3 = 0 \cdot 3^2 + 1 \cdot 2^2 + 2 \cdot 1^2 + 3 \cdot 0^2 = 6$$

and, in general, $c_k = \sum_{i=0}^{k} i(k - i)^2$. (We shall simplify this summation formula in the Section Exercises.)
Whenever a sequence $c_0, c_1, c_2, \ldots$ arises from two generating functions $f(x)$ [for $a_0, a_1, a_2, \ldots$] and $g(x)$ [for $b_0, b_1, b_2, \ldots$], as in this example, the sequence $c_0, c_1, c_2, \ldots$ is called the convolution of the sequences $a_0, a_1, a_2, \ldots$ and $b_0, b_1, b_2, \ldots$.
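The convolution in this example can be computed directly from its definition. Here is a short Python sketch; `convolve` is a helper we define for illustration, and it works on finite prefixes of the two sequences.

```python
def convolve(a, b):
    """Return c_0, ..., c_{n-1}, where c_k = a_0*b_k + a_1*b_{k-1} + ... + a_k*b_0."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]


N = 8
a = [k for k in range(N)]        # a_k = k, generated by x/(1 - x)^2
b = [k * k for k in range(N)]    # b_k = k^2, generated by x(x + 1)/(1 - x)^3
c = convolve(a, b)

print(c[:4])   # [0, 0, 1, 6]
```

Only the first $k + 1$ terms of each sequence are needed for $c_k$, so truncating both sequences to the same length loses nothing for the coefficients that are returned.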

Our last example provides one more instance of the convolution of sequences.

EXAMPLE 9.20
For $f(x) = 1/(1 - x) = 1 + x + x^2 + x^3 + \cdots$ and $g(x) = 1/(1 + x) = 1 - x + x^2 - x^3 + \cdots$, we find that

$$f(x)g(x) = 1/[(1 - x)(1 + x)] = 1/(1 - x^2) = 1 + x^2 + x^4 + x^6 + \cdots.$$

Consequently, the sequence 1, 0, 1, 0, 1, 0, ... is the convolution of the sequences 1, 1, 1, 1, 1, 1, ... and 1, -1, 1, -1, 1, -1, ....
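As a quick check of this example, the first ten terms of the convolution can be computed from the defining sum. The Python lines below are a sketch, with `N`, `a`, `b`, and `c` as our own names.

```python
N = 10
a = [1] * N                          # coefficients of 1/(1 - x)
b = [(-1) ** k for k in range(N)]    # coefficients of 1/(1 + x)

# c_k = a_0*b_k + a_1*b_{k-1} + ... + a_k*b_0
c = [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(N)]

print(c)   # [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
```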

EXERCISES 9.2

1. Find generating functions for the following sequences. [For example, in the case of the sequence 0, 1, 3, 9, 27, ..., the answer required is $x/(1 - 3x)$, not $\sum_{i=0}^{\infty} 3^ix^{i+1}$ or simply $0 + x + 3x^2 + 9x^3 + \cdots$.]
    a) $\binom{8}{0}, \binom{8}{1}, \binom{8}{2}, \ldots, \binom{8}{8}$
    b) $\binom{8}{1}, 2\binom{8}{2}, 3\binom{8}{3}, \ldots, 8\binom{8}{8}$
    c) 1, -1, 1, -1, 1, -1, ...
    d) 0, 0, 0, 6, -6, 6, -6, 6, ...
    e) 1, 0, 1, 0, 1, 0, 1, ...
    f) $0, 0, 1, a, a^2, a^3, \ldots$, $a \ne 0$

2. Determine the sequence generated by each of the following generating functions.
    a) $f(x) = (2x - 3)^3$
    b) $f(x) = x^4/(1 - x)$
    c) $f(x) = x^3/(1 - x^2)$
    d) $f(x) = 1/(1 + 3x)$
    e) $f(x) = 1/(3 - x)$
    f) $f(x) = 1/(1 - x) + 3x^7 - 11$

3. In each of the following, the function $f(x)$ is the generating function for the sequence $a_0, a_1, a_2, \ldots$, whereas the sequence $b_0, b_1, b_2, \ldots$ is generated by the function $g(x)$. Express $g(x)$ in terms of $f(x)$.
    a) $b_3 = 3$; $b_n = a_n$, $n \in \mathbf{N}$, $n \ne 3$
    b) $b_3 = 3$; $b_7 = 7$; $b_n = a_n$, $n \in \mathbf{N}$, $n \ne 3, 7$
    c) $b_1 = 1$; $b_3 = 3$; $b_n = 2a_n$, $n \in \mathbf{N}$, $n \ne 1, 3$
    d) $b_1 = 1$; $b_3 = 3$; $b_7 = 7$; $b_n = 2a_n + 5$, $n \in \mathbf{N}$, $n \ne 1, 3, 7$

4. Determine the constant (that is, the coefficient of $x^0$) in $(3x^2 - (2/x))^{15}$.

5. a) Find the coefficient of $x^7$ in $(1 + x + x^2 + x^3 + \cdots)^{15}$.
    b) Find the coefficient of $x^7$ in $(1 + x + x^2 + x^3 + \cdots)^n$ for $n \in \mathbf{Z}^+$.

6. Find the coefficient of $x^{50}$ in $(x^7 + x^8 + x^9 + \cdots)^6$.

7. Find the coefficient of $x^{20}$ in $(x^2 + x^3 + x^4 + x^5 + x^6)^5$.

8. For $n \in \mathbf{Z}^+$, find in $(1 + x + x^2)(1 + x)^n$ the coefficient of (a) $x^7$; (b) $x^8$; and (c) $x^r$ for $0 \le r \le n + 2$, $r \in \mathbf{Z}$.

9. Find the coefficient of $x^{15}$ in each of the following.
    a) $x^3(1 - 2x)^{10}$
    b) $(x^3 - 5x)/(1 - x)^3$
    c) $(1 + x)^4/(1 - x)^4$

10. In how many ways can two dozen identical robots be assigned to four assembly lines with (a) at least three robots assigned to each line? (b) at least three, but no more than nine, robots assigned to each line?

11. In how many ways can 3000 identical envelopes be divided, in packages of 25, among four student groups so that each group gets at least 150, but not more than 1000, of the envelopes?

12. Two cases of soft drinks, 24 bottles of one type and 24 of another, are distributed among five surveyors who are conducting taste tests. In how many ways can the 48 bottles be distributed so that each surveyor gets (a) at least two bottles of each type? (b) at least two bottles of one particular type and at least three of the other?

13. If a fair die is rolled 12 times, what is the probability that the sum of the rolls is 30?

14. Carol is collecting money from her cousins to have a party for her aunt. If eight of the cousins promise to give $2, $3, $4, or $5 each, and two others each give $5 or $10, what is the probability that Carol will collect exactly $40?

15. In how many ways can Traci select n marbles from a large supply of blue, red, and yellow marbles (all of the same size) if the selection must include an even number of blue ones?

16. How can Mary split up 12 hamburgers and 16 hot dogs among her sons Richard, Peter, Christopher, and James in such a way that James gets at least one hamburger and three hot dogs, and each of his brothers gets at least two hamburgers but at most five hot dogs?

17. Verify that $(1 - x - x^2 - x^3 - x^4 - x^5 - x^6)^{-1}$ is the generating function for the number of ways the sum n, where $n \in \mathbf{N}$, can be obtained when a single die is rolled an arbitrary number of times.

18. Show that $(1 - 4x)^{-1/2}$ generates the sequence $\binom{2n}{n}$, $n \in \mathbf{N}$.

19. a) If a computer generates a random composition of 8, what is the probability the composition is a palindrome?
    b) Answer the question in part (a) after replacing 8 by n, a fixed positive integer.

20. a) How many palindromes of 11 start with 1? with 2? with 3? with 4?
    b) How many palindromes of 12 start with 1? with 2? with 3? with 4?

21. Let n be a (fixed) positive integer, with $n \ge 2$. If $1 \le t \le \lfloor n/2 \rfloor$, how many palindromes of n start with t?

22. Let $n \in \mathbf{Z}^+$, n odd. Can a palindrome of n have an even number of summands?

23. Let $n \in \mathbf{Z}^+$, n even. How many palindromes of n have an even number of summands? How many have an odd number of summands?

24. Determine the number of palindromes of n, where all summands are even, for (a) n = 10; (b) n = 12; and (c) n even.

25. Shay rolls a fair die until she gets a 6. If the random variable Y counts the number of times Shay rolls the die until she gets her first 6, determine (a) the probability distribution for Y; (b) E(Y); and (c) $\sigma_Y$.

26. Referring back to the preceding exercise, what is the probability Shay rolls her first 6 on an even-numbered roll?

27. Leroy has a biased coin where $\Pr(H) = \frac{2}{3}$ and $\Pr(T) = \frac{1}{3}$. Assuming that each toss, after the first, is independent of any previous outcome, if Leroy tosses the coin until he gets a tail, what is the probability he tosses it an odd number of times?

28. If Y is a geometric random variable with $E(Y) = \frac{7}{3}$, determine (a) $\Pr(Y = 3)$; (b) $\Pr(Y \ge 3)$; (c) $\Pr(Y \ge 5)$; (d) $\Pr(Y \ge 5 \mid Y \ge 3)$; (e) $\Pr(Y \ge 6 \mid Y \ge 4)$; and (f) $\sigma_Y$.

29. Consider part (a) of Example 9.17.
    a) Determine the differences for the inequalities that result from the subset {3, 6, 8, 15} of S, and verify that those differences add to the correct sum.
    b) Find the subset of S that determines the differences 2, 2, 3, 7, and 0.
    c) Find the subset of S that determines the differences a, b, c, d, and e, where $0 \le a, e$, and $2 \le b, c, d$.

30. In how many ways can we select seven nonconsecutive integers from {1, 2, 3, ..., 50}?

31. Use the following summation formulas to simplify the expression for $c_k$ in Example 9.19:

$$\sum_{i=0}^{k} i = \sum_{i=1}^{k} i = \frac{k(k + 1)}{2}, \qquad \sum_{i=0}^{k} i^2 = \sum_{i=1}^{k} i^2 = \frac{k(k + 1)(2k + 1)}{6}, \qquad\text{and}\qquad \sum_{i=0}^{k} i^3 = \sum_{i=1}^{k} i^3 = \frac{k^2(k + 1)^2}{4}.$$

32. a) Find the first four terms $c_0$, $c_1$, $c_2$, and $c_3$ of the convolutions for each of the following pairs of sequences.
        i) $a_n = 1$, $b_n = 1$, for all $n \in \mathbf{N}$
        ii) $a_n = 1$, $b_n = 2^n$, for all $n \in \mathbf{N}$
        iii) $a_0 = a_1 = a_2 = a_3 = 1$; $a_n = 0$, $n \in \mathbf{N}$, $n \ne 0, 1, 2, 3$; $b_n = 1$, for all $n \in \mathbf{N}$
    b) Find a general formula for $c_n$ in each of the results of part (a).

33. Find a formula for the convolution of each of the following pairs of sequences.
    a) $a_n = 1$, $0 \le n \le 4$, $a_n = 0$, for all $n \ge 5$; $b_n = n$, for all $n \in \mathbf{N}$
    b) $a_n = (-1)^n$, $b_n = (-1)^n$, for all $n \in \mathbf{N}$

9.3 Partitions of Integers

In number theory, we are confronted with partitioning a positive integer n into positive summands and seeking the number of such partitions, without regard to order. This number is denoted by p(n). For example,

p(1) = 1:   1
p(2) = 2:   2 = 1 + 1
p(3) = 3:   3 = 2 + 1 = 1 + 1 + 1
p(4) = 5:   4 = 3 + 1 = 2 + 2 = 2 + 1 + 1 = 1 + 1 + 1 + 1
p(5) = 7:   5 = 4 + 1 = 3 + 2 = 3 + 1 + 1 = 2 + 2 + 1 = 2 + 1 + 1 + 1 = 1 + 1 + 1 + 1 + 1

We should like to obtain p(n) for a given n without having to list all the partitions. We need a tool to keep track of the numbers of 1's, 2's, ..., n's that are used as summands for n.

If $n \in \mathbf{Z}^+$, the number of 1's we can use is 0 or 1 or 2 or .... The power series $1 + x + x^2 + x^3 + x^4 + \cdots$ keeps account of this for us. In like manner, $1 + x^2 + x^4 + x^6 + \cdots$ keeps track of the number of 2's in the partition of n, while $1 + x^3 + x^6 + x^9 + \cdots$ accounts for the number of 3's. Therefore, in order to determine p(10), for instance, we want the coefficient of $x^{10}$ in

$$f(x) = (1 + x + x^2 + x^3 + \cdots)(1 + x^2 + x^4 + x^6 + \cdots)(1 + x^3 + x^6 + x^9 + \cdots) \cdots (1 + x^{10} + x^{20} + \cdots)$$

or in

$$g(x) = (1 + x + x^2 + x^3 + \cdots + x^{10})(1 + x^2 + x^4 + \cdots + x^{10})(1 + x^3 + x^6 + x^9) \cdots (1 + x^{10}).$$

We prefer to work with $f(x)$, because it can be written in the more compact form

$$f(x) = \frac{1}{(1 - x)}\,\frac{1}{(1 - x^2)}\,\frac{1}{(1 - x^3)} \cdots \frac{1}{(1 - x^{10})} = \prod_{i=1}^{10}\frac{1}{(1 - x^i)}.$$

If this product is extended beyond i = 10, we get $P(x) = \prod_{i=1}^{\infty}[1/(1 - x^i)]$, which generates the sequence p(0), p(1), p(2), p(3), ..., where we define p(0) = 1.
Unfortunately, it is impossible to actually calculate the infinite number of terms in the product P(x). If we consider only $\prod_{i=1}^{r}[1/(1 - x^i)]$ for some fixed r, then the coefficient of $x^n$ here is the number of partitions of n into summands that do not exceed r.
Despite the difficulty in calculating p(n) from P(x) for large values of n, the idea of the generating function will be useful in studying certain kinds of partitions.
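For small n the coefficients of the truncated product can be computed mechanically, one factor at a time. The following Python sketch (with `partition_counts` as our own illustrative name) multiplies the series 1 successively by $1/(1 - x^i)$ for $i = 1, \ldots, r$, keeping only the terms up to $x^{\text{max\_n}}$.

```python
def partition_counts(max_n, max_part=None):
    """Coefficients of x^0..x^max_n in the product of 1/(1 - x^i) for i = 1..max_part."""
    if max_part is None:
        max_part = max_n
    coeffs = [1] + [0] * max_n      # the constant series 1
    for part in range(1, max_part + 1):
        # Multiplying by 1/(1 - x^part) = 1 + x^part + x^(2*part) + ...
        for n in range(part, max_n + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs


p = partition_counts(10)
print(p[1:6])   # [1, 2, 3, 5, 7]
print(p[10])    # 42
```

Passing a smaller `max_part` gives the number of partitions into summands that do not exceed that value, which is the fixed-r product described above.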

EXAMPLE 9.21
Find the generating function for the number of ways an advertising agent can purchase n minutes ($n \in \mathbf{Z}^+$) of air time if time slots for commercials come in blocks of 30, 60, or 120 seconds.
Let 30 seconds represent one time unit. Then the answer is the number of integer solutions to the equation $a + 2b + 4c = 2n$ with $0 \le a, b, c$.
The associated generating function is

$$f(x) = (1 + x + x^2 + \cdots)(1 + x^2 + x^4 + \cdots)(1 + x^4 + x^8 + \cdots) = \frac{1}{1 - x}\,\frac{1}{1 - x^2}\,\frac{1}{1 - x^4},$$

and the coefficient of $x^{2n}$ is the number of partitions of 2n into 1's, 2's, and 4's, the answer to the problem.
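To see the coefficient of $x^{2n}$ in action, the sketch below computes it by multiplying out the three series and compares the result with a direct count of the solutions of $a + 2b + 4c = 2n$. It is written in Python; the function names `air_time_purchases` and `brute_force` are assumptions of ours, not notation from the text.

```python
def air_time_purchases(n):
    """Coefficient of x^(2n) in 1/((1 - x)(1 - x^2)(1 - x^4))."""
    total = 2 * n
    coeffs = [1] + [0] * total
    for block in (1, 2, 4):     # 30-, 60-, and 120-second slots in 30-second units
        for m in range(block, total + 1):
            coeffs[m] += coeffs[m - block]
    return coeffs[total]


def brute_force(n):
    """Count integer solutions of a + 2b + 4c = 2n with a, b, c >= 0."""
    total = 2 * n
    # Once b and c are chosen, a = total - 2b - 4c is determined and nonnegative.
    return sum(
        1
        for c in range(total // 4 + 1)
        for b in range((total - 4 * c) // 2 + 1)
    )


print(air_time_purchases(1), air_time_purchases(2))   # 2 4
```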

EXAMPLE 9.22
Find the generating function for $p_d(n)$, the number of partitions of a positive integer n into distinct summands.
Before we start, let us consider the 11 partitions of 6:

 1) 1 + 1 + 1 + 1 + 1 + 1        2) 1 + 1 + 1 + 1 + 2
 3) 1 + 1 + 1 + 3                4) 1 + 1 + 4
 5) 1 + 1 + 2 + 2                6) 1 + 5
 7) 1 + 2 + 3                    8) 2 + 2 + 2
 9) 2 + 4                       10) 3 + 3
11) 6

Partitions (6), (7), (9), and (11) have distinct summands, so $p_d(6) = 4$.

In calculating $p_d(n)$, for each $k \in \mathbf{Z}^+$ there are two choices: Either k is not used as one of the summands of n, or it is. This can be accounted for by the polynomial $1 + x^k$, and consequently, the generating function for these partitions is

$$P_d(x) = (1 + x)(1 + x^2)(1 + x^3) \cdots = \prod_{i=1}^{\infty}(1 + x^i).$$

For each $n \in \mathbf{Z}^+$, $p_d(n)$ is the coefficient of $x^n$ in $(1 + x)(1 + x^2) \cdots (1 + x^n)$. [We define $p_d(0) = 1$.] When n = 6, the coefficient of $x^6$ in $(1 + x)(1 + x^2) \cdots (1 + x^6)$ is 4.
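The list of partitions of 6 and the count $p_d(6) = 4$ can be reproduced by enumeration. The Python sketch below uses a recursive generator, `partitions`, which is our own helper; it lists the summands in nonincreasing order, so each partition appears in the reverse of the order written above.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as nonincreasing tuples of positive summands."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first, *rest)


all_parts = list(partitions(6))
distinct = [p for p in all_parts if len(set(p)) == len(p)]

print(len(all_parts))   # 11
print(distinct)         # [(6,), (5, 1), (4, 2), (3, 2, 1)]
```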

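EXAMPLE 9.23
Considering the partitions in Example 9.22, we see that there are four partitions of 6 into odd summands: namely, (1), (3), (6), and (10). We also have $p_d(6) = 4$. Is this a coincidence?
Let $p_o(n)$ denote the number of partitions of n into odd summands, when $n \ge 1$. We define $p_o(0) = 1$. The generating function for the sequence $p_o(0), p_o(1), p_o(2), \ldots$ is given by

$$P_o(x) = (1 + x + x^2 + x^3 + \cdots)(1 + x^3 + x^6 + \cdots)(1 + x^5 + x^{10} + \cdots)(1 + x^7 + x^{14} + \cdots) \cdots = \frac{1}{1 - x}\,\frac{1}{1 - x^3}\,\frac{1}{1 - x^5}\,\frac{1}{1 - x^7} \cdots.$$

Now because

$$1 + x = \frac{1 - x^2}{1 - x}, \qquad 1 + x^2 = \frac{1 - x^4}{1 - x^2}, \qquad 1 + x^3 = \frac{1 - x^6}{1 - x^3}, \qquad \ldots,$$

we have

$$P_d(x) = (1 + x)(1 + x^2)(1 + x^3)(1 + x^4) \cdots = \frac{1 - x^2}{1 - x}\,\frac{1 - x^4}{1 - x^2}\,\frac{1 - x^6}{1 - x^3}\,\frac{1 - x^8}{1 - x^4} \cdots = \frac{1}{1 - x}\,\frac{1}{1 - x^3} \cdots = P_o(x).$$

From the equality of the generating functions, $p_d(n) = p_o(n)$, for all $n \ge 0$.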
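The identity $p_d(n) = p_o(n)$ can be tested numerically by expanding both products up to a chosen power of x. The following Python sketch does this for $n \le 30$; the function names are ours, and the bound 30 is an arbitrary choice.

```python
def distinct_counts(max_n):
    """Coefficients of the product of (1 + x^i), i >= 1, up to x^max_n."""
    coeffs = [1] + [0] * max_n
    for part in range(1, max_n + 1):
        # Multiply by (1 + x^part); going downward uses each part at most once.
        for n in range(max_n, part - 1, -1):
            coeffs[n] += coeffs[n - part]
    return coeffs


def odd_counts(max_n):
    """Coefficients of the product of 1/(1 - x^i), i odd, up to x^max_n."""
    coeffs = [1] + [0] * max_n
    for part in range(1, max_n + 1, 2):
        # Multiply by 1 + x^part + x^(2*part) + ...; going upward allows repeats.
        for n in range(part, max_n + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs


print(distinct_counts(6)[6], odd_counts(6)[6])   # 4 4
print(distinct_counts(30) == odd_counts(30))     # True
```

A finite check of this kind illustrates the result but does not replace the generating-function argument, which covers every n at once.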

EXAMPLE 9.24
Once again we shall permit only odd summands, but in this example each such (odd) summand must occur an odd number of times, or not at all. Here, for example, there is one such partition of the integer 1 (namely, 1), but there are no such partitions of the integer 2. For the integer 3 we have two of these partitions: 3 and 1 + 1 + 1. When we examine the possibilities for the integer 4, we find the one partition 3 + 1.
The generating function for the partitions described here is given by

$$f(x) = (1 + x + x^3 + x^5 + \cdots)(1 + x^3 + x^9 + x^{15} + \cdots)(1 + x^5 + x^{15} + x^{25} + \cdots) \cdots = \prod_{k=0}^{\infty}\left(1 + \sum_{i=0}^{\infty} x^{(2k+1)(2i+1)}\right).$$

The generating function is not given by

$$(x + x^3 + x^5 + \cdots)(x^3 + x^9 + x^{15} + \cdots)(x^5 + x^{15} + x^{25} + \cdots) \cdots. \qquad (*)$$

If it were, then the product could not contain any terms where x would appear to a finite power. The situation given by equation (*) would occur if we were to believe that every odd positive integer must appear as a summand at least once. And in such a "partition" the number of summands and the sum itself would both be infinite. Consequently, whether or not it is stated, we must realize that each odd summand may not appear at all, and this condition is accounted for by the (first) summand, $1 = x^0$, that appears in each factor of $f(x)$. In fact, for all but a finite number of odd summands, this is the case. Of course, when an odd summand does appear in a partition, it does so an odd number of times.
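The small cases quoted in this example (one partition of 1, none of 2, two of 3, one of 4) can be recovered by expanding $f(x)$ up to a fixed power. This Python sketch, with a function name of our own, multiplies in one factor $1 + x^m + x^{3m} + x^{5m} + \cdots$ for each odd m.

```python
def odd_parts_odd_multiplicity(max_n):
    """Coefficients, up to x^max_n, of the product over odd m of (1 + x^m + x^(3m) + ...)."""
    coeffs = [1] + [0] * max_n
    for m in range(1, max_n + 1, 2):
        new = coeffs[:]                            # the term 1 = x^0: m is not used
        for times in range(1, max_n // m + 1, 2):  # m is used an odd number of times
            shift = m * times
            for n in range(shift, max_n + 1):
                new[n] += coeffs[n - shift]
        coeffs = new
    return coeffs


print(odd_parts_odd_multiplicity(4)[1:])   # [1, 0, 2, 1]
```

Copying `coeffs` into `new` before adding the shifted terms is what supplies the leading 1 of each factor; omitting that copy would correspond to the product (*), in which every odd summand is forced to appear.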

We close this section with an idea called the Ferrers graph. This graph uses rows of dots
                                 to represent a partition of an integer where the number of dots per row does not increase as
                                 we go from any row to the one below it.
In Fig. 9.2 we find the Ferrers graphs for two partitions of 14: (a) 4 + 3 + 3 + 2 + 1 + 1 and (b) 6 + 4 + 3 + 1. The graph in part (b) is said to be the transposition of the graph in part (a), and vice versa, because one graph can be obtained from the other by interchanging rows and columns.

    •  •  •  •            •  •  •  •  •  •
    •  •  •               •  •  •  •
    •  •  •               •  •  •
    •  •                  •
    •
    •
       (a)                      (b)

Figure 9.2

These graphs often suggest results about partitions. Here we see a partition of 14 into summands, where 4 is the largest summand, and a second partition of 14 into exactly four summands. There is a one-to-one correspondence between a Ferrers graph and its transposition, so this example demonstrates a particular instance of the general result: The number of partitions of an integer n into m summands is equal to the number of partitions of n into summands where m is the largest summand.
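The transposition of a Ferrers graph is easy to compute from the list of row lengths. The short Python sketch below (where `transpose` is our own name for the operation) counts, for each column, how many rows reach that column.

```python
def transpose(partition):
    """Return the partition whose Ferrers graph is the transposition of the given one."""
    parts = sorted(partition, reverse=True)
    if not parts:
        return []
    # Column j of the graph has one dot for every row with more than j dots.
    return [sum(1 for row in parts if row > j) for j in range(parts[0])]


print(transpose([4, 3, 3, 2, 1, 1]))   # [6, 4, 3, 1]
print(transpose([6, 4, 3, 1]))         # [4, 3, 3, 2, 1, 1]
```

The number of summands of a partition becomes the largest summand of its transposition, which is the correspondence behind the general result stated above.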

EXERCISES 9.3

1. Find all partitions of 7.

2. Determine the generating function for the sequence $a_0, a_1, a_2, \ldots$, where $a_n$ is the number of partitions of the nonnegative integer n into (a) even summands; (b) distinct even summands; and (c) distinct odd summands.

3. In $f(x) = [1/(1 - x)][1/(1 - x^2)][1/(1 - x^3)]$, the coefficient of $x^6$ is 7. Interpret this result in terms of partitions of 6.

4. Find the generating function for the number of integer solutions of
    a) $2w + 3x + 5y + 7z = n$,  $0 \le w, x, y, z$
    b) $2w + 3x + 5y + 7z = n$,  $0 \le w$,  $4 \le x, y$,  $5 \le z$

5. Find the generating function for the number of partitions of the nonnegative integer n into summands where (a) each summand must appear an even number of times; and (b) each summand must be even.

6. What is the generating function for the number of partitions of $n \in \mathbf{N}$ into summands that (a) cannot occur more than five times; and (b) cannot exceed 12 and cannot occur more than five times?

7. Show that the number of partitions of a positive integer n where no summand appears more than twice equals the number of partitions of n where no summand is divisible by 3.

8. Show that the number of partitions of $n \in \mathbf{Z}^+$ where no summand is divisible by 4 equals the number of partitions of n where no even summand is repeated (although odd summands may or may not be repeated).

9. Using a Ferrers graph, show that the number of partitions of an integer n into summands not exceeding m is equal to the number of partitions of n into at most m summands.

10. Using a Ferrers graph, show that the number of partitions of n is equal to the number of partitions of 2n into n summands.

9.4
      The Exponential Generating Function
                              The type of generating function we have been dealing with is often referred to as the ordinary
                              generating function for a given sequence. This function arose in selection problems, where
                              order was not relevant. However, turning now to problems of arrangement, where order is
                              crucial, we seek a comparable tool. To find such a tool, we return to the binomial theorem.
    For each $n \in \mathbf{Z}^+$, $(1 + x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$, so $(1 + x)^n$ is the (ordinary) generating function for the sequence $\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}, 0, 0, \ldots$. When dealing with this idea in Chapter 1, we also wrote $\binom{n}{r} = C(n, r)$ when we wanted to emphasize that $\binom{n}{r}$ represented the number of combinations of n objects taken r at a time, with $0 \le r \le n$. Consequently, $(1 + x)^n$ generates the sequence $C(n, 0), C(n, 1), C(n, 2), \ldots, C(n, n), 0, 0, \ldots$.
    Now for all $0 \le r \le n$,

\[ C(n, r) = \frac{n!}{r!\,(n - r)!} = \left(\frac{1}{r!}\right) P(n, r), \]

where P(n, r) denotes the number of permutations of n objects taken r at a time. So

\[
\begin{aligned}
(1 + x)^n &= C(n, 0) + C(n, 1)x + C(n, 2)x^2 + C(n, 3)x^3 + \cdots + C(n, n)x^n \\
          &= P(n, 0) + P(n, 1)x + P(n, 2)\frac{x^2}{2!} + P(n, 3)\frac{x^3}{3!} + \cdots + P(n, n)\frac{x^n}{n!}.
\end{aligned}
\]

Hence, if in $(1 + x)^n$ we consider the coefficient of $x^r/r!$, with $0 \le r \le n$, we obtain P(n, r). On the basis of this observation, we have the following definition.

Definition 9.2   For a sequence $a_0, a_1, a_2, a_3, \ldots$ of real numbers,

\[ f(x) = a_0 + a_1 x + a_2\frac{x^2}{2!} + a_3\frac{x^3}{3!} + \cdots = \sum_{i=0}^{\infty} a_i \frac{x^i}{i!} \]

is called the exponential generating function for the given sequence.
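
As a small numerical check of the definition, the sketch below recovers the sequence behind an exponential generating function from its ordinary power-series coefficients by multiplying the coefficient of $x^i$ by $i!$. The helper name `egf_sequence` is an illustrative assumption, not notation from the text. Applied to $(1 + x)^6$, it returns the permutation counts P(6, r).

```python
from fractions import Fraction
from math import comb, factorial, perm


def egf_sequence(ordinary_coeffs):
    """Given the coefficients c_i of x^i, return the a_i with c_i = a_i / i!."""
    return [Fraction(c) * factorial(i) for i, c in enumerate(ordinary_coeffs)]


n = 6
binomial_row = [comb(n, r) for r in range(n + 1)]    # coefficients of (1 + x)^n
as_egf = egf_sequence(binomial_row)
print(as_egf == [perm(n, r) for r in range(n + 1)])  # True

# The series 1, 1, 1/2!, 1/3!, ... read as an exponential generating function
exp_coeffs = [Fraction(1, factorial(i)) for i in range(8)]
print(egf_sequence(exp_coeffs) == [1] * 8)           # True
```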

EXAMPLE 9.25   Examining the Maclaurin series expansion for $e^x$, we find

\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots = \sum_{i=0}^{\infty} \frac{x^i}{i!}, \]

so $e^x$ is the exponential generating function for the sequence 1, 1, 1, .... (The function $e^x$ is the ordinary generating function for the sequence 1, 1, 1/2!, 1/3!, 1/4!, ....)

Our next example shows how this idea can help us count certain types of arrangements.

EXAMPLE 9.26   In how many ways can four of the letters in ENGINE be arranged?
    In Table 9.4 we list the possible selections of size 4 from the letters E, N, G, I, N, E, along with the number of arrangements those four letters determine.

Table 9.4

    Selection    Arrangements        Selection    Arrangements
    E E N N      4!/(2! 2!)          E G N N      4!/2!
    E E G N      4!/2!               E I N N      4!/2!
    E E I N      4!/2!               G I N N      4!/2!
    E E G I      4!/2!               E I G N      4!

    We now obtain the answer by means of an exponential generating function. For the letter E we use $[1 + x + (x^2/2!)]$ because there are 0, 1, or 2 E's to arrange. Note that the coefficient of $x^2/2!$ is 1, the number of distinct ways to arrange (only) two E's. In like manner, we have $[1 + x + (x^2/2!)]$ for the arrangements of 0, 1, or 2 N's. The arrangements for each of the letters G and I are represented by $(1 + x)$.
    Consequently, we find here that the exponential generating function is

\[ f(x) = [1 + x + (x^2/2!)]^2 (1 + x)^2, \]

and we claim that the required answer is the coefficient of $x^4/4!$ in f(x).
    In order to motivate our claim, let us consider two of the eight ways in which the term $x^4/4!$ arises in the expansion of

\[ f(x) = [1 + x + (x^2/2!)][1 + x + (x^2/2!)](1 + x)(1 + x). \]

    1) From the product $(x^2/2!)(x^2/2!)(1)(1)$, where $(x^2/2!)$ is taken from each of the first two factors (namely, $[1 + x + (x^2/2!)]$) and 1 is taken from each of the last two factors [namely, $(1 + x)$]. Then $(x^2/2!)(x^2/2!)(1)(1) = x^4/(2!\,2!) = (4!/(2!\,2!))(x^4/4!)$, and the coefficient of $x^4/4!$ is $4!/(2!\,2!)$, the number of ways one can arrange the four letters E, E, N, N.
    2) From the product $(x^2/2!)(1)(x)(x)$, where $(x^2/2!)$ is taken from the first factor (namely, $[1 + x + (x^2/2!)]$), 1 is taken from the second factor (again, $[1 + x + (x^2/2!)]$), and x is taken from each of the last two factors [namely, $(1 + x)$]. Here $(x^2/2!)(1)(x)(x) = x^4/2! = (4!/2!)(x^4/4!)$, so the coefficient of $x^4/4!$ is $4!/2!$, the number of ways the four letters E, E, G, I can be arranged.

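    In the complete expansion of f(x), the term involving $x^4$ [and, consequently, $x^4/4!$] is

\[
\left(\frac{x^4}{2!\,2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + x^4\right)
= \left[\frac{4!}{2!\,2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + 4!\right]\left(\frac{x^4}{4!}\right),
\]

where the coefficient of $x^4/4!$ is the answer (102 arrangements) produced by the eight results in the table.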
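
The count of 102 can be confirmed in two independent ways. The Python sketch below is an illustration (the helpers `poly_mul` and `letter_factor` are hypothetical names). It expands $[1 + x + (x^2/2!)]^2(1 + x)^2$ with exact fractions and reads off the coefficient of $x^4/4!$, then compares it with a brute-force count of the distinct four-letter arrangements.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def letter_factor(max_copies):
    """1 + x + x^2/2! + ... + x^k/k! for a letter that is available k times."""
    return [Fraction(1, factorial(i)) for i in range(max_copies + 1)]


f = [Fraction(1)]
for copies in (2, 2, 1, 1):            # E, N, G, I
    f = poly_mul(f, letter_factor(copies))

via_egf = f[4] * factorial(4)          # coefficient of x^4/4!
via_brute_force = len(set(permutations("ENGINE", 4)))
print(via_egf, via_brute_force)        # 102 102
```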

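EXAMPLE 9.27   Consider the Maclaurin series expansions of $e^x$ and $e^{-x}$.

\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots \qquad e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \cdots \]

    Adding these series together, we find that

\[ e^x + e^{-x} = 2\left(1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots\right), \]

or

\[ \frac{e^x + e^{-x}}{2} = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots. \]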

Subtracting $e^{-x}$ from $e^x$ and dividing by 2 yields

\[ \frac{e^x - e^{-x}}{2} = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots. \]
                            These results now help us in the following.

EXAMPLE 9.28   A ship carries 48 flags, 12 each of the colors red, white, blue, and black. Twelve of these flags are placed on a vertical pole in order to communicate a signal to other ships.
   a) How many of these signals use an even number of blue flags and an odd number of black flags?
      The exponential generating function

\[ f(x) = \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)^2 \left(1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots\right)\left(x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots\right) \]

      considers all such signals made up of n flags, where $n \ge 1$. The last two factors in f(x) restrict the signals to an even number of blue flags and an odd number of black flags, respectively.
      Since

\[
\begin{aligned}
f(x) &= (e^x)^2\left(\frac{e^x + e^{-x}}{2}\right)\left(\frac{e^x - e^{-x}}{2}\right) = \left(\frac{1}{4}\right)(e^{2x})(e^{2x} - e^{-2x}) = \frac{1}{4}(e^{4x} - 1) \\
     &= \frac{1}{4}\left(\sum_{i=0}^{\infty} \frac{(4x)^i}{i!} - 1\right) = \left(\frac{1}{4}\right)\sum_{i=1}^{\infty} \frac{(4x)^i}{i!},
\end{aligned}
\]

      the coefficient of $x^{12}/12!$ in f(x) yields $(1/4)(4^{12}) = 4^{11}$ signals made up of 12 flags with an even number of blue flags and an odd number of black flags.
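   b) How many of the signals have at least three white flags or no white flags at all? In this situation we use the exponential generating function

\[
\begin{aligned}
g(x) &= \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)\left(1 + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\right)\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)^2 \\
     &= e^x\left(e^x - x - \frac{x^2}{2!}\right)(e^x)^2 = e^{3x}\left(e^x - x - \frac{x^2}{2!}\right) = e^{4x} - x e^{3x} - \left(\frac{1}{2}\right)x^2 e^{3x} \\
     &= \sum_{i=0}^{\infty} \frac{(4x)^i}{i!} - x\left(\sum_{i=0}^{\infty} \frac{(3x)^i}{i!}\right) - \left(\frac{x^2}{2}\right)\left(\sum_{i=0}^{\infty} \frac{(3x)^i}{i!}\right).
\end{aligned}
\]

      Here the factor $\left(1 + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\right) = e^x - x - \frac{x^2}{2!}$ in g(x) restricts the signals to those that contain three or more of the 12 white flags, or none at all. The answer for the number of signals sought here is the coefficient of $x^{12}/12!$ in g(x). As we consider each summand (involving an infinite summation), we find:
        i) $\sum_{i=0}^{\infty} \frac{(4x)^i}{i!}$: Here we have the term $\frac{(4x)^{12}}{12!} = 4^{12}\left(\frac{x^{12}}{12!}\right)$, so the coefficient of $x^{12}/12!$ is $4^{12}$.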

        ii) $x\left(\sum_{i=0}^{\infty} \frac{(3x)^i}{i!}\right)$: Now we see that in order to get $x^{12}/12!$ we need to consider the term $x[(3x)^{11}/11!] = 3^{11}(x^{12}/11!) = (12)(3^{11})(x^{12}/12!)$, and here the coefficient of $x^{12}/12!$ is $(12)(3^{11})$; and
        iii) $(x^2/2)\left(\sum_{i=0}^{\infty} \frac{(3x)^i}{i!}\right)$: For this last summand we observe that $(x^2/2)[(3x)^{10}/10!] = (1/2)(3^{10})(x^{12}/10!) = (1/2)(12)(11)(3^{10})(x^{12}/12!)$, where this time the coefficient of $x^{12}/12!$ is $(1/2)(12)(11)(3^{10})$.
      Consequently, the number of 12 flag signals with at least three white flags, or none at all, is

\[ 4^{12} - 12(3^{11}) - (1/2)(12)(11)(3^{10}) = 10{,}754{,}218. \]
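
Both answers in this example can be cross-checked without generating functions by summing multinomial coefficients over the admissible color counts. The sketch below is one way to do that. The function `count_signals` and the color order (red, white, blue, black) are assumptions made for illustration.

```python
from math import factorial


def count_signals(total, allowed):
    """Count flag sequences of length `total` over four colors.

    allowed[c](k) says whether color c may be used exactly k times.
    """
    count = 0
    for r in range(total + 1):
        for w in range(total + 1 - r):
            for b in range(total + 1 - r - w):
                counts = (r, w, b, total - r - w - b)
                if all(ok(k) for ok, k in zip(allowed, counts)):
                    # multinomial coefficient total! / (r! w! b! k!)
                    arrangements = factorial(total)
                    for k in counts:
                        arrangements //= factorial(k)
                    count += arrangements
    return count


def anything(k):
    return True


# Assumed color order: red, white, blue, black
part_a = count_signals(12, (anything, anything,
                            lambda k: k % 2 == 0,        # blue: even
                            lambda k: k % 2 == 1))       # black: odd
part_b = count_signals(12, (anything,
                            lambda k: k == 0 or k >= 3,  # white: none or >= 3
                            anything, anything))
print(part_a == 4**11)                                       # True
print(part_b == 4**12 - 12 * 3**11 - (12 * 11 // 2) * 3**10) # True
```

Since only 12 flags are used and 12 of each color are available, the supply limit never binds, which is why the infinite series in f(x) and g(x) give exact counts here.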

Our final example is reminiscent of past results.

EXAMPLE 9.29   A company hires 11 new employees, each of whom is to be assigned to one of four subdivisions. Each subdivision will get at least one new employee. In how many ways can these assignments be made?
    Calling the subdivisions A, B, C, and D, we can equivalently count the number of 11-letter sequences in which there is at least one occurrence of each of the letters A, B, C, and D. The exponential generating function for these arrangements is

\[ f(x) = \left(x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\right)^4 = (e^x - 1)^4 = e^{4x} - 4e^{3x} + 6e^{2x} - 4e^x + 1. \]

The answer then is the coefficient of $x^{11}/11!$ in f(x):

\[ 4^{11} - 4(3^{11}) + 6(2^{11}) - 4(1^{11}) = \sum_{i=0}^{4} (-1)^i \binom{4}{i}(4 - i)^{11}. \]

This form of the answer should bring to mind some of the enumeration problems in Chapter 5. Once the vocabulary is set aside, we are counting the number of onto functions $g: X \to Y$ where |X| = 11, |Y| = 4.
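
The alternating sum is easy to evaluate directly. The sketch below (the function names are illustrative) computes it for |X| = 11, |Y| = 4 and checks the same formula against a brute-force enumeration on a smaller case, since listing all $4^{11}$ assignments would be slow.

```python
from itertools import product
from math import comb


def onto_count(m, n):
    """Number of onto functions from an m-set to an n-set (inclusion-exclusion)."""
    return sum((-1) ** i * comb(n, i) * (n - i) ** m for i in range(n + 1))


def onto_brute_force(m, n):
    """Enumerate all n^m assignments and keep those that use every target."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)


print(onto_count(11, 4))                            # 3498000
print(onto_count(6, 3) == onto_brute_force(6, 3))   # True
```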

EXERCISES 9.4

1. Find the exponential generating function for each of the following sequences.
    a) 1, −1, 1, −1, 1, −1, ...
    b) $1, 2, 2^2, 2^3, 2^4, \ldots$
    c) $1, -a, a^2, -a^3, a^4, \ldots$,  $a \in \mathbf{R}$
    d) $1, a^2, a^4, a^6, \ldots$,  $a \in \mathbf{R}$
    e) $a, a^3, a^5, a^7, \ldots$,  $a \in \mathbf{R}$
    f) $0, 1, 2(2), 3(2^2), 4(2^3), \ldots$

2. Determine the sequence generated by each of the following exponential generating functions.
    a) $f(x) = 3e^{3x}$
    b) $f(x) = 6e^{5x} - 3e^{2x}$
    c) $f(x) = e^x + x^2$
    d) $f(x) = e^{2x} - 3x^3 + 5x^2 + 7x$
    e) $f(x) = 1/(1 - x)$
    f) $f(x) = 3/(1 - 2x) + e^x$

3. In each of the following, the function f(x) is the exponential generating function for the sequence $a_0, a_1, a_2, \ldots$, whereas the function g(x) is the exponential generating function for the sequence $b_0, b_1, b_2, \ldots$. Express g(x) in terms of f(x) if
    a) $b_3 = 3$
       $b_n = a_n$, $n \in \mathbf{N}$, $n \ne 3$
    b) $a_n = 5^n$, $n \in \mathbf{N}$
       $b_3 = -1$
       $b_n = a_n$, $n \in \mathbf{N}$, $n \ne 3$
    c) $b_1 = 2$
       $b_2 = 4$
       $b_n = 2a_n$, $n \in \mathbf{N}$, $n \ne 1, 2$
    d) $b_1 = 2$
       $b_2 = 4$
       $b_3 = 8$
       $b_n = 2a_n + 3$, $n \in \mathbf{N}$, $n \ne 1, 2, 3$

4. a) For the ship in Example 9.28, how many signals use at least one flag of each color? (Solve this with an exponential generating function.)
    b) Restate part (a) in an alternative way that uses the concept of an onto function.
    c) How many signals are there in Example 9.28, where the total number of blue and black flags is even?

5. Find the exponential generating function for the sequence 0!, 1!, 2!, 3!, ....

6. a) Find the exponential generating function for the number of ways to arrange n letters, $n \ge 0$, selected from each of the following words.
        i) HAWAII
        ii) MISSISSIPPI
        iii) ISOMORPHISM
    b) For section (ii) of part (a), what is the exponential generating function if the arrangement must contain at least two I's?

7. Say the company in Example 9.29 hires 25 new employees. Give the exponential generating function for the number of ways to assign these people to the four subdivisions so that each subdivision receives at least 3, but no more than 10, new people.

8. Given the sequences $a_0, a_1, a_2, \ldots$ and $b_0, b_1, b_2, \ldots$, with exponential generating functions f(x), g(x), respectively, show that if h(x) = f(x)g(x), then h(x) is the exponential generating function of the sequence $c_0, c_1, c_2, \ldots$, where $c_n = \sum_{i=0}^{n} \binom{n}{i} a_i b_{n-i}$, for each $n \ge 0$.

9. If a 20-digit ternary (0, 1, 2) sequence is randomly generated, what is the probability that: (a) It has an even number of 1's? (b) It has an even number of 1's and an even number of 2's? (c) It has an odd number of 0's? (d) The total number of 0's and 1's is odd? (e) The total number of 0's and 1's is even?

10. How many 20-digit quaternary (0, 1, 2, 3) sequences are there where: (a) There is at least one 2 and an odd number of 0's? (b) No symbol occurs exactly twice? (c) No symbol occurs exactly three times? (d) There are exactly two 3's or none at all?

9.5
               The Summation Operator
This final section introduces a technique that helps us go from the (ordinary) generating function for the sequence $a_0, a_1, a_2, \ldots$ to the generating function for the sequence $a_0, a_0 + a_1, a_0 + a_1 + a_2, \ldots$.
    For $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$, consider the function f(x)/(1 − x).

\[
\begin{aligned}
\frac{f(x)}{1 - x} = f(x) \cdot \frac{1}{1 - x} &= [a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots][1 + x + x^2 + x^3 + \cdots] \\
&= a_0 + (a_0 + a_1)x + (a_0 + a_1 + a_2)x^2 + (a_0 + a_1 + a_2 + a_3)x^3 + \cdots,
\end{aligned}
\]

so f(x)/(1 − x) generates the sequence of sums $a_0, a_0 + a_1, a_0 + a_1 + a_2, a_0 + a_1 + a_2 + a_3, \ldots$. This is why we refer to 1/(1 − x) as the summation operator. Furthermore we see that the sequence $a_0, a_0 + a_1, a_0 + a_1 + a_2, a_0 + a_1 + a_2 + a_3, \ldots$ is the convolution of the sequence $a_0, a_1, a_2, \ldots$ and the sequence $b_0, b_1, b_2, \ldots$, where $b_n = 1$ for all $n \in \mathbf{N}$.
                                         We find this technique handy in the following examples.
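
On truncated coefficient lists the summation operator is just a running sum, and it agrees with convolution against the all-ones sequence. The following Python sketch shows this on sample data. The helper names `convolve` and `summation_operator` and the sample sequence are illustrative assumptions.

```python
from itertools import accumulate


def convolve(a, b):
    """First min(len(a), len(b)) coefficients of the product of two power series."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]


def summation_operator(a):
    """Coefficients of f(x)/(1 - x): the running sums a_0, a_0 + a_1, ..."""
    return list(accumulate(a))


a = [3, 1, 4, 1, 5, 9, 2, 6]        # arbitrary sample coefficients
ones = [1] * len(a)                 # coefficients of 1/(1 - x)
print(summation_operator(a))        # [3, 4, 8, 9, 14, 23, 25, 31]
print(convolve(a, ones) == summation_operator(a))   # True
```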

EXAMPLE 9.30
   a) We know from part (b) of Example 9.5 that 1/(1 − x) is the generating function for the sequence 1, 1, 1, .... Consequently, upon applying the summation operator, 1/(1 − x), we see that (1/(1 − x))(1/(1 − x)) is the generating function for the sequence 1, 1 + 1, 1 + 1 + 1, ..., that is, $1/(1 - x)^2$ is the generating function for the sequence 1, 2, 3, ..., as we found in part (c) of Example 9.5.

   b) Now let us start with the polynomial $x + x^2$, the generating function for the sequence 0, 1, 1, 0, 0, 0, .... Applying the summation operator, we have $(x + x^2)(1/(1 - x)) = (x + x^2)/(1 - x)$, the generating function for the sequence 0, 0 + 1, 0 + 1 + 1, 0 + 1 + 1 + 0, ..., that is, the sequence 0, 1, 2, 2, .... A second application of the summation operator tells us that $(x + x^2)/(1 - x)^2$ is the generating function for the sequence 0, 0 + 1, 0 + 1 + 2, 0 + 1 + 2 + 2, ..., that is, the sequence 0, 1, 3, 5, .... A final application of the summation operator tells us that $(x + x^2)/(1 - x)^3$ is the generating function for the sequence 0, 0 + 1, 0 + 1 + 3, 0 + 1 + 3 + 5, ..., that is, the sequence 0, 1, 4, 9, .... This suggests that, for $n \ge 1$, $\sum_{k=1}^{n} (2k - 1) = n^2$. To verify this suggestion, we look at the coefficient of $x^n$ in $(x + x^2)/(1 - x)^3 = x(1 - x)^{-3} + x^2(1 - x)^{-3}$. The coefficient of $x^{n-1}$ in $(1 - x)^{-3}$ [which is the coefficient of $x^n$ in $x(1 - x)^{-3}$] is

\[ \binom{-3}{n-1}(-1)^{n-1} = (-1)^{n-1}\binom{3 + (n - 1) - 1}{n - 1}(-1)^{n-1} = \binom{n+1}{n-1} = \binom{n+1}{2} = \frac{1}{2}(n + 1)(n). \]

      The coefficient of $x^{n-2}$ in $(1 - x)^{-3}$ [which is the coefficient of $x^n$ in $x^2(1 - x)^{-3}$] is

\[ \binom{-3}{n-2}(-1)^{n-2} = (-1)^{n-2}\binom{3 + (n - 2) - 1}{n - 2}(-1)^{n-2} = \binom{n}{n-2} = \binom{n}{2} = \frac{1}{2}(n)(n - 1). \]

      Consequently, for $n \ge 1$, $\sum_{k=1}^{n} (2k - 1)$ = the coefficient of $x^n$ in $(x + x^2)/(1 - x)^3$ = $\frac{1}{2}(n + 1)(n) + \frac{1}{2}(n)(n - 1) = \frac{1}{2}(n)[(n + 1) + (n - 1)] = n^2$, as we learned earlier in Example 4.7, using the Principle of Mathematical Induction.
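
The three applications of the summation operator in part (b) can be replayed numerically. In this sketch the truncation length `N` is an arbitrary choice and each application of 1/(1 − x) is modelled as a running sum. The result is compared with the binomial form $\binom{n+1}{2} + \binom{n}{2}$ derived above.

```python
from itertools import accumulate
from math import comb

N = 10
seq = [0, 1, 1] + [0] * (N - 2)       # coefficients of x + x^2, up to x^N
for _ in range(3):                    # three applications of 1/(1 - x)
    seq = list(accumulate(seq))
print(seq)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# Binomial form of the coefficient of x^n, valid for n >= 1
closed = [0] + [comb(n + 1, 2) + comb(n, 2) for n in range(1, N + 1)]
print(seq == closed == [n * n for n in range(N + 1)])   # True
```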

Our last example provides us with a method for deriving some of the summation formu-
               las we encountered in earlier chapters.

EXAMPLE 9.31   Find a formula to express $0^2 + 1^2 + 2^2 + \cdots + n^2$ as a function of n.
    As in Section 9.2, we start with $g(x) = 1/(1 - x) = 1 + x + x^2 + \cdots$. Then

\[ \frac{dg(x)}{dx} = \frac{1}{(1 - x)^2} = 1 + 2x + 3x^2 + 4x^3 + \cdots, \]

so $x/(1 - x)^2$ is the generating function for 0, 1, 2, 3, 4, .... Repeating this technique, we find that

\[ x\frac{d}{dx}\left[x\left(\frac{dg(x)}{dx}\right)\right] = \frac{x(1 + x)}{(1 - x)^3} = x + 2^2x^2 + 3^2x^3 + \cdots, \]

so $x(1 + x)/(1 - x)^3$ generates $0^2, 1^2, 2^2, 3^2, \ldots$. As a consequence of our earlier observations about the summation operator, we find that

\[ \frac{x(1 + x)}{(1 - x)^3} \cdot \frac{1}{(1 - x)} = \frac{x(1 + x)}{(1 - x)^4} \]

is the generating function for $0^2, 0^2 + 1^2, 0^2 + 1^2 + 2^2, 0^2 + 1^2 + 2^2 + 3^2, \ldots$. Hence the coefficient of $x^n$ in $[x(1 + x)]/(1 - x)^4$ is $\sum_{i=0}^{n} i^2$. But the coefficient of $x^n$ in $[x(1 + x)]/(1 - x)^4$ can also be calculated as follows:

\[ \frac{x(1 + x)}{(1 - x)^4} = (x + x^2)(1 - x)^{-4} = (x + x^2)\left[\binom{-4}{0} + \binom{-4}{1}(-x) + \binom{-4}{2}(-x)^2 + \cdots\right], \]

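so the coefficient of $x^n$ is

\[
\begin{aligned}
\binom{-4}{n-1}(-1)^{n-1} + \binom{-4}{n-2}(-1)^{n-2}
&= (-1)^{n-1}\binom{4 + (n - 1) - 1}{n - 1}(-1)^{n-1} + (-1)^{n-2}\binom{4 + (n - 2) - 1}{n - 2}(-1)^{n-2} \\
&= \binom{n+2}{n-1} + \binom{n+1}{n-2} = \frac{(n + 2)!}{3!\,(n - 1)!} + \frac{(n + 1)!}{3!\,(n - 2)!} \\
&= \frac{1}{6}[(n + 2)(n + 1)(n) + (n + 1)(n)(n - 1)] \\
&= \frac{1}{6}(n)(n + 1)[(n + 2) + (n - 1)] = \frac{n(n + 1)(2n + 1)}{6}.
\end{aligned}
\]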
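
The steps of this example translate directly into operations on coefficient lists. In the sketch below (the helper `x_times_derivative` and the cutoff `N` are illustrative assumptions), applying $x\,\frac{d}{dx}$ multiplies the coefficient of $x^n$ by n, and the summation operator is a running sum. The resulting partial sums are compared with both the binomial form and the closed formula.

```python
from itertools import accumulate
from math import comb


def x_times_derivative(coeffs):
    """Coefficients of x * f'(x): the coefficient of x^n is multiplied by n."""
    return [n * c for n, c in enumerate(coeffs)]


N = 12
g = [1] * (N + 1)                                     # 1/(1 - x), truncated at x^N
squares = x_times_derivative(x_times_derivative(g))   # 0^2, 1^2, 2^2, ...
partial_sums = list(accumulate(squares))              # effect of 1/(1 - x)

for n in range(N + 1):
    binomial_form = comb(n + 2, 3) + comb(n + 1, 3)
    closed_form = n * (n + 1) * (2 * n + 1) // 6
    assert partial_sums[n] == binomial_form == closed_form
print(partial_sums[:6])    # [0, 1, 5, 14, 30, 55]
```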

EXERCISES 9.5

1. Find the generating function for the sequences (a) 1, 2, 3, 3, 3, ...; (b) 1, 2, 3, 4, 4, 4, ...; (c) 1, 4, 7, 10, 13, ....

2. a) Find the generating function for the sequences (i) 0, 1, 0, 0, 0, ...; (ii) 0, 1, 1, 1, 1, ...; (iii) 0, 1, 2, 3, 4, ...; (iv) 0, 1, 3, 6, 10, ....
   b) Use result (iv) from part (a) to find a formula for $\sum_{k=1}^{n} k$.

3. Continue the development of the ideas set forth in Example 9.31 and derive the formula $\sum_{i=1}^{n} i^3 = [n(n + 1)/2]^2$.

4. If $f(x) = \sum_{n=0}^{\infty} a_n x^n$, what is the generating function for the sequence $a_0, a_0 + a_1, a_1 + a_2, a_2 + a_3, \ldots$? What is the generating function for the sequence $a_0, a_0 + a_1, a_0 + a_1 + a_2, a_1 + a_2 + a_3, a_2 + a_3 + a_4, \ldots$? What is the generating function for the sequence 4a7, 407+ ay9,97a 75ay 797,97 420 at ¢5a2 + 83Fo---?

5. Let f(x) be the generating function for the sequence $a_0, a_1, a_2, \ldots$. For what sequence is (1 − x)f(x) the generating function?

6. Let $f(x) = \sum_{i=0}^{\infty} a_i x^i$ with $f(1) = \sum_{i=0}^{\infty} a_i$, a finite number. Verify that the quotient [f(x) − f(1)]/(x − 1) is the generating function for the sequence $s_0, s_1, s_2, \ldots$, where $s_n = \sum_{i=n+1}^{\infty} a_i$, $n \in \mathbf{N}$.

7. Find the generating function for the sequence $a_0, a_1, a_2, \ldots$, where $a_n = \sum_{i=0}^{n} (1/i!)$, $n \in \mathbf{N}$.

8. a) Find the generating function for the sequence 0, 1, 3, 6, 10, 15, ... (where 1, 3, 6, 10, 15, ... are the triangular numbers of Example 4.5).
   b) For $n \in \mathbf{Z}^+$, determine a formula for the sum of the first n triangular numbers.

9.6
        Summary and Historical Review
In the early thirteenth century the Italian mathematician Leonardo of Pisa (c. 1175-1250),
in his Liber Abaci, introduced the European world to the Hindu-Arabic notation for numerals
and algorithms for arithmetic. In this text he also originated the study of the sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, ..., which can be given recursively by $F_0 = 0$, $F_1 = 1$,
and $F_{n+2} = F_{n+1} + F_n$, $n \ge 0$. Since Leonardo was the son of Bonaccio, the sequence has
come to be called the Fibonacci numbers. (Filius Bonaccii is the Latin form for "son of
Bonaccio.")
    If we consider the formula

$F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right]$

we find $F_0 = 0$, $F_1 = 1$, $F_2 = 1$, $F_3 = 2$, $F_4 = 3$, .... Yes, this formula determines each
Fibonacci number as a function of n. (Here we have the solution for the recursive Fibonacci
relation. We shall learn more about this in the next chapter.) This formula was not derived,
however, until 1718, when Abraham DeMoivre (1667-1754) obtained the result from the
generating function

$f(x) = \frac{x}{1 - x - x^2} = \frac{1}{\sqrt{5}}\left[\frac{1}{1 - \left(\frac{1+\sqrt{5}}{2}\right)x} - \frac{1}{1 - \left(\frac{1-\sqrt{5}}{2}\right)x}\right].$
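The agreement between the recursive definition of the Fibonacci numbers and the closed formula can be checked numerically. The following Python sketch is only an illustration: the function names are hypothetical, and the closed formula is evaluated in floating point and rounded, which is reliable only for moderately small n.

```python
from math import sqrt


def fib_by_recursion(n):
    """Iterate F_0 = 0, F_1 = 1, F_{n+2} = F_{n+1} + F_n."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_by_formula(n):
    """Evaluate the closed formula in floating point and round to an integer."""
    alpha = (1 + sqrt(5)) / 2
    beta = (1 - sqrt(5)) / 2
    return round((alpha ** n - beta ** n) / sqrt(5))


first_terms = [fib_by_recursion(n) for n in range(9)]
formulas_agree = all(fib_by_recursion(n) == fib_by_formula(n) for n in range(40))
```

Here `first_terms` reproduces 0, 1, 1, 2, 3, 5, 8, 13, 21, and `formulas_agree` compares the two computations for 0 ≤ n < 40.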

Extending the existing techniques of the generating function, Leonhard Euler (1707-1783)
advanced the study of the partitions of integers in his 1748 two-volume opus, Introductio
in Analysin Infinitorum. With

$P(x) = \frac{1}{1-x} \cdot \frac{1}{1-x^2} \cdot \frac{1}{1-x^3} \cdots = \prod_{i=1}^{\infty} \frac{1}{1-x^i},$

we have the generating function for p(0), p(1), p(2), ..., where p(n) is the number of
partitions of n into positive summands and p(0) is defined to be 1.
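To see the product for P(x) at work, one can multiply the factors 1/(1 - x^i) as truncated power series and read off the coefficients. The sketch below is an illustration under that assumption; the name `partition_numbers` and the truncation bound `limit` are not from the text.

```python
def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)] from the product of 1/(1 - x^i), 1 <= i <= limit."""
    coeffs = [1] + [0] * limit  # the constant power series 1
    for i in range(1, limit + 1):
        # Multiplying by 1/(1 - x^i) = 1 + x^i + x^(2i) + ... amounts to
        # adding to each coefficient the (updated) coefficient i places back.
        for n in range(i, limit + 1):
            coeffs[n] += coeffs[n - i]
    return coeffs


p = partition_numbers(10)
```

Factors with i > limit cannot affect the coefficients of x^0 through x^limit, so truncating the product loses nothing in that range.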

Leonhard Euler (1707-1783)

In the latter part of the eighteenth century, further developments on generating functions
arose in conjunction with ideas in probability theory, especially with what is now called the
“moment generating function.” These related notions were presented in their first complete
treatment by the great scholar Pierre-Simon de Laplace (1749-1827) in his 1812 publication
Théorie Analytique des Probabilités.
    Finally, we mention Norman Macleod Ferrers (1829-1903), after whom the diagram we
called the Ferrers graph is named.
    For us the study of the ordinary and exponential generating functions provided a powerful
technique that unified ideas found in Chapters 1, 5, and 8. Extending our prior experience
with polynomials to power series, and extending the binomial theorem to $(1 + x)^n$ for
the cases where n need not be positive or even an integer, we found the necessary tools
to compute the coefficients in these generating functions. This was more than worth the
effort because the algebraic calculations we performed took into account all of the selection
processes we were trying to consider. We also found that we had seen some generating
functions in a prior chapter and saw how they arose in the study of partitions.
    The concept of a partition of a positive integer now enables us to complete the summaries
of our earlier discussions on distributions, as given in Tables 1.11 and 5.13. Here we can now
deal with the distributions of m objects into n ($\le m$) containers for the cases where neither
the objects nor the containers are distinct. These are covered by the entries in the second
and fourth rows of Table 9.5. The notation p(m, n), which appears in the last column for
these entries, is used to denote the number of partitions of the positive integer m into exactly
n (positive) summands. (This idea will be examined further in Supplementary Exercise 3
of the next chapter.) The types of distributions in the first and third rows of this table were
also listed in Table 5.13. We include them here a second time for the sake of comparison
and completeness.

Table 9.5

Objects Are    Containers Are    Some Container(s)    Number of
Distinct       Distinct          May Be Empty         Distributions

No             Yes               Yes                  $\binom{n+m-1}{m}$
No             No                Yes                  (1) $p(m)$, for $n = m$
                                                      (2) $p(m, 1) + p(m, 2) + \cdots + p(m, n)$, for $n < m$
No             Yes               No                   $\binom{n+(m-n)-1}{m-n} = \binom{m-1}{m-n} = \binom{m-1}{n-1}$
No             No                No                   $p(m, n)$
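The quantity p(m, n) in Table 9.5 can be computed for small cases by direct enumeration. The following sketch is an illustration only: the function name and the helper parameter `largest` are hypothetical, and the method (choosing the largest summand first) is a brute-force count, not a technique taken from the text.

```python
def partitions_into_parts(m, n, largest=None):
    """Count the partitions of m into exactly n positive summands, none exceeding `largest`."""
    if largest is None:
        largest = m
    if n == 0:
        return 1 if m == 0 else 0
    # Choose the largest summand first; the remaining n - 1 summands
    # may not exceed it, so each partition is counted exactly once.
    return sum(
        partitions_into_parts(m - first, n - 1, first)
        for first in range(1, min(m, largest) + 1)
    )


p_4_2 = partitions_into_parts(4, 2)  # the partitions 3 + 1 and 2 + 2
p_7 = sum(partitions_into_parts(7, n) for n in range(1, 8))
```

Summing p(7, n) over 1 ≤ n ≤ 7 gives p(7), which illustrates entry (1) in the second row of the table for n = m = 7.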

For comparable coverage of the material presented in this chapter, the interested reader
                       should consult Chapter 2 of C. L. Liu [3] and Chapter 6 of A. Tucker [8]. The text by
                       J. Riordan [6] has extensive coverage of ordinary and exponential generating functions. An
                       interesting survey article on generating functions, written by Richard P. Stanley, can be found
                       in the text edited by G.-C. Rota [7]. The text by H. S. Wilf [9] deals with generating functions
                       and some of the ways they are applied in discrete mathematics. This work also demonstrates
                       how these functions provide a bridge between discrete mathematics and continuous analysis
                       (in particular, the theory of functions of a complex variable).
                           The reader interested in learning more about the theory of partitions should consult
                       Chapter 10 of I. Niven, H. Zuckerman, and H. Montgomery [5].
                           Finally, a great deal about the moment generating function and its use in probability
                       theory can be found in Chapter 3 of H. J. Larson [2] and in Chapter XI of the comprehensive
                       work by W. Feller [1].

REFERENCES
                          1. Feller, William. An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed. New
                             York: Wiley, 1968.
                          2. Larson, Harold J. Introduction to Probability Theory and Statistical Inference, 2nd ed. New
                             York: Wiley, 1969.
                          3. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
                          4. Neal, David. "The Series $\sum_{n=1}^{\infty} n^m x^n$ and a Pascal-like Triangle." The College Mathematics
                             Journal 25, No. 2 (March 1994): pp. 99-101.

5. Niven, Ivan, Zuckerman, Herbert, and Montgomery, Hugh. An Introduction to the Theory of
                                        Numbers, 5th ed. New York: Wiley, 1991.
                                     6. Riordan, John. An Introduction to Combinatorial Analysis. Princeton, N.J.: Princeton University
                                        Press, 1980. (Originally published in 1958 by John Wiley & Sons.)
                                     7. Rota, Gian-Carlo, ed. Studies in Combinatorics, Studies in Mathematics, Vol. 17. Washington,
                                        D.C.: The Mathematical Association of America, 1978.
                                     8. Tucker, Alan. Applied Combinatorics, 4th ed. New York: Wiley, 2002.
                                     9. Wilf, Herbert S. Generatingfunctionology, 2nd ed. San Diego, Calif.: Academic Press, 1994.

SUPPLEMENTARY EXERCISES

1. Find the generating function for each of the following sequences.
    a) 7, 8, 9, 10, ...
    b) $1, a, a^2, a^3, a^4, \ldots$, $a \in \mathbf{R}$
    c) $1, (1 + a), (1 + a)^2, (1 + a)^3, \ldots$, $a \in \mathbf{R}$
    d) $2, 1 + a, 1 + a^2, 1 + a^3, \ldots$, $a \in \mathbf{R}$
2. Find the coefficient of $x^{83}$ in
    $f(x) = (x^5 + x^8 + x^{11} + x^{14} + x^{17})^{10}$.
3. Sergeant Bueti must distribute 40 bullets (20 for rifles and 20 for handguns) among four police officers so that each officer gets at least two, but no more than seven, bullets of each type. In how many ways can he do this?
4. Find a generating function for the number of ways to partition a positive integer n into positive-integer summands, where each summand appears an odd number of times or not at all.
5. For $n \in \mathbf{Z}^+$, show that the number of partitions of n in which no even summand is repeated (an odd summand may or may not be repeated) is the same as the number of partitions of n where no summand occurs more than three times.
6. How many 10-digit telephone numbers use only the digits 1, 3, 5 and 7, with each digit appearing at least twice or not at all?
7. a) For what sequence of numbers is $g(x) = (1 - 2x)^{-5/2}$ the exponential generating function?
    b) Find a and b so that $(1 - ax)^b$ is the exponential generating function for the sequence $1, 7, 7 \cdot 11, 7 \cdot 11 \cdot 15, \ldots$.
8. For integers $n, k \ge 0$ let
    • $P_1$ be the number of partitions of n.
    • $P_2$ be the number of partitions of $2n + k$, where $n + k$ is the greatest summand.
    • $P_3$ be the number of partitions of $2n + k$ into precisely $n + k$ summands.
    Using the concept of the Ferrers graph, prove that $P_1 = P_2$ and $P_2 = P_3$, thus concluding that the number of partitions of $2n + k$ into precisely $n + k$ summands is the same for all k.
9. Simplify the following sum where $n \in \mathbf{Z}^+$: $\binom{n}{1} + 2\binom{n}{2} + 3\binom{n}{3} + \cdots + n\binom{n}{n}$. (Hint: You may wish to start with the binomial theorem.)
10. Determine the generating function for the number of partitions of $n \in \mathbf{N}$ where 1 occurs at most once, 2 occurs at most twice, 3 at most thrice, and, in general, k occurs at most k times, for every $k \in \mathbf{Z}^+$.
11. In a rural area 12 mailboxes are located at a general store.
    a) If a newscarrier has 20 identical fliers, in how many ways can she distribute the fliers so that each mailbox gets at least one flier?
    b) If the mailboxes are in two rows of six each, what is the probability that a distribution from part (a) will have 10 fliers distributed to the top six boxes and 10 to the bottom six?
12. Let S be a set containing n distinct objects. Verify that $e^x/(1 - x)^k$ is the exponential generating function for the number of ways to choose m of the objects in S, for $0 \le m \le n$, and distribute these objects among k distinct containers, with the order of the objects in any container relevant for the distribution.
13. a) For $a, d \in \mathbf{R}$, find the generating function for the sequence $a, a + d, a + 2d, a + 3d, \ldots$.
    b) For $n \in \mathbf{Z}^+$, use the result from part (a) to find a formula for the sum of the first n terms of the arithmetic progression $a, a + d, a + 2d, a + 3d, \ldots$.
14. a) For the alphabet $\Sigma = \{0, 1\}$, let $a_n$ count the number of strings of length n in $\Sigma^*$; that is, for $n \in \mathbf{N}$, $a_n = |\Sigma^n|$. Determine the generating function for the sequence $a_0, a_1, a_2, \ldots$.
    b) Answer the question posed in part (a) when $|\Sigma| = k$, a fixed positive integer.
15. Let $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$, the generating function for the sequence $a_0, a_1, a_2, a_3, \ldots$. Now let $n \in \mathbf{Z}^+$, n fixed.
    a) Find the generating function for the sequence $0, 0, 0, \ldots, 0, a_0, a_1, a_2, a_3, \ldots$, where there are n leading zeros.
    b) Find the generating function for the sequence $a_n, a_{n+1}, \ldots$.
16. Suppose that X is a discrete random variable with probability distribution given by
    $Pr(X = x) = k(\frac{1}{4})^x$ for $x = 0, 1, 2, 3, \ldots$, and $Pr(X = x) = 0$ otherwise,
    where k is a constant. Determine (a) the value of k; (b) Pr(X = 3), Pr(X < 3), Pr(X > 3), Pr(X > 2); and (c) Pr(X > 4 | X > 2), Pr(X > 104 | X > 102).
17. Suppose that Y is a geometric random variable where the probability of success for each Bernoulli trial is p. If $m, n \in \mathbf{Z}^+$ with m > n, determine Pr(Y > m | Y > n).
18. A test car is driven a fixed distance of n miles along a straight highway. (Here $n \in \mathbf{Z}^+$.) The car travels at one mile per hour for the first mile, two miles per hour for the second mile, four miles per hour for the third mile, ..., and $2^{n-1}$ miles per hour for the nth mile.
    a) What is the car's average velocity for the first four miles?
    b) For a given value of n, what is the car's average velocity for the first n miles?
    c) Find the smallest value of n for which the car's average velocity for the first n miles exceeds 10 miles per hour.
10
Recurrence Relations

In earlier sections of the text we saw some recursive definitions and constructions. In
Definitions 5.19, 6.7, 6.12, and 7.9, we obtained concepts at level n + 1 (or of size n + 1)
from comparable concepts at level n (or of size n), after establishing the concept at a
first value of n, such as 0 or 1. When we dealt with the Fibonacci and Lucas numbers in
Section 4.2, the results at level n + 1 turned out to depend on those at levels n and n - 1,
and for each of these sequences of integers the basis consisted of the first two integers
(of the sequence). Now we shall find ourselves in a somewhat similar situation. We shall
investigate functions a(n), preferably written as $a_n$ (for $n \ge 0$), where $a_n$ depends on some
of the prior terms $a_{n-1}, a_{n-2}, \ldots, a_1, a_0$. This study of what are called either recurrence
relations or difference equations is the discrete counterpart to ideas applied in ordinary
differential equations.
               Our development will not employ any ideas from differential equations but will start
           with the notion of a geometric progression. As further ideas are developed, we shall see
           some of the many applications that make this topic so important.

10.1
The First-Order Linear Recurrence Relation
A geometric progression is an infinite sequence of numbers, such as 5, 15, 45, 135, ...,
where the division of each term, other than the first, by its immediate predecessor is a
constant, called the common ratio. For our sequence this common ratio is 3: 15 = 3(5), 45 =
3(15), and so on. If $a_0, a_1, a_2, \ldots$ is a geometric progression, then
$a_1/a_0 = a_2/a_1 = \cdots = a_{n+1}/a_n = \cdots = r$, the common ratio. In our particular
geometric progression we have $a_{n+1} = 3a_n$, $n \ge 0$.
    The recurrence relation $a_{n+1} = 3a_n$, $n \ge 0$, does not define a unique geometric progression.
The sequence 7, 21, 63, 189, ... also satisfies the relation. To pinpoint a particular
sequence described by $a_{n+1} = 3a_n$, we need to know one of the terms of that sequence.
Hence

$a_{n+1} = 3a_n, \quad n \ge 0, \quad a_0 = 5,$

uniquely defines the sequence 5, 15, 45, ..., whereas

$a_{n+1} = 3a_n, \quad n \ge 0, \quad a_1 = 21,$

identifies 7, 21, 63, ... as the geometric progression under study.


The equation $a_{n+1} = 3a_n$, $n \ge 0$, is a recurrence relation because the value of $a_{n+1}$ (the
present consideration) is dependent on $a_n$ (a prior consideration). Since $a_{n+1}$ depends only
on its immediate predecessor, the relation is said to be of first order. In particular, this
is a first-order linear homogeneous recurrence relation with constant coefficients. (We'll
say more about these ideas later.) The general form of such an equation can be written
$a_{n+1} = da_n$, $n \ge 0$, where d is a constant.
    Values such as $a_0$ or $a_1$, given in addition to the recurrence relations, are called boundary
conditions. The expression $a_0 = A$, where A is a constant, is also referred to as an initial
condition. Our examples show the importance of the boundary condition in determining the
unique solution.
    Let us return now to the recurrence relation

$a_{n+1} = 3a_n, \quad n \ge 0, \quad a_0 = 5.$

The first four terms of this sequence are

$a_0 = 5,$
$a_1 = 3a_0 = 3(5),$
$a_2 = 3a_1 = 3(3a_0) = 3^2(5),$ and
$a_3 = 3a_2 = 3(3^2(5)) = 3^3(5).$

These results suggest that for each $n \ge 0$, $a_n = 5(3^n)$. This is the unique solution of the
given recurrence relation. In this solution, the value of $a_n$ is a function of n and there is no
longer any dependence on prior terms of the sequence, once we define $a_0$. To compute $a_{10}$,
for example, we simply calculate $5(3^{10}) = 295{,}245$; there is no need to start at $a_0$ and build
up to $a_9$ in order to obtain $a_{10}$.
    From this example we are directed to the following. (This result can be established by
the Principle of Mathematical Induction.)

The unique solution of the recurrence relation

$a_{n+1} = da_n$, where $n \ge 0$, d is a constant, and $a_0 = A$,

is given by

$a_n = Ad^n, \quad n \ge 0.$

Thus the solution $a_n = Ad^n$, $n \ge 0$, defines a discrete function whose domain is the set
N of all nonnegative integers.
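The contrast between building up term by term and using the solution directly can be sketched in a few lines of Python. The function names below are hypothetical and serve only to illustrate the relation with d = 3 and A = 5.

```python
def term_by_iteration(d, A, n):
    """Apply a_{k+1} = d * a_k repeatedly, starting from a_0 = A."""
    a = A
    for _ in range(n):
        a = d * a
    return a


def term_by_solution(d, A, n):
    """Evaluate the solution a_n = A * d^n directly."""
    return A * d ** n


a10_iterated = term_by_iteration(3, 5, 10)
a10_direct = term_by_solution(3, 5, 10)
```

Both computations yield the value 5(3^10) = 295,245 quoted above; the second needs no earlier terms.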

EXAMPLE 10.1
Solve the recurrence relation $a_n = 7a_{n-1}$, where $n \ge 1$ and $a_2 = 98$.
    This is just an alternative form of the relation $a_{n+1} = 7a_n$ for $n \ge 0$ and $a_2 = 98$. Hence
the solution has the form $a_n = a_0(7^n)$. Since $a_2 = 98 = a_0(7^2)$, it follows that $a_0 = 2$, and
$a_n = 2(7^n)$, $n \ge 0$, is the unique solution.

EXAMPLE 10.2
A bank pays 6% (annual) interest on savings, compounding the interest monthly. If Bonnie
deposits $1000 on the first day of May, how much will this deposit be worth a year later?
    The annual interest rate is 6%, so the monthly rate is 6%/12 = 0.5% = 0.005. For
$0 \le n \le 12$, let $p_n$ denote the value of Bonnie's deposit at the end of n months. Then
$p_{n+1} = p_n + 0.005p_n$, where $0.005p_n$ is the interest earned on $p_n$ during month n + 1,
for $0 \le n \le 11$, and $p_0 = \$1000$.
    The relation $p_{n+1} = (1.005)p_n$, $p_0 = \$1000$, has the solution $p_n = p_0(1.005)^n =
\$1000(1.005)^n$. Consequently, at the end of one year, Bonnie's deposit is worth
$\$1000(1.005)^{12} = \$1061.68$.
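The monthly compounding in this example can be replayed step by step. The sketch below is illustrative only: the function name is hypothetical, and floating-point arithmetic stands in for exact dollars and cents.

```python
def deposit_value(principal, monthly_rate, months):
    """Apply p_{n+1} = p_n + monthly_rate * p_n for the given number of months."""
    value = principal
    for _ in range(months):
        value += monthly_rate * value
    return value


after_one_year = deposit_value(1000.0, 0.005, 12)
rounded_to_cents = round(after_one_year, 2)
```

Rounding `after_one_year` to the nearest cent gives the $1061.68 obtained from the solution $1000(1.005)^12.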

In the next example we find a fifth way to count the number of compositions of a positive
               integer. The reader may recall that this situation was examined earlier in Examples 1.37,
               3.11, 4.12, and 9.12.

EXAMPLE 10.3
Figure 10.1 provides the compositions of 3 and 4. Here we see that compositions (1')-(4')
of 4 arise from the corresponding compositions of 3 by increasing the last summand (in each
corresponding composition of 3) by 1. The other four compositions of 4, namely, (1'')-(4''),
are obtained from the compositions of 3 by appending "+1" to each of the corresponding
compositions of 3. (The reader may recall seeing such results in Fig. 4.7.)

    Compositions of 3        Compositions of 4
    (1)  3                   (1')   4
    (2)  1 + 2               (2')   1 + 3
    (3)  2 + 1               (3')   2 + 2
    (4)  1 + 1 + 1           (4')   1 + 1 + 2
                             (1'')  3 + 1
                             (2'')  1 + 2 + 1
                             (3'')  2 + 1 + 1
                             (4'')  1 + 1 + 1 + 1

Figure 10.1

What happens in Fig. 10.1 exemplifies the general situation. So if we let $a_n$ count the
number of compositions of n, for $n \in \mathbf{Z}^+$, we find that

$a_{n+1} = 2a_n, \quad n \ge 1, \quad a_1 = 1.$

However, in order to apply the formula for the unique solution (where $n \ge 0$) to this
recurrence relation, we let $b_n = a_{n+1}$. Then we have

$b_{n+1} = 2b_n, \quad n \ge 0, \quad b_0 = 1,$

so $b_n = b_0(2^n) = 2^n$, and $a_n = b_{n-1} = 2^{n-1}$, $n \ge 1$.
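The doubling seen in Fig. 10.1 can be imitated directly: each composition of n - 1 yields one composition of n by raising its last summand and another by appending a 1. The Python sketch below follows that construction; the function name is hypothetical.

```python
def compositions(n):
    """List the compositions of n, built from those of n - 1 as in Fig. 10.1."""
    if n == 1:
        return [[1]]
    previous = compositions(n - 1)
    raised = [c[:-1] + [c[-1] + 1] for c in previous]  # increase the last summand by 1
    extended = [c + [1] for c in previous]             # append "+1"
    return raised + extended


of_four = compositions(4)
counts = [len(compositions(n)) for n in range(1, 11)]
```

For n = 4 the list comes out in the order (1')-(4'), (1'')-(4'') of the figure, and the counts for 1 ≤ n ≤ 10 can be compared with 2^(n-1).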

The recurrence relation $a_{n+1} - da_n = 0$ is called linear because each subscripted term
appears to the first power (as do the variables x and y in the equation of a line in the plane). In
a linear relation there are no products such as $a_n a_{n-1}$, which appears in the nonlinear
recurrence relation $a_{n+1} - 3a_n a_{n-1} = 0$. However, there are times when a nonlinear recurrence
relation can be transformed into a linear one by a suitable algebraic substitution.

EXAMPLE 10.4
Find $a_{12}$ if $a_{n+1}^2 = 5a_n^2$, where $a_n > 0$ for $n \ge 0$, and $a_0 = 2$.
    Although this recurrence relation is not linear in $a_n$, if we let $b_n = a_n^2$, then the new
relation $b_{n+1} = 5b_n$ for $n \ge 0$, and $b_0 = 4$, is a linear relation whose solution is $b_n = 4 \cdot 5^n$.
Therefore, $a_n = 2(\sqrt{5})^n$ for $n \ge 0$, and $a_{12} = 2(\sqrt{5})^{12} = 31{,}250$.
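The substitution used in this example can be traced numerically: iterate the linear relation for b_n, then take the positive square root. The sketch below is an illustration with hypothetical names; it uses integer arithmetic, which is exact here because b_12 is a perfect square.

```python
from math import isqrt


def b_term(n):
    """Iterate b_{k+1} = 5 * b_k from b_0 = 4, where b_k stands for a_k squared."""
    value = 4
    for _ in range(n):
        value *= 5
    return value


b_12 = b_term(12)
a_12 = isqrt(b_12)  # positive square root, since a_n > 0
```

The value of `a_12` agrees with 2(√5)^12 = 31,250.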

The general first-order linear recurrence relation with constant coefficients has the form
$a_{n+1} + ca_n = f(n)$, $n \ge 0$, where c is a constant and f(n) is a function on the set N of
nonnegative integers.
    When $f(n) = 0$ for all $n \in \mathbf{N}$, the relation is called homogeneous; otherwise it is called
                               nonhomogeneous. So far we have only dealt with homogeneous relations. Now we shall
                               solve a nonhomogeneous relation. We shall develop specific techniques that work for all
                               linear homogeneous recurrence relations with constant coefficients. However, many differ-
                               ent techniques prove useful when we deal with a nonhomogeneous problem, although none
                               allows us to solve everything that can arise.

EXAMPLE 10.5
Perhaps the most popular, though not the most efficient, method of sorting numeric data
is a technique called the bubble sort. Here the input is a positive integer n and an array
$x_1, x_2, x_3, \ldots, x_n$ of real numbers that are to be sorted into ascending order.
    The pseudocode procedure in Fig. 10.2 provides an implementation for an algorithm to
carry out this sorting process. Here the integer variable i is the counter for the outer for
loop, whereas the integer variable j is the counter for the inner for loop. Finally, the real
variable temp is used for storage that is needed when an exchange takes place.

procedure BubbleSort(n: positive integer; x_1, x_2, x_3, ..., x_n: real numbers)
begin
    for i := 1 to n - 1 do
        for j := n downto i + 1 do
            if x_j < x_{j-1} then
                begin    {interchange}
                    temp := x_{j-1}
                    x_{j-1} := x_j
                    x_j := temp
                end
end

Figure 10.2

We compare the last entry, $x_n$, in the given array with its immediate predecessor, $x_{n-1}$. If
$x_n < x_{n-1}$, we interchange the values stored in $x_{n-1}$ and $x_n$. In any event we will now have
$x_{n-1} \le x_n$. Then we compare $x_{n-1}$ with its immediate predecessor, $x_{n-2}$. If $x_{n-1} < x_{n-2}$,
we interchange them. We continue the process. After n - 1 such comparisons, the smallest
number in the list is stored in $x_1$. We then repeat this process for the n - 1 numbers now
stored in the (smaller) array $x_2, x_3, \ldots, x_n$. In this way, each time (counted by i) this process
is carried out, the smallest number in the remaining sublist "bubbles up" to the front of that
sublist.
    A small example wherein n = 5 and $x_1 = 7$, $x_2 = 9$, $x_3 = 2$, $x_4 = 5$, and $x_5 = 8$ is given
                               in Fig. 10.3 to show how the bubble sort of Fig. 10.2 places a given sequence in ascending
                               order. In this figure each comparison that leads to an interchange is denoted by the symbol
                               2; the symbol } indicates a comparison that results in no interchange.
    To determine the time-complexity function h(n) when this algorithm is used on an input
(array) of size $n \ge 1$, we count the total number of comparisons made in order to sort the n
given numbers into ascending order.
    If $a_n$ denotes the number of comparisons needed to sort n numbers in this way, then we
get the following recurrence relation:

$a_n = a_{n-1} + (n - 1), \quad n \ge 2, \quad a_1 = 0.$

[Figure 10.3: trace of the bubble sort on the array 7, 9, 2, 5, 8.
    i = 1: four comparisons and two interchanges; the array becomes 2, 7, 9, 5, 8.
    i = 2: three comparisons and two interchanges; the array becomes 2, 5, 7, 9, 8.
    i = 3: two comparisons and one interchange; the array becomes 2, 5, 7, 8, 9.
    i = 4: one comparison but no interchanges.]

Figure 10.3

This arises as follows. Given a list of numbers, we make n — 1 comparisons to bubble
the smallest number up to the start of the list. The remaining sublist of n — 1 numbers then
requires a, —; Comparisons in order to be completely sorted.
   This relation is a linear first-order relation with constant coefficients, but the term n ~— 1
makes it nonhomogeneous. Since we have no technique for attacking such a relation, let us
list some terms and see whether there is a recognizable pattern.

a_1 = 0
a_2 = a_1 + (2 - 1) = 1
a_3 = a_2 + (3 - 1) = 1 + 2
a_4 = a_3 + (4 - 1) = 1 + 2 + 3

In general, a_n = 1 + 2 + ... + (n - 1) = [(n - 1)n]/2 = (n^2 - n)/2.

As a result, the bubble sort determines the time-complexity function h: Z^+ → R given by h(n) = a_n = (n^2 - n)/2. [Here h(Z^+) ⊆ N.] Consequently, as a measure of the running time for the algorithm, we write h ∈ O(n^2). Hence the bubble sort is said to require O(n^2) comparisons.
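The count (n^2 - n)/2 can be checked by instrumenting a small program. The following Python sketch is not the procedure of Fig. 10.2 itself; it is a minimal version written under the assumption that each pass bubbles the smallest remaining number toward the start of the list, as described above. The name bubble_sort_count is ours, introduced only for illustration.

```python
def bubble_sort_count(values):
    """Sort a copy of `values` in ascending order.

    Returns (sorted_list, total_comparisons, interchanges_per_pass).
    Assumption: pass i bubbles the smallest remaining entry down to index i.
    """
    x = list(values)
    n = len(x)
    comparisons = 0
    interchanges_per_pass = []
    for i in range(n - 1):                 # the outer loop runs n - 1 times
        swaps = 0
        for j in range(n - 1, i, -1):      # compare x[j] with x[j - 1], from the end
            comparisons += 1
            if x[j] < x[j - 1]:
                x[j], x[j - 1] = x[j - 1], x[j]
                swaps += 1
        interchanges_per_pass.append(swaps)
    return x, comparisons, interchanges_per_pass


def predicted_comparisons(n):
    """The closed form a_n = (n^2 - n)/2 derived above."""
    return (n * n - n) // 2
```

For a list of five numbers the sketch makes 4 + 3 + 2 + 1 = 10 comparisons, in agreement with (5^2 - 5)/2 = 10. The number of comparisons does not depend on the order of the input; only the number of interchanges does.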

EXAMPLE 10.6
In part (c) of Example 9.6 we sought the generating function for the sequence 0, 2, 6, 12, 20, 30, 42, ..., and the solution rested upon our ability to recognize that a_n = n^2 + n for each n ∈ N. If we fail to see this, perhaps we can examine the given sequence and determine whether there is some other pattern that will help us.
   Here a_0 = 0, a_1 = 2, a_2 = 6, a_3 = 12, a_4 = 20, a_5 = 30, a_6 = 42, and

a_1 - a_0 = 2          a_3 - a_2 = 6          a_5 - a_4 = 10
a_2 - a_1 = 4          a_4 - a_3 = 8          a_6 - a_5 = 12.

These calculations suggest the recurrence relation

a_n - a_{n-1} = 2n,     n ≥ 1,     a_0 = 0.

To solve this relation, we proceed in a slightly different manner from the method we used
                             in Example 10.5. Consider the following n equations:

a_1 - a_0 = 2
a_2 - a_1 = 4
a_3 - a_2 = 6
   ...
a_n - a_{n-1} = 2n.

When we add these equations, the sum for the left-hand side will contain a_i and -a_i for all 1 ≤ i ≤ n - 1. So we obtain

a_n - a_0 = 2 + 4 + 6 + ... + 2n = 2(1 + 2 + 3 + ... + n)
          = 2[n(n + 1)/2] = n^2 + n.

Since a_0 = 0, it follows that a_n = n^2 + n for all n ∈ N, as we found earlier in part (c) of Example 9.6.
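As a quick numerical check of this telescoping argument, we can generate the terms directly from a_n = a_{n-1} + 2n and compare them with n^2 + n. This is only a sketch; the function names are ours.

```python
def terms_from_recurrence(n_max):
    """Return [a_0, a_1, ..., a_{n_max}] using a_n = a_{n-1} + 2n with a_0 = 0."""
    terms = [0]
    for n in range(1, n_max + 1):
        terms.append(terms[-1] + 2 * n)
    return terms


def closed_form(n):
    """The solution a_n = n^2 + n obtained by adding the n equations."""
    return n * n + n
```

The first seven terms produced by terms_from_recurrence(6) are 0, 2, 6, 12, 20, 30, 42, the sequence with which the example began.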

At this point we shall examine a recurrence relation with a variable coefficient.

EXAMPLE 10.7
Solve the relation a_n = n · a_{n-1}, where n ≥ 1 and a_0 = 1.
   Writing the first five terms defined by the relation, we have

a_0 = 1                    a_2 = 2 · a_1 = 2 · 1              a_4 = 4 · a_3 = 4 · 3 · 2 · 1
a_1 = 1 · a_0 = 1          a_3 = 3 · a_2 = 3 · 2 · 1

Therefore, a_n = n! and the solution is the discrete function a_n, which counts the number of permutations of n objects, n ≥ 0.

While on the subject of permutations, we shall examine a recursive algorithm for generating the permutations of {1, 2, 3, ..., n - 1, n} from those for {1, 2, 3, ..., n - 1}.* There is only one permutation of {1}. Examining the permutations of {1, 2},
                            1  2
                         2  1
we see that after writing the permutation 1 twice, we intertwine the number 2 about 1 to get the permutations listed. Writing each of these two permutations three times, we intertwine the number 3 and obtain
                            1     2  3
                            1  3  2
                         3  1     2
                         3  2     1
                            2  3  1
                            2     1  3
                     We see here that the first permutation is 123 and that we obtain each of the next two
                 permutations from its immediate predecessor by interchanging two numbers: 3 and the
                 integer to its left. When 3 reaches the left side of the permutation, we examine the remaining
                 numbers and permute them according to the list of permutations we generated for {1, 2}.
                 (This makes the procedure recursive.) After that we interchange 3 with the integer on its
                 right until 3 is on the right side of the permutation. We note that if we interchange 1 and 2
                 in the last permutation, we get 123, the first permutation listed.
                     Continuing for S = {1, 2, 3, 4}, we first list each of the six permutations of {1, 2, 3} four
                 times. Starting with the permutation 1234, we intertwine the 4 throughout the remaining
                 23 permutations as indicated in Table 10.1. The only new idea here develops
                 as follows. When progressing from permutation (5) to (6) to (7) to (8), we interchange 4
                 with the integer to its right. At permutation (8), where 4 has reached the right side, we
                 obtain permutation (9) by keeping the location of 4 fixed and replacing the permutation
                 132 by 312 from the list of permutations of {1, 2, 3}. After that we continue as for the first
                 eight permutations until we reach permutation (16), where 4 is again on the right. We then
                 permute 321 to obtain 231 and continue intertwining 4 until all 24 permutations have been
                 generated. Once again, if 1 and 2 are interchanged in the last permutation, we obtain the
                 first permutation in our list.
                      The chapter references provide more information on recursive procedures for generating
                 permutations and combinations.
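The intertwining procedure can be sketched in Python as follows. This is our own reading of the description above, not code from the text: the name intertwined_permutations is hypothetical, and we assume that n sweeps from right to left through the first, third, fifth, ... permutations of {1, 2, ..., n - 1} and from left to right through the others.

```python
def intertwined_permutations(n):
    """List the permutations of 1..n in the order obtained by intertwining n
    through the (recursively generated) permutations of 1..n-1."""
    if n == 1:
        return [(1,)]
    result = []
    for k, p in enumerate(intertwined_permutations(n - 1)):
        # Assumption: for even k the new symbol n moves right to left,
        # for odd k it moves left to right, as in the lists shown above.
        positions = range(n - 1, -1, -1) if k % 2 == 0 else range(n)
        for pos in positions:
            result.append(p[:pos] + (n,) + p[pos:])
    return result
```

For n = 3 the sketch returns 123, 132, 312, 321, 231, 213, the list displayed earlier. For n = 4 its ninth entry is 3124 and its seventeenth is 2314, in agreement with the discussion of permutations (9) and (17).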

We shall close this first section by returning to an earlier idea: the greatest common divisor of two positive integers.

EXAMPLE 10.8
Recursive methods are fundamental in the areas of discrete mathematics and the analysis of algorithms. Such methods arise when we want to solve a given problem by breaking it down, or referring it, to smaller similar problems. In many programming languages this can be implemented by the use of recursive functions and procedures, which are permitted to invoke themselves. This example will provide one such procedure.

*The material from here to the end of this section is a digression that uses the idea of recursion. It does not deal with methods for solving recurrence relations and may be omitted with no loss of continuity.

Table 10.1

 (1)   1 2 3 4
 (2)   1 2 4 3
 (3)   1 4 2 3
 (4)   4 1 2 3
 (5)   4 1 3 2
 (6)   1 4 3 2
 (7)   1 3 4 2
 (8)   1 3 2 4
 (9)   3 1 2 4
(10)   3 1 4 2
(11)   3 4 1 2
  ...
(15)   3 2 4 1
(16)   3 2 1 4
(17)   2 3 1 4
  ...
(22)   2 4 1 3
(23)   2 1 4 3
(24)   2 1 3 4

In computing gcd(333, 84) we obtain the following calculations when we use the Euclidean algorithm (presented in Section 4.4).

333 = 3(84) + 81          0 < 81 < 84          (1)
 84 = 1(81) + 3           0 < 3 < 81           (2)
 81 = 27(3) + 0.                               (3)

Since 3 is the last nonzero remainder, the Euclidean algorithm tells us that gcd(333, 84) = 3. However, if we use only the calculations in Eqs. (2) and (3), then we find that gcd(84, 81) = 3. And Eq. (3) alone implies that gcd(81, 3) = 3 because 3 divides 81. Consequently,

gcd(333, 84) = gcd(84, 81) = gcd(81, 3) = 3,

where the integers involved in the successive calculations get smaller as we go from Eq. (1) to Eq. (2) to Eq. (3).
                            We also observe that

81 = 333 mod 84     and     3 = 84 mod 81.

Therefore it follows that

gcd(333, 84) = gcd(84, 333 mod 84) = gcd(333 mod 84, 84 mod (333 mod 84)).

These results suggest the following recursive method for computing gcd(a, b), where a, b ∈ Z^+.
   Say we have the input a, b ∈ Z^+.
   Step 1: If b | a (or a mod b = 0), then gcd(a, b) = b.
   Step 2: If b ∤ a, then perform the following tasks in the order specified.
             i) Set a = b.
            ii) Set b = a mod b, where the value of a for this assignment is the old value of a.
           iii) Return to step (1).

These ideas are used in the pseudocode procedure in Fig. 10.4. (The reader may wish to
                               compare this procedure with the one given in Fig. 4.11.)

procedure gcd2(a, b: positive integers)
begin
  if a mod b = 0 then
    gcd = b
  else gcd = gcd2(b, a mod b)
end

Figure 10.4
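For readers who wish to run the procedure, a direct Python transcription of the pseudocode in Fig. 10.4 might look as follows. It is a sketch rather than part of the text; the check on the inputs is our addition, since the procedure assumes positive integers.

```python
def gcd2(a: int, b: int) -> int:
    """Recursive greatest common divisor, following Steps 1 and 2 above."""
    if a <= 0 or b <= 0:
        raise ValueError("gcd2 expects positive integers")
    if a % b == 0:            # Step 1: b divides a
        return b
    return gcd2(b, a % b)     # Step 2: replace (a, b) by (b, a mod b)
```

Calling gcd2(333, 84) retraces Eqs. (1), (2), and (3): the recursion passes through gcd2(84, 81) and gcd2(81, 3) before returning 3. If a < b, the first call simply interchanges the arguments, because a mod b = a in that case.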

EXERCISES 10.1

1. Find a recurrence relation, with initial condition, that uniquely determines each of the following geometric progressions.
    a) 2, 10, 50, 250, ...
    b) 6, -18, 54, -162, ...
    c) 7, 14/5, 28/25, 56/125, ...

2. Find the unique solution for each of the following recurrence relations.
    a) a_{n+1} - 1.5a_n = 0,   n ≥ 0
    b) 4a_n - 5a_{n-1} = 0,   n ≥ 1
    c) 3a_{n+1} - 4a_n = 0,   n ≥ 0,   a_1 = 5
    d) 2a_n - 3a_{n-1} = 0,   n ≥ 1,   a_4 = 81

3. If a_n, n ≥ 0, is the unique solution of the recurrence relation a_{n+1} - d·a_n = 0, and a_3 = 153/49, a_5 = 1377/2401, what is d?

4. The number of bacteria in a culture is 1000 (approximately), and this number increases 250% every two hours. Use a recurrence relation to determine the number of bacteria present after one day.

5. If Laura invests $100 at 6% interest compounded quarterly, how many months must she wait for her money to double? (She cannot withdraw the money before the quarter is up.)

6. Paul invested the stock profits he received 15 years ago in an account that paid 8% interest compounded quarterly. If his account now has $7218.27 in it, what was his initial investment?

7. Let x_1, x_2, ..., x_20 be a list of distinct real numbers to be sorted by the bubble-sort technique of Example 10.5. (a) After how many comparisons will the 10 smallest numbers of the original list be arranged in ascending order? (b) How many more comparisons are needed to finish this sorting job?

8. For the implementation of the bubble sort given in Fig. 10.2, the outer for loop is executed n - 1 times. This occurs regardless of whether any interchanges take place during the execution of the inner for loop. Consequently, for i = k, where 1 ≤ k ≤ n - 2, if the execution of the inner for loop results in no interchanges, then the list is in ascending order. So the execution of the outer for loop for k + 1 ≤ i ≤ n - 1 is not needed.
    a) For the situation described here, how many unnecessary comparisons are made if the execution of the inner for loop for i = k (1 ≤ k ≤ n - 2) results in no interchanges?
    b) Write an improved version of the bubble sort shown in Fig. 10.2. (Your result should eliminate the unnecessary comparisons discussed at the start of this exercise.)
    c) Using the number of comparisons as a measure of its running time, determine the best-case and the worst-case time complexities for the algorithm implemented in part (b).

9. Say the permutations of {1, 2, 3, 4, 5} are generated by the procedure developed after Example 10.7. (a) What is the last permutation in the list? (b) What two permutations precede 25134? (c) What three permutations follow 25134?

10. For n ≥ 1, a permutation p_1, p_2, p_3, ..., p_n of the integers 1, 2, 3, ..., n is called orderly if, for each i = 1, 2, 3, ..., n - 1, there exists a j > i such that |p_j - p_i| = 1. [If n = 2, the permutations 1, 2 and 2, 1 are both orderly. When n = 3 we find that 3, 1, 2 is an orderly permutation, while 2, 3, 1 is not. (Why not?)] (a) List all the orderly permutations for 1, 2, 3. (b) List all the orderly permutations for 1, 2, 3, 4. (c) If p_1, p_2, p_3, p_4, p_5 is an orderly permutation of 1, 2, 3, 4, 5, what value(s) can p_1 be? (d) For n ≥ 1, let a_n count the number of orderly permutations for 1, 2, 3, ..., n. Find and solve a recurrence relation for a_n.

10.2
           The Second-Order Linear
           Homogeneous Recurrence
      Relation with Constant Coefficients
Let k ∈ Z^+ and C_0 (≠ 0), C_1, C_2, ..., C_k (≠ 0) be real numbers. If a_n, for n ≥ 0, is a discrete function, then

C_0 a_n + C_1 a_{n-1} + C_2 a_{n-2} + ... + C_k a_{n-k} = f(n),     n ≥ k,

is a linear recurrence relation (with constant coefficients) of order k. When f(n) = 0 for all n ≥ 0, the relation is called homogeneous; otherwise, it is called nonhomogeneous.
   In this section we shall concentrate on the homogeneous relation of order two:

C_0 a_n + C_1 a_{n-1} + C_2 a_{n-2} = 0,     n ≥ 2.

On the basis of our work in Section 10.1, we seek a solution of the form a_n = c r^n, where c ≠ 0 and r ≠ 0.
   Substituting a_n = c r^n into C_0 a_n + C_1 a_{n-1} + C_2 a_{n-2} = 0, we obtain

C_0 c r^n + C_1 c r^{n-1} + C_2 c r^{n-2} = 0.

With c, r ≠ 0, this becomes C_0 r^2 + C_1 r + C_2 = 0, a quadratic equation which is called the characteristic equation. The roots r_1, r_2 of this equation determine the following three cases: (a) r_1, r_2 are distinct real numbers; (b) r_1, r_2 form a complex conjugate pair; or (c) r_1, r_2 are real, but r_1 = r_2. In all cases, r_1 and r_2 are called the characteristic roots.

Case (A): (Distinct Real Roots)

EXAMPLE 10.9
Solve the recurrence relation a_n + a_{n-1} - 6a_{n-2} = 0, where n ≥ 2 and a_0 = -1, a_1 = 8.
   If a_n = c r^n with c, r ≠ 0, we obtain c r^n + c r^{n-1} - 6c r^{n-2} = 0, from which the characteristic equation r^2 + r - 6 = 0 follows:

0 = r^2 + r - 6 = (r + 3)(r - 2)  ⟹  r = 2, -3.

   Since we have two distinct real roots, a_n = 2^n and a_n = (-3)^n are both solutions [as are b(2^n) and d(-3)^n, for arbitrary constants b, d]. They are linearly independent solutions because one is not a multiple of the other; that is, there is no real constant k such that (-3)^n = k(2^n) for all n ∈ N.* We write a_n = c_1(2^n) + c_2(-3)^n for the general solution, where c_1, c_2 are arbitrary constants.
   With a_0 = -1 and a_1 = 8, c_1 and c_2 are determined as follows:

-1 = a_0 = c_1(2^0) + c_2(-3)^0 = c_1 + c_2
 8 = a_1 = c_1(2^1) + c_2(-3)^1 = 2c_1 - 3c_2.

Solving this system of equations, one finds c_1 = 1, c_2 = -2. Therefore, a_n = 2^n - 2(-3)^n, n ≥ 0, is the unique solution of the given recurrence relation.
   The reader should realize that to determine the unique solution of a second-order linear homogeneous recurrence relation with constant coefficients one needs two initial conditions (values); that is, the value of a_n for two values of n, very often n = 0 and n = 1, or n = 1 and n = 2.

*We can also call the solutions a_n = 2^n and a_n = (-3)^n linearly independent when the following condition is satisfied: For k_1, k_2 ∈ R, if k_1(2^n) + k_2(-3)^n = 0 for all n ∈ N, then k_1 = k_2 = 0.
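The steps of Case (A) can be collected in a short Python sketch. It assumes real coefficients with two distinct real characteristic roots and initial values given at n = 0 and n = 1; the function names are ours and the arithmetic is done in floating point, so the results should be read as numerical checks rather than exact values.

```python
from math import sqrt


def solve_distinct_real_roots(C0, C1, C2, a0, a1):
    """For C0*a_n + C1*a_{n-1} + C2*a_{n-2} = 0 return (r1, r2, c1, c2)
    with a_n = c1*r1**n + c2*r2**n. Assumes two distinct real roots."""
    disc = C1 * C1 - 4 * C0 * C2
    if disc <= 0:
        raise ValueError("this sketch handles only distinct real roots")
    r1 = (-C1 + sqrt(disc)) / (2 * C0)
    r2 = (-C1 - sqrt(disc)) / (2 * C0)
    # Solve c1 + c2 = a0 and c1*r1 + c2*r2 = a1.
    c1 = (a1 - a0 * r2) / (r1 - r2)
    c2 = a0 - c1
    return r1, r2, c1, c2


def iterate(C0, C1, C2, a0, a1, n_max):
    """Generate a_0, ..., a_{n_max} directly from the recurrence relation."""
    terms = [a0, a1]
    for _ in range(2, n_max + 1):
        terms.append(-(C1 * terms[-1] + C2 * terms[-2]) / C0)
    return terms[: n_max + 1]
```

With the data of Example 10.9, solve_distinct_real_roots(1, 1, -6, -1, 8) gives the roots 2 and -3 with c_1 = 1 and c_2 = -2, and the terms produced by iterate agree with 2^n - 2(-3)^n.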

An interesting second-order homogeneous recurrence relation is the Fibonacci relation.
                  (This was mentioned earlier in Sections 4.2 and 9.6.)

EXAMPLE 10.10
Solve the recurrence relation F_{n+2} = F_{n+1} + F_n, where n ≥ 0 and F_0 = 0, F_1 = 1.
   As in the previous example, let F_n = c r^n, for c, r ≠ 0, n ≥ 0. Upon substitution we get c r^{n+2} = c r^{n+1} + c r^n. This gives the characteristic equation r^2 - r - 1 = 0. The characteristic roots are r = (1 ± √5)/2, so the general solution is

F_n = c_1 [(1 + √5)/2]^n + c_2 [(1 - √5)/2]^n.

   To solve for c_1, c_2, we use the given initial values and write 0 = F_0 = c_1 + c_2, 1 = F_1 = c_1[(1 + √5)/2] + c_2[(1 - √5)/2]. Since -c_1 = c_2, we have 2 = c_1(1 + √5) - c_1(1 - √5) and c_1 = 1/√5. The solution is given by

F_n = (1/√5) { [(1 + √5)/2]^n - [(1 - √5)/2]^n },     n ≥ 0.

   When dealing with the Fibonacci numbers one often finds the assignments α = (1 + √5)/2 and β = (1 - √5)/2, where α is known as the golden ratio. As a result, we find that

F_n = (α^n - β^n)/√5 = (α^n - β^n)/(α - β),     n ≥ 0.

[This representation is referred to as the Binet form for F_n, as it was first published in 1843 by Jacques Philippe Marie Binet (1786-1856).]
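The Binet form can be compared numerically with the values produced by the recurrence relation. The sketch below uses floating-point arithmetic, so the Binet value is rounded to the nearest integer before the comparison; this is reliable only for moderate n. The names are ours.

```python
from math import sqrt

ALPHA = (1 + sqrt(5)) / 2   # the golden ratio
BETA = (1 - sqrt(5)) / 2


def fib(n):
    """F_n from F_{n+2} = F_{n+1} + F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_binet(n):
    """F_n from the Binet form (a floating-point approximation)."""
    return (ALPHA ** n - BETA ** n) / sqrt(5)
```

For example, round(fib_binet(10)) and fib(10) both give 55.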

EXAMPLE 10.11
For n ≥ 0, let S = {1, 2, 3, ..., n} (when n = 0, S = ∅), and let a_n denote the number of subsets of S that contain no consecutive integers. Find and solve a recurrence relation for a_n.
   For 0 ≤ n ≤ 4, we have a_0 = 1, a_1 = 2, a_2 = 3, a_3 = 5, and a_4 = 8. [For example, a_3 = 5 because S = {1, 2, 3} has ∅, {1}, {2}, {3}, and {1, 3} as subsets with no consecutive integers (and no other such subsets).] These first five terms are reminiscent of the Fibonacci sequence. But do things change as we continue?
   Let n ≥ 2 and S = {1, 2, 3, ..., n - 2, n - 1, n}. If A ⊆ S and A is to be counted in a_n, there are two possibilities:

   a) n ∈ A: When this happens (n - 1) ∉ A, and A - {n} would be counted in a_{n-2}.
   b) n ∉ A: For this case A would be counted in a_{n-1}.

   These two cases are exhaustive and mutually disjoint, so we conclude that a_n = a_{n-1} + a_{n-2}, where n ≥ 2 and a_0 = 1, a_1 = 2, is the recurrence relation for the problem. Now we could solve for a_n, but if we notice that a_n = F_{n+2}, n ≥ 0, then the result of Example 10.10 implies that

a_n = (1/√5) { [(1 + √5)/2]^{n+2} - [(1 - √5)/2]^{n+2} },     n ≥ 0.
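For small n the claim a_n = F_{n+2} can be confirmed by listing the subsets outright. The following brute-force sketch (our own, with hypothetical function names) examines every subset of {1, 2, ..., n}, so it is practical only for small n.

```python
from itertools import combinations


def count_no_consecutive(n):
    """Count the subsets of {1, ..., n} that contain no two consecutive integers."""
    count = 0
    for size in range(n + 1):
        for subset in combinations(range(1, n + 1), size):
            # combinations() yields increasing tuples, so checking neighbors suffices.
            if all(b - a > 1 for a, b in zip(subset, subset[1:])):
                count += 1
    return count


def fib(n):
    """F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

The values count_no_consecutive(n) for n = 0, 1, 2, 3, 4 are 1, 2, 3, 5, 8, as listed in the example.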

EXAMPLE 10.12
Suppose we have a 2 × n chessboard, for n ∈ Z^+. The case for n = 4 is shown in part (a) of Fig. 10.5. We wish to cover such a chessboard using 2 × 1 (vertical) dominoes, which can also be used as 1 × 2 (horizontal) dominoes. Such dominoes (or tiles) are shown in part (b) of Fig. 10.5.

[Figure 10.5: (a) a 2 × 4 chessboard; (b) the vertical and horizontal dominoes; (c) the two coverings of a 2 × 2 chessboard.]
Figure 10.5

   For n ∈ Z^+ we let b_n count the number of ways we can cover (or tile) a 2 × n chessboard using our 2 × 1 and 1 × 2 dominoes. Here b_1 = 1, for a 2 × 1 chessboard necessitates one 2 × 1 (vertical) domino. A 2 × 2 chessboard can be covered in two ways: using two 2 × 1 (vertical) dominoes or two 1 × 2 (horizontal) dominoes, as shown in part (c) of the figure. Hence b_2 = 2. For n ≥ 3, consider the last (nth) column of a 2 × n chessboard. This column can be covered in two ways.
    i) By one 2 × 1 (vertical) domino: Here the remaining 2 × (n - 1) subboard can be covered in b_{n-1} ways.
   ii) By the right squares of two 1 × 2 (horizontal) dominoes placed one above the other: Now the remaining 2 × (n - 2) subboard can be covered in b_{n-2} ways.

Since these two ways have nothing in common and deal with all possibilities, we may write

b_n = b_{n-1} + b_{n-2},     n ≥ 3,     b_1 = 1,     b_2 = 2.

We find that b_n = F_{n+1}, so here is another situation where the Fibonacci numbers arise. The result from Example 10.10 gives us b_n = (1/√5)[((1 + √5)/2)^{n+1} - ((1 - √5)/2)^{n+1}], n ≥ 1.
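To see the Fibonacci numbers emerge without appealing to the recurrence relation, one can count the coverings by explicit backtracking. The sketch below is ours and is meant only for small boards; it always fills the leftmost column that still has an uncovered square.

```python
def count_tilings(n):
    """Count the coverings of a 2 x n board by dominoes, by direct search."""
    filled = [[False] * n for _ in range(2)]

    def place(col):
        # Skip columns that are already completely covered.
        while col < n and filled[0][col] and filled[1][col]:
            col += 1
        if col == n:
            return 1                                  # every square is covered
        total = 0
        if not filled[0][col] and not filled[1][col]:
            # One vertical domino covers both squares of this column.
            filled[0][col] = filled[1][col] = True
            total += place(col)
            filled[0][col] = filled[1][col] = False
        # A horizontal domino starting at the first uncovered square of this column.
        row = 0 if not filled[0][col] else 1
        if col + 1 < n and not filled[row][col + 1]:
            filled[row][col] = filled[row][col + 1] = True
            total += place(col)
            filled[row][col] = filled[row][col + 1] = False
        return total

    return place(0)
```

The search gives 1, 2, 3, 5, 8 coverings for n = 1, 2, 3, 4, 5, matching b_n = F_{n+1}.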

EXAMPLE 10.13
At this point we examine an interesting application where the number α = (1 + √5)/2 plays a major role. This application deals with Gabriel Lamé's work in estimating the number of divisions used in the Euclidean algorithm to find gcd(a, b), where a, b ∈ Z^+ with a ≥ b ≥ 2. To find this estimate we need the following property of the Fibonacci numbers, which can be established by the alternative form of the Principle of Mathematical Induction. (A proof is requested in the Section Exercises.)
Property: For n ≥ 3, F_n > α^{n-2}.

   Addressing the problem at hand (namely, estimating the number of divisions when the Euclidean algorithm is used to find gcd(a, b)), we recall the following steps from Theorem 4.7.

Letting r_0 = a and r_1 = b, we have

r_0 = q_1 r_1 + r_2,               0 < r_2 < r_1
r_1 = q_2 r_2 + r_3,               0 < r_3 < r_2
r_2 = q_3 r_3 + r_4,               0 < r_4 < r_3
   ...
r_{n-2} = q_{n-1} r_{n-1} + r_n,   0 < r_n < r_{n-1}
r_{n-1} = q_n r_n.

So r_n, the last nonzero remainder, is gcd(a, b).
   From the subscripts on r we see that n divisions have been performed in determining r_n = gcd(a, b). In addition, q_i ≥ 1, for all 1 ≤ i ≤ n - 1, and q_n ≥ 2 because r_n < r_{n-1}. Examining the n nonzero remainders r_n, r_{n-1}, r_{n-2}, ..., r_2, and r_1 (= b), we learn that

r_n > 0, so r_n ≥ 1 = F_2.
[(q_n ≥ 2) ∧ (r_n ≥ 1)] ⟹ r_{n-1} = q_n r_n ≥ 2 · 1 = 2 = F_3
r_{n-2} = q_{n-1} r_{n-1} + r_n ≥ 1 · r_{n-1} + r_n ≥ F_3 + F_2 = F_4
   ...
r_2 = q_3 r_3 + r_4 ≥ 1 · r_3 + r_4 ≥ F_{n-1} + F_{n-2} = F_n
b = r_1 = q_2 r_2 + r_3 ≥ 1 · r_2 + r_3 ≥ F_n + F_{n-1} = F_{n+1}.
Therefore, if n divisions are performed by the Euclidean algorithm to determine gcd(a, b), with a ≥ b ≥ 2, then b ≥ F_{n+1}. So by virtue of the property introduced earlier, we may write b > α^{(n+1)-2} = α^{n-1} = [(1 + √5)/2]^{n-1}. Consequently, we find now that

b > α^{n-1} ⟹ log_10 b > log_10(α^{n-1}) = (n - 1) log_10 α > (n - 1)/5,

since log_10 α = log_10[(1 + √5)/2] ≈ 0.208988 > 0.2 = 1/5.
   At this point suppose that 10^{k-1} ≤ b < 10^k, so that the decimal (base 10) representation of b has k digits. Then

k = log_10 10^k > log_10 b > (n - 1)/5,     and     n < 5k + 1.

With n, k ∈ Z^+ we have n < 5k + 1 ⟹ n ≤ 5k, and this last inequality now completes a proof for the following.
Lamé's Theorem: Let a, b ∈ Z^+ with a ≥ b ≥ 2. Then the number of divisions needed, in the Euclidean algorithm, to determine gcd(a, b) is at most 5 times the number of decimal digits in b.
   Before closing this example, we learn one more fact from Lamé's Theorem. Since b ≥ 2, it follows that log_10 b ≥ log_10 2, so 5 log_10 b ≥ 5 log_10 2 = log_10 2^5 = log_10 32 > 1. From above we know that n - 1 < 5 log_10 b, so

n < 1 + 5 log_10 b < 5 log_10 b + 5 log_10 b = 10 log_10 b

and n ∈ O(log_10 b). [Hence, the number of divisions needed, in the Euclidean algorithm, to determine gcd(a, b), for a, b ∈ Z^+ with a ≥ b ≥ 2, is O(log_10 b); that is, on the order of the number of decimal digits in b.]
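Lamé's Theorem is easy to test experimentally. The sketch below (our own; the function names are hypothetical) counts the divisions performed by an iterative form of the Euclidean algorithm and compares the count with 5 times the number of decimal digits of b.

```python
def euclid_divisions(a, b):
    """Return (gcd(a, b), number of divisions performed) for positive integers a, b."""
    divisions = 0
    while b != 0:
        a, b = b, a % b       # one division: a = q*b + r, then continue with (b, r)
        divisions += 1
    return a, divisions


def lame_bound(b):
    """Five times the number of decimal digits of b."""
    return 5 * len(str(b))
```

For a = 333 and b = 84 the algorithm uses 3 divisions, well under the bound of 10. Consecutive Fibonacci numbers are the extreme inputs: for a = 13 and b = 8 it uses 5 divisions, which equals the bound for a one-digit b.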

Returning to the theme of the section we now examine a recurrence relation in a computer
                             science application.

In many programming languages one may consider those legal arithmetic expressions,
      EXAMPLE 10.14
                             without parentheses, that are made up of the digits 0, 1, 2,..., 9 and the binary operation
                             symbols +, *, /. For example, 3 + 4 and 2 + 3 * 5 are legal arithmetic expressions; 8 + * 9
                             is not. Here 2+ 3 x 5 = 17, since there is a hierarchy of operations: Multiplication and
                             division are performed before addition. Operations at the same level are performed in their
                             order of appearance as the expression is scanned from left to right.
                                 For n € Z*, let a, be the number of these (legal) arithmetic expressions that are made
                             up of n symbols. Then a; = 10, since the arithmetic expressions of one symbol are the 10
                             digits. Next a2 = 100. This accounts for the expressions 00, 01,..., 09, 10, 11,..., 99.
                             (There are no unnecessary leading plus signs.) When n > 3, we consider two cases in order
                             to derive a recurrence relation for a,:

1) If x is an arithmetic expression of n — 1 symbols, the last symbol must be a digit.
                                   Adding one more digit to the right of x, we get 10a,_, arithmetic expressions of n
                                   symbols where the last two symbols are digits.
                                2) Now let y be an arithmetic expression of » — 2 symbols. To obtain an arithmetic
                                   expression with n symbols (that is not counted in case 1), we adjoin to the right of y one
                                      of the 29 two-symbol expressions +1, ..., +9, +0, «1, ..., *9, x0, /1,..., /9.

From these two cases we have $a_n = 10a_{n-1} + 29a_{n-2}$, where $n \ge 3$ and $a_1 = 10$, $a_2 = 100$.
Here the characteristic roots are $5 \pm 3\sqrt{6}$ and the solution is
$a_n = (5/(3\sqrt{6}))[(5 + 3\sqrt{6})^n - (5 - 3\sqrt{6})^n]$ for $n \ge 1$. (Verify this result.)
    Another way to complete the solution of this problem is to use the recurrence relation
$a_n = 10a_{n-1} + 29a_{n-2}$, with $a_2 = 100$ and $a_1 = 10$, to calculate a value for $a_0$, namely,
$a_0 = (a_2 - 10a_1)/29 = 0$. The solution for the recurrence relation

$$a_n = 10a_{n-1} + 29a_{n-2}, \quad n \ge 2, \quad a_0 = 0, \quad a_1 = 10$$

is

$$a_n = (5/(3\sqrt{6}))[(5 + 3\sqrt{6})^n - (5 - 3\sqrt{6})^n], \quad n \ge 0.$$
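The request to verify this result can be carried out numerically. The following is a minimal sketch, not part of the original text: it assumes that "legal" means the expression starts and ends with a digit, never has two adjacent operators, and never has "/" followed directly by 0 (this reading is inferred from the 29 two-symbol suffixes listed in case 2). The function names are illustrative only.

```python
from itertools import product
from math import sqrt

DIGITS = "0123456789"
OPERATORS = "+*/"


def is_legal(symbols):
    """Assumed legality rule: digit at both ends, no adjacent operators, no '/0'."""
    if symbols[0] in OPERATORS or symbols[-1] in OPERATORS:
        return False
    for left, right in zip(symbols, symbols[1:]):
        if left in OPERATORS and right in OPERATORS:
            return False
        if left == "/" and right == "0":
            return False
    return True


def count_by_enumeration(n):
    """Count legal expressions of n symbols by checking all 13**n strings."""
    return sum(1 for s in product(DIGITS + OPERATORS, repeat=n) if is_legal(s))


def count_by_recurrence(n):
    """Iterate a_n = 10*a_(n-1) + 29*a_(n-2) starting from a_0 = 0, a_1 = 10."""
    a_prev, a_curr = 0, 10
    for _ in range(n - 1):
        a_prev, a_curr = a_curr, 10 * a_curr + 29 * a_prev
    return a_curr if n >= 1 else a_prev


def count_by_formula(n):
    """Evaluate the closed form with floating point and round to an integer."""
    root = 3 * sqrt(6)
    return round(5 / root * ((5 + root) ** n - (5 - root) ** n))


for n in range(1, 5):
    print(n, count_by_enumeration(n), count_by_recurrence(n), count_by_formula(n))
```

The enumeration is only practical for small n, since it examines 13^n strings; the floating-point formula is likewise reliable only while the values stay well inside double precision.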

A second method for counting palindromes arises in our next example.

EXAMPLE 10.15
In Fig. 10.6 we find the palindromes of 3, 4, 5, and 6, that is, the compositions of 3, 4, 5,
and 6 that read the same left to right as right to left. (We saw this concept earlier in Example
9.13.) Consider first the palindromes of 3 and 5. To build the palindromes of 5 from those
of 3 we do the following:
    i) Add 1 to the first and last summands in a palindrome of 3. This is how we get
       palindromes (1') and (2') for 5 from the respective palindromes (1) and (2) for
       3. [Note: When we have a one-summand palindrome n we get the one-summand
       palindrome n + 2. That is how we build palindrome (1') for 5 from palindrome (1)
       for 3.]
   ii) Append "1 +" to the start and "+ 1" to the end of each palindrome of 3. This technique
       generates the palindromes (1'') and (2'') for 5 from the respective palindromes (1)
       and (2) for 3.
                              10.2 The Second-Order Linear Homogeneous Recurrence Relation with Constant Coefficients                461

Palindromes of 3     Palindromes of 5        Palindromes of 4     Palindromes of 6
(1)  3               (1')   5                (1)  4               (1')   6
(2)  1+1+1           (2')   2+1+2            (2)  1+2+1           (2')   2+2+2
                     (1'')  1+3+1            (3)  2+2             (3')   3+3
                     (2'')  1+1+1+1+1        (4)  1+1+1+1         (4')   2+1+1+2
                                                                  (1'')  1+4+1
                                                                  (2'')  1+1+2+1+1
                                                                  (3'')  1+2+2+1
                                                                  (4'')  1+1+1+1+1+1
Figure 10.6

The situation is similar for building the palindromes of 6 from those of 4.
    The preceding observations lead us to the following. For $n \in \mathbf{Z}^+$, let $p_n$ count the number
of palindromes of $n$. Then

$$p_n = 2p_{n-2}, \quad n \ge 3, \quad p_1 = 1, \quad p_2 = 2.$$

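Substituting $p_n = cr^n$, for $c, r \ne 0$, $n \ge 1$, into this recurrence relation, the resulting
characteristic equation is $r^2 - 2 = 0$. The characteristic roots are $r = \pm\sqrt{2}$, so
$p_n = c_1(\sqrt{2})^n + c_2(-\sqrt{2})^n$. From

$$1 = p_1 = c_1(\sqrt{2}) + c_2(-\sqrt{2})$$
$$2 = p_2 = c_1(\sqrt{2})^2 + c_2(-\sqrt{2})^2$$

we find that $c_1 = \frac{1}{2} + \frac{1}{2\sqrt{2}}$, $c_2 = \frac{1}{2} - \frac{1}{2\sqrt{2}}$, so

$$p_n = \left(\frac{1}{2} + \frac{1}{2\sqrt{2}}\right)(\sqrt{2})^n + \left(\frac{1}{2} - \frac{1}{2\sqrt{2}}\right)(-\sqrt{2})^n, \quad n \ge 1.$$

Unfortunately, this does not look like the result found in Example 9.13. After all, that answer
contained no radical terms. However, suppose we consider $n$ even, say $n = 2k$. Then

$$p_n = \left(\frac{1}{2} + \frac{1}{2\sqrt{2}}\right)(\sqrt{2})^{2k} + \left(\frac{1}{2} - \frac{1}{2\sqrt{2}}\right)(-\sqrt{2})^{2k}
= \left[\left(\frac{1}{2} + \frac{1}{2\sqrt{2}}\right) + \left(\frac{1}{2} - \frac{1}{2\sqrt{2}}\right)\right]2^k = 2^k = 2^{n/2}.$$

For $n$ odd, say $n = 2k - 1$, $k \in \mathbf{Z}^+$, we leave it for the reader to show that
$p_n = 2^{k-1} = 2^{(n-1)/2}$.
    The preceding results can be expressed by $p_n = 2^{\lfloor n/2 \rfloor}$, $n \ge 1$, as we found in Example
9.13.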
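The formula $p_n = 2^{\lfloor n/2 \rfloor}$ can be checked by listing compositions directly. The sketch below is an illustration only (the helper names `compositions` and `palindromes` are not from the text): it generates every composition of n as a tuple of summands and keeps those that read the same in both directions.

```python
def compositions(n):
    """Yield every composition of n as a tuple of positive summands."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def palindromes(n):
    """Return the compositions of n that equal their own reversal."""
    return [c for c in compositions(n) if c == c[::-1]]


for n in range(1, 9):
    found = palindromes(n)
    print(n, len(found), 2 ** (n // 2))
```

For n = 5 the list returned by `palindromes` contains exactly the four palindromes labeled (1'), (2'), (1''), and (2'') in the example.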

The recurrence relation for the next example will be set up in two ways. In the first part
                         we shall see how auxiliary variables may be helpful.

EXAMPLE 10.16
Find a recurrence relation for the number of binary sequences of length $n$ that have no
consecutive 0's.
462         Chapter 10 Recurrence Relations

a) For $n \ge 1$, let $a_n$ be the number of such sequences of length $n$. Let $a_n^{(0)}$ count those
   that end in 0, and $a_n^{(1)}$ those that end in 1. Then $a_n = a_n^{(0)} + a_n^{(1)}$.
       We derive a recurrence relation for $a_n$, $n \ge 1$, by computing $a_1 = 2$ and then
   considering each sequence $x$ of length $n - 1$ ($> 0$) where $x$ contains no consecutive 0's.
   If $x$ ends in 1, then we can append a 0 or a 1 to it, giving us $2a_{n-1}^{(1)}$ of the sequences
   counted by $a_n$. If the sequence $x$ ends in 0, then only 1 can be appended, resulting in
   $a_{n-1}^{(0)}$ sequences counted by $a_n$. Since these two cases exhaust all possibilities and have
   nothing in common, we have

   $$a_n = 2a_{n-1}^{(1)} + a_{n-1}^{(0)}.$$

   (For the first term the $n$th position can be 0 or 1; for the second term the $n$th position
   can only be 1.)

   If we consider any sequence $y$ counted in $a_{n-2}$ we find that the sequence $y1$ is counted
   in $a_{n-1}^{(1)}$. Likewise, if the sequence $z1$ is counted in $a_{n-1}^{(1)}$, then $z$ is counted in $a_{n-2}$.
   Consequently, $a_{n-2} = a_{n-1}^{(1)}$ and

   $$a_n = a_{n-1}^{(1)} + [a_{n-1}^{(1)} + a_{n-1}^{(0)}] = a_{n-1}^{(1)} + a_{n-1} = a_{n-1} + a_{n-2}.$$

   Therefore the recurrence relation for this problem is $a_n = a_{n-1} + a_{n-2}$, where $n \ge 3$
   and $a_1 = 2$, $a_2 = 3$. (We leave the details of the solution for the reader.)
b) Alternatively, if $n \ge 1$ and $a_n$ counts the number of binary sequences with no
   consecutive 0's, then $a_1 = 2$ and $a_2 = 3$, and for $n \ge 3$ we consider the binary sequences
   counted by $a_n$. There are two possibilities for these sequences:
       (Case 1: The $n$th symbol is 1) Here we find that the preceding $n - 1$ symbols form
       a binary sequence with no consecutive 0's. There are $a_{n-1}$ such sequences.
       (Case 2: The $n$th symbol is 0) Here each such sequence actually ends in 10 and the
       first $n - 2$ symbols provide a binary sequence with no consecutive 0's. In this case
       there are $a_{n-2}$ such sequences.
   Since these two cases cover all the possibilities and have no such sequence in common,
   we may write

   $$a_n = a_{n-1} + a_{n-2}, \quad n \ge 3, \quad a_1 = 2, \quad a_2 = 3,$$

   as we found in part (a).
    In both part (a) and part (b) we can use the recurrence relation and $a_1 = 2$, $a_2 = 3$ to
go back and determine a value for $a_0$, namely, $a_0 = a_2 - a_1 = 3 - 2 = 1$. Then we can
solve the recurrence relation

$$a_n = a_{n-1} + a_{n-2}, \quad n \ge 2, \quad a_0 = 1, \quad a_1 = 2.$$
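Both derivations can be checked against a direct enumeration. The following sketch is illustrative only (the function names are not from the text): it lists the binary sequences of length n with no consecutive 0's, compares the count with the recurrence started from $a_0 = 1$, $a_1 = 2$, and also confirms the auxiliary fact from part (a) that the sequences of length n - 1 ending in 1 are counted by $a_{n-2}$.

```python
from itertools import product


def sequences_without_00(n):
    """List all binary strings of length n that contain no '00'."""
    strings = ("".join(bits) for bits in product("01", repeat=n))
    return [s for s in strings if "00" not in s]


def count_by_recurrence(n):
    """Iterate a_n = a_(n-1) + a_(n-2) starting from a_0 = 1, a_1 = 2."""
    a_prev, a_curr = 1, 2
    for _ in range(n - 1):
        a_prev, a_curr = a_curr, a_curr + a_prev
    return a_curr if n >= 1 else a_prev


for n in range(2, 11):
    shorter = sequences_without_00(n - 1)
    ending_in_1 = sum(1 for s in shorter if s.endswith("1"))
    print(n, len(sequences_without_00(n)), count_by_recurrence(n), ending_in_1)
```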

Before going any further we want to be sure that the reader understands why a general
                             argument is needed when we develop our recurrence relations. When we are proving a
                             theorem we do not draw any general conclusions from a few (or even, perhaps, many)
                             particular instances. The same is true here. The following example should serve to drive
                             this point home.

EXAMPLE 10.17
We start with $n$ identical pennies and let $a_n$ count the number of ways we can arrange these
pennies, contiguous in each row, where each penny above the bottom row touches two
pennies in the row below it. (In these arrangements we are not concerned with whether any
given penny is heads up or heads down.) In Fig. 10.7 we have the possible arrangements
for $1 \le n \le 6$. From this it follows that

$$a_1 = 1, \quad a_2 = 1, \quad a_3 = 2, \quad a_4 = 3, \quad a_5 = 5, \quad \text{and} \quad a_6 = 8.$$

Consequently, these results might suggest that, in general, $a_n = F_n$, the $n$th Fibonacci
number. Unfortunately, we have been led astray, as one finds, for example, that

$$a_7 = 12 \ne 13 = F_7, \quad a_8 = 18 \ne 21 = F_8, \quad \text{and} \quad a_9 = 26 \ne 34 = F_9.$$

(The arrangements in this example were studied by F. C. Auluck in reference [2].)
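The values quoted above can be reproduced by a short count. This is a sketch under a stated modeling assumption, not the method used in the text: each row is taken to be one contiguous block of pennies, and a row of j pennies resting on a row of k pennies (with j < k) can be placed in k - j positions, since each of its pennies sits in a gap between two adjacent pennies below. The function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ways_above(remaining, row_length):
    """Ways to stack `remaining` pennies on top of a row of `row_length` pennies."""
    if remaining == 0:
        return 1
    total = 0
    for next_length in range(1, min(row_length - 1, remaining) + 1):
        positions = row_length - next_length  # where the shorter row can sit
        total += positions * ways_above(remaining - next_length, next_length)
    return total


def penny_arrangements(n):
    """Sum over every possible length of the bottom row."""
    return sum(ways_above(n - bottom, bottom) for bottom in range(1, n + 1))


def fibonacci(n):
    """Return F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


for n in range(1, 10):
    print(n, penny_arrangements(n), fibonacci(n))
```

The two columns agree through n = 6 and part company at n = 7, which is the point of the example: agreement on a few initial cases is no substitute for a general argument.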

Figure 10.7 (the possible arrangements for n = 1 through n = 6)

The last two examples for case (A) show us how to extend the results for second-order
                  recurrence relations to those of higher order.

EXAMPLE 10.18
Solve the recurrence relation

$$2a_{n+3} = a_{n+2} + 2a_{n+1} - a_n, \quad n \ge 0, \quad a_0 = 0, \quad a_1 = 1, \quad a_2 = 2.$$

Letting $a_n = cr^n$ for $c, r \ne 0$ and $n \ge 0$, we obtain the characteristic equation
$2r^3 - r^2 - 2r + 1 = 0 = (2r - 1)(r - 1)(r + 1)$. The characteristic roots are 1/2, 1, and $-1$, so the
solution is $a_n = c_1(1)^n + c_2(-1)^n + c_3(1/2)^n = c_1 + c_2(-1)^n + c_3(1/2)^n$. [The solutions
1, $(-1)^n$, and $(1/2)^n$ are called linearly independent because it is impossible to express
any one of them as a linear combination of the other two.*] From $0 = a_0$, $1 = a_1$, and $2 = a_2$,
we derive $c_1 = 5/2$, $c_2 = 1/6$, $c_3 = -8/3$. Consequently,
$a_n = (5/2) + (1/6)(-1)^n + (-8/3)(1/2)^n$, $n \ge 0$.
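Because the constants here are rational, the solution can be checked exactly. The following sketch (function names are illustrative, not from the text) generates terms from the third-order recurrence with exact fractions and compares them with the closed form.

```python
from fractions import Fraction


def by_recurrence(count):
    """Return a_0, ..., a_(count-1) from 2a_(n+3) = a_(n+2) + 2a_(n+1) - a_n."""
    terms = [Fraction(0), Fraction(1), Fraction(2)]
    while len(terms) < count:
        a_n, a_n1, a_n2 = terms[-3], terms[-2], terms[-1]
        terms.append((a_n2 + 2 * a_n1 - a_n) / 2)
    return terms[:count]


def by_formula(n):
    """Evaluate 5/2 + (1/6)(-1)^n - (8/3)(1/2)^n exactly."""
    return Fraction(5, 2) + Fraction(1, 6) * (-1) ** n + Fraction(-8, 3) * Fraction(1, 2) ** n


for n, value in enumerate(by_recurrence(8)):
    print(n, value, by_formula(n))
```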

EXAMPLE 10.19
For $n \ge 1$ we want to tile a $2 \times n$ chessboard using the two types of tiles shown in part (a)
of Fig. 10.8. Letting $a_n$ count the number of such tilings, we find that $a_1 = 1$, since we can
tile a $2 \times 1$ chessboard (of one column) in only one way: using two $1 \times 1$ square tiles.
Part (b) of the figure shows us that $a_2 = 5$. Finally, for the $2 \times 3$ chessboard there are 11
possible tilings: (i) one that uses six $1 \times 1$ square tiles; (ii) eight that use three $1 \times 1$ square
tiles and one of the larger tiles; and (iii) two that use two of the larger tiles. When $n \ge 4$ we
consider the $n$th column of the $2 \times n$ chessboard. There are three cases to examine:

1) the $n$th column is covered by two $1 \times 1$ square tiles; this case provides $a_{n-1}$ tilings;
2) the $(n - 1)$st and $n$th columns are tiled with one $1 \times 1$ square tile and one larger
   tile; this case accounts for $4a_{n-2}$ tilings; and
3) the $(n - 2)$nd, $(n - 1)$st, and $n$th columns are tiled with two of the larger tiles; this
   results in $2a_{n-3}$ tilings.

Figure 10.8 [(a) the two types of tiles; (b) the tilings of the $2 \times 2$ chessboard]

These three cases cover all possibilities and no two of the cases have anything in common,
so

$$a_n = a_{n-1} + 4a_{n-2} + 2a_{n-3}, \quad n \ge 4, \quad a_1 = 1, \quad a_2 = 5, \quad a_3 = 11.$$

The characteristic equation $x^3 - x^2 - 4x - 2 = 0$ can be written as
$(x + 1)(x^2 - 2x - 2) = 0$, so the characteristic roots are $-1$, $1 + \sqrt{3}$, and $1 - \sqrt{3}$.
Consequently, $a_n = c_1(-1)^n + c_2(1 + \sqrt{3})^n + c_3(1 - \sqrt{3})^n$, $n \ge 1$. From
$1 = a_1 = -c_1 + c_2(1 + \sqrt{3}) + c_3(1 - \sqrt{3})$,
$5 = a_2 = c_1 + c_2(1 + \sqrt{3})^2 + c_3(1 - \sqrt{3})^2$, and
$11 = a_3 = -c_1 + c_2(1 + \sqrt{3})^3 + c_3(1 - \sqrt{3})^3$,
we have $c_1 = 1$, $c_2 = 1/\sqrt{3}$, and $c_3 = -1/\sqrt{3}$. So

$$a_n = (-1)^n + (1/\sqrt{3})(1 + \sqrt{3})^n + (-1/\sqrt{3})(1 - \sqrt{3})^n, \quad n \ge 1.$$
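The counts in this example can be confirmed by exhaustive search. The sketch below rests on an assumption that the text implies but does not state in words: the "larger tile" is taken to be an L-shaped piece covering three of the four squares of a 2 x 2 block, in any of its four orientations (this is consistent with $a_2 = 5$ and with the factor 4 in case 2). The function names are illustrative.

```python
from math import sqrt


def count_tilings(n):
    """Count tilings of a 2 x n board by 1 x 1 squares and L-shaped three-square tiles."""
    cells = [(col, row) for col in range(n) for row in range(2)]
    shapes = [[cell] for cell in cells]  # every single square
    for col in range(n - 1):
        block = [(col, 0), (col, 1), (col + 1, 0), (col + 1, 1)]
        for missing in block:  # an L-tile is a 2 x 2 block minus one square
            shapes.append([cell for cell in block if cell != missing])

    def fill(covered):
        # Always cover the first uncovered cell, so each tiling is counted once.
        first = next((cell for cell in cells if cell not in covered), None)
        if first is None:
            return 1
        total = 0
        for shape in shapes:
            if first in shape and not any(cell in covered for cell in shape):
                total += fill(covered | frozenset(shape))
        return total

    return fill(frozenset())


def by_formula(n):
    """Evaluate the closed form with floating point and round to an integer."""
    r = sqrt(3)
    return round((-1) ** n + ((1 + r) ** n - (1 - r) ** n) / r)


for n in range(1, 7):
    print(n, count_tilings(n), by_formula(n))
```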

Case (B): (Complex Roots)
Before getting into the case of complex roots, we recall DeMoivre's Theorem:

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \quad n \ge 0.$$

[This is part (b) of Exercise 12 of Section 4.1.]

* Alternatively, the solutions 1, $(-1)^n$, and $(1/2)^n$ are linearly independent, because if $k_1$, $k_2$, $k_3$ are real
numbers, and $k_1(1) + k_2(-1)^n + k_3(1/2)^n = 0$ for all $n \in \mathbf{N}$, then $k_1 = k_2 = k_3 = 0$.

If $z = x + iy \in \mathbf{C}$, $z \ne 0$, we can write $z = r(\cos\theta + i\sin\theta)$, where $r = \sqrt{x^2 + y^2}$ and
$(y/x) = \tan\theta$, for $x \ne 0$. If $x = 0$, then for $y > 0$,

$$z = yi = yi\sin(\pi/2) = y(\cos(\pi/2) + i\sin(\pi/2)),$$

and for $y < 0$,

$$z = yi = |y|i\sin(3\pi/2) = |y|(\cos(3\pi/2) + i\sin(3\pi/2)).$$

In all cases, $z^n = r^n(\cos n\theta + i\sin n\theta)$, for $n \ge 0$, by DeMoivre's Theorem.

EXAMPLE 10.20
Determine $(1 + \sqrt{3}\,i)^{10}$.
    Figure 10.9 shows a geometric way to represent the complex number $1 + \sqrt{3}\,i$ as the
point $(1, \sqrt{3})$ in the $xy$-plane. Here $r = \sqrt{1^2 + (\sqrt{3})^2} = 2$, and $\theta = \pi/3$.

Figure 10.9 [the point $(1, \sqrt{3})$ plotted in the $xy$-plane]

So $1 + \sqrt{3}\,i = 2(\cos(\pi/3) + i\sin(\pi/3))$, and

$$(1 + \sqrt{3}\,i)^{10} = 2^{10}(\cos(10\pi/3) + i\sin(10\pi/3)) = 2^{10}(\cos(4\pi/3) + i\sin(4\pi/3))
= 2^{10}((-1/2) - (\sqrt{3}/2)i) = (-2^9)(1 + \sqrt{3}\,i).$$
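The same computation can be followed numerically. This sketch uses Python's standard `cmath.polar` to obtain r and θ and then applies DeMoivre's Theorem; the helper name `power_by_demoivre` is illustrative, and the comparison is approximate because the arithmetic is floating point.

```python
import cmath
import math


def power_by_demoivre(z, n):
    """Compute z**n as r**n * (cos(n*theta) + i*sin(n*theta))."""
    r, theta = cmath.polar(z)
    return r ** n * complex(math.cos(n * theta), math.sin(n * theta))


z = complex(1, math.sqrt(3))
r, theta = cmath.polar(z)
print(r, theta, math.pi / 3)       # modulus 2 and argument pi/3, up to rounding
print(power_by_demoivre(z, 10))    # DeMoivre's Theorem
print(-(2 ** 9) * z)               # the simplified answer (-2^9)(1 + sqrt(3) i)
```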

We’ll use such results in the following examples.

EXAMPLE 10.21
Solve the recurrence relation $a_n = 2(a_{n-1} - a_{n-2})$, where $n \ge 2$ and $a_0 = 1$, $a_1 = 2$.
    Letting $a_n = cr^n$, for $c, r \ne 0$, we obtain the characteristic equation $r^2 - 2r + 2 = 0$,
whose roots are $1 \pm i$. Consequently, the general solution has the form $c_1(1 + i)^n + c_2(1 - i)^n$,
where $c_1$ and $c_2$ presently denote arbitrary complex constants. [As in case (A),
there are two independent solutions: $(1 + i)^n$ and $(1 - i)^n$.]

$$1 + i = \sqrt{2}(\cos(\pi/4) + i\sin(\pi/4))$$

and

$$1 - i = \sqrt{2}(\cos(-\pi/4) + i\sin(-\pi/4)) = \sqrt{2}(\cos(\pi/4) - i\sin(\pi/4)).$$

This yields

$$\begin{aligned}
a_n &= c_1(1 + i)^n + c_2(1 - i)^n \\
    &= c_1[\sqrt{2}(\cos(\pi/4) + i\sin(\pi/4))]^n + c_2[\sqrt{2}(\cos(-\pi/4) + i\sin(-\pi/4))]^n \\
    &= c_1(\sqrt{2})^n(\cos(n\pi/4) + i\sin(n\pi/4)) + c_2(\sqrt{2})^n(\cos(-n\pi/4) + i\sin(-n\pi/4)) \\
    &= c_1(\sqrt{2})^n(\cos(n\pi/4) + i\sin(n\pi/4)) + c_2(\sqrt{2})^n(\cos(n\pi/4) - i\sin(n\pi/4)) \\
    &= (\sqrt{2})^n[k_1\cos(n\pi/4) + k_2\sin(n\pi/4)],
\end{aligned}$$

where $k_1 = c_1 + c_2$ and $k_2 = (c_1 - c_2)i$.

$$1 = a_0 = [k_1\cos 0 + k_2\sin 0] = k_1$$

$$2 = a_1 = \sqrt{2}[1 \cdot \cos(\pi/4) + k_2\sin(\pi/4)], \text{ or } 2 = 1 + k_2, \text{ and } k_2 = 1.$$

The solution for the given initial conditions is then given by

$$a_n = (\sqrt{2})^n[\cos(n\pi/4) + \sin(n\pi/4)], \quad n \ge 0.$$

[Note: This solution contains no complex numbers. A small point may bother the reader here.
How did we start with $c_1$, $c_2$ complex and end up with $k_1 = c_1 + c_2$ and $k_2 = (c_1 - c_2)i$
real? This happens if $c_1$, $c_2$ are complex conjugates.]
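As a quick check that the trigonometric form really produces the integers generated by the recurrence, one might compare the two directly. The sketch below is illustrative (function names are not from the text); the formula is evaluated in floating point, so its values are rounded before comparison.

```python
from math import cos, pi, sin, sqrt


def by_recurrence(count):
    """Return a_0, ..., a_(count-1) from a_n = 2(a_(n-1) - a_(n-2)), a_0 = 1, a_1 = 2."""
    terms = [1, 2]
    while len(terms) < count:
        terms.append(2 * (terms[-1] - terms[-2]))
    return terms[:count]


def by_formula(n):
    """Evaluate (sqrt 2)^n [cos(n*pi/4) + sin(n*pi/4)] in floating point."""
    return sqrt(2) ** n * (cos(n * pi / 4) + sin(n * pi / 4))


for n, value in enumerate(by_recurrence(10)):
    print(n, value, round(by_formula(n)))
```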

Let us now examine an application from linear algebra.

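EXAMPLE 10.22
For $b \in \mathbf{R}^+$, consider the $n \times n$ determinant† $D_n$ given by

$$\begin{vmatrix}
b & b & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
b & b & b & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & b & b & b & \cdots & 0 & 0 & 0 & 0 \\
\vdots & & & & \ddots & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & b & b & b & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & b & b & b \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & b & b
\end{vmatrix}$$

Find the value of $D_n$ as a function of $n$.
    Let $a_n$, $n \ge 1$, denote the value of the $n \times n$ determinant $D_n$. Then

$$a_1 = |b| = b \quad \text{and} \quad a_2 = \begin{vmatrix} b & b \\ b & b \end{vmatrix} = 0
\quad \left(\text{and } a_3 = \begin{vmatrix} b & b & 0 \\ b & b & b \\ 0 & b & b \end{vmatrix} = -b^3\right).$$

†The expansion of determinants is discussed in Appendix 2.

Expanding $D_n$ by its first row, we have

$$D_n = b\begin{vmatrix}
b & b & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
b & b & b & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & b & b & b & \cdots & 0 & 0 & 0 & 0 \\
\vdots & & & & \ddots & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & b & b & b & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & b & b & b \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & b & b
\end{vmatrix}
- b\begin{vmatrix}
b & b & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & b & b & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & b & b & b & \cdots & 0 & 0 & 0 & 0 \\
\vdots & & & & \ddots & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & b & b & b & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & b & b & b \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & b & b
\end{vmatrix}.$$

(The first determinant is $D_{n-1}$.)

When we expand the second determinant by its first column, we find that
$D_n = bD_{n-1} - (b)(b)D_{n-2} = bD_{n-1} - b^2D_{n-2}$. This translates into the relation
$a_n = ba_{n-1} - b^2a_{n-2}$, for $n \ge 3$, $a_1 = b$, $a_2 = 0$.
    If we let $a_n = cr^n$ for $c, r \ne 0$ and $n \ge 1$, the characteristic equation produces the roots
$b[(1/2) \pm i\sqrt{3}/2]$.
    Hence

$$\begin{aligned}
a_n &= c_1[b((1/2) + i\sqrt{3}/2)]^n + c_2[b((1/2) - i\sqrt{3}/2)]^n \\
    &= b^n[c_1(\cos(\pi/3) + i\sin(\pi/3))^n + c_2(\cos(\pi/3) - i\sin(\pi/3))^n] \\
    &= b^n[k_1\cos(n\pi/3) + k_2\sin(n\pi/3)].
\end{aligned}$$

$b = a_1 = b[k_1\cos(\pi/3) + k_2\sin(\pi/3)]$, so $1 = k_1(1/2) + k_2(\sqrt{3}/2)$, or $k_1 + \sqrt{3}k_2 = 2$.
$0 = a_2 = b^2[k_1\cos(2\pi/3) + k_2\sin(2\pi/3)]$, so $0 = (k_1)(-1/2) + k_2(\sqrt{3}/2)$, or
$k_1 = \sqrt{3}k_2$.

Hence $k_1 = 1$, $k_2 = 1/\sqrt{3}$ and the value of $D_n$ is

$$b^n[\cos(n\pi/3) + (1/\sqrt{3})\sin(n\pi/3)].$$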
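For a concrete value of b the result can be tested against determinants computed from the definition. The sketch below is illustrative only: `tridiagonal` and `det` are hypothetical helper names, the determinant is evaluated by cofactor expansion along the first row (adequate for the small sizes used here), and b = 2 is an arbitrary choice.

```python
from math import cos, pi, sin, sqrt


def tridiagonal(n, b):
    """Build the n x n matrix with b on the main diagonal and both adjacent diagonals."""
    return [[b if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def by_formula(n, b):
    """Evaluate b^n [cos(n*pi/3) + (1/sqrt 3) sin(n*pi/3)] in floating point."""
    return b ** n * (cos(n * pi / 3) + sin(n * pi / 3) / sqrt(3))


for n in range(1, 8):
    print(n, det(tridiagonal(n, 2)), round(by_formula(n, 2)))
```

Since cos(nπ/3) and sin(nπ/3) repeat with period 6 in n, the bracketed factor cycles through only six values; that is why zeros recur at n = 2, 5, 8, and so on.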

Case (C): (Repeated Real Roots)
EXAMPLE 10.23
Solve the recurrence relation $a_{n+2} = 4a_{n+1} - 4a_n$, where $n \ge 0$ and $a_0 = 1$, $a_1 = 3$.
    As in the other two cases, we let $a_n = cr^n$, where $c, r \ne 0$ and $n \ge 0$. Then the
characteristic equation is $r^2 - 4r + 4 = 0$ and the characteristic roots are both $r = 2$. (So $r = 2$ is
called "a root of multiplicity 2.") Unfortunately, we now lack two independent solutions: $2^n$
and $2^n$ are definitely multiples of each other. We need one more independent solution. Let
us try $g(n)2^n$ where $g(n)$ is not a constant. Substituting this into the given relation yields

$$g(n + 2)2^{n+2} = 4g(n + 1)2^{n+1} - 4g(n)2^n$$

or

$$g(n + 2) = 2g(n + 1) - g(n). \qquad (1)$$

One finds that $g(n) = n$ satisfies Eq. (1).* So $n2^n$ is a second independent solution. (It is
independent because it is impossible to have $n2^n = k2^n$ for all $n \ge 0$ if $k$ is a constant.)

* Actually, the general solution is $g(n) = an + b$, for arbitrary constants $a$, $b$, with $a \ne 0$. Here we chose $a = 1$
and $b = 0$ to make $g(n)$ as simple as possible.

The general solution is of the form $a_n = c_1(2^n) + c_2n(2^n)$. With $a_0 = 1$, $a_1 = 3$ we find
$a_n = 2^n + (1/2)n(2^n) = 2^n + n(2^{n-1})$, $n \ge 0$.
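Exercise 2(a) asks the reader to verify this solution; a numerical spot check is sketched below. The function names are illustrative. The code compares the recurrence with the closed form using integer arithmetic and also confirms that g(n) = n satisfies Eq. (1).

```python
def by_recurrence(count):
    """Return a_0, ..., a_(count-1) from a_(n+2) = 4a_(n+1) - 4a_n, a_0 = 1, a_1 = 3."""
    terms = [1, 3]
    while len(terms) < count:
        terms.append(4 * terms[-1] - 4 * terms[-2])
    return terms[:count]


def by_formula(n):
    """Evaluate 2^n + (1/2) n 2^n exactly; n * 2^n is even, so halving is exact."""
    return 2 ** n + (n * 2 ** n) // 2


def g(n):
    """The trial multiplier g(n) = n from the example."""
    return n


for n, value in enumerate(by_recurrence(8)):
    print(n, value, by_formula(n), g(n + 2) == 2 * g(n + 1) - g(n))
```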

In general, if $C_0a_n + C_1a_{n-1} + C_2a_{n-2} + \cdots + C_ka_{n-k} = 0$, with $C_0\ (\ne 0)$, $C_1$, $C_2$,
..., $C_k\ (\ne 0)$ real constants, and $r$ a characteristic root of multiplicity $m$, where
$2 \le m \le k$, then the part of the general solution that involves the root $r$ has the form

$$A_0r^n + A_1nr^n + A_2n^2r^n + \cdots + A_{m-1}n^{m-1}r^n
= (A_0 + A_1n + A_2n^2 + \cdots + A_{m-1}n^{m-1})r^n,$$

where $A_0, A_1, A_2, \ldots, A_{m-1}$ are arbitrary constants.

Our last example involves a little probability.

EXAMPLE 10.24
If a first case of measles is recorded in a certain school system, let $p_n$ denote the probability
that at least one case is reported during the $n$th week after the first recorded case. School
records provide evidence that $p_n = p_{n-1} - (0.25)p_{n-2}$, where $n \ge 2$. Since $p_0 = 0$ and
$p_1 = 1$, if the first case (of a new outbreak) is recorded on Monday, March 3, 2003, when
did the probability for the occurrence of a new case decrease to less than 0.01 for the first
time?
    With $p_n = cr^n$ for $c, r \ne 0$, the characteristic equation for the recurrence relation is
$r^2 - r + (1/4) = 0 = (r - (1/2))^2$. The general solution has the form $p_n = (c_1 + c_2n)(1/2)^n$,
$n \ge 0$. For $p_0 = 0$, $p_1 = 1$, we get $c_1 = 0$, $c_2 = 2$, so $p_n = n2^{-n+1}$, $n \ge 0$.
    The first integer $n$ for which $p_n < 0.01$ is 12. Hence, it was not until the week of May
19, 2003, that the probability of another new case occurring was less than 0.01.
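The search for the first such week is easy to mechanize. The following sketch uses exact fractions for $p_n = n2^{-n+1}$; the function names are illustrative, and the calendar step assumes that week 1 begins on March 3, 2003 (so week n begins n - 1 weeks later), which is the convention that reproduces the date stated in the example.

```python
from datetime import date, timedelta
from fractions import Fraction


def probability(n):
    """Return p_n = n * 2^(-n+1) as an exact fraction (written as 2n / 2^n)."""
    return Fraction(2 * n, 2 ** n)


def first_week_below(threshold):
    """Return the smallest n >= 1 with p_n < threshold."""
    n = 1
    while probability(n) >= threshold:
        n += 1
    return n


week = first_week_below(Fraction(1, 100))
# Assumption: week 1 starts on the day of the first recorded case.
week_start = date(2003, 3, 3) + timedelta(weeks=week - 1)
print(week, float(probability(week)), week_start)
```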

EXERCISES 10.2

1. Solve the following recurrence relations. (No final answer should involve complex numbers.)
      a) $a_n = 5a_{n-1} + 6a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 3$
      b) $2a_{n+2} - 11a_{n+1} + 5a_n = 0$, $n \ge 0$, $a_0 = 2$, $a_1 = -8$
      c) $a_{n+2} + a_n = 0$, $n \ge 0$, $a_0 = 0$, $a_1 = 3$
      d) $a_n - 6a_{n-1} + 9a_{n-2} = 0$, $n \ge 2$, $a_0 = 5$, $a_1 = 12$
      e) $a_n + 2a_{n-1} + 2a_{n-2} = 0$, $n \ge 2$, $a_0 = 1$, $a_1 = 3$

2. a) Verify the final solutions in Examples 10.14 and 10.23.
      b) Solve the recurrence relation in Example 10.16.

3. If $a_0 = 0$, $a_1 = 1$, $a_2 = 4$, and $a_3 = 37$ satisfy the recurrence relation
$a_{n+2} + ba_{n+1} + ca_n = 0$, where $n \ge 0$ and $b$, $c$ are constants, determine $b$, $c$ and solve for $a_n$.

4. Find and solve a recurrence relation for the number of ways to park motorcycles and compact
cars in a row of $n$ spaces if each cycle requires one space and each compact needs two. (All
cycles are identical in appearance, as are the cars, and we want to use up all the $n$ spaces.)

5. Answer the question posed in Exercise 4 if (a) the motorcycles come in two distinct models;
(b) the compact cars come in three different colors; and (c) the motorcycles come in two
distinct models and the compact cars come in three different colors.

6. Answer the questions posed in Exercise 5 if empty spaces are allowed.

7. In Exercise 12 of Section 4.2 we learned that
$F_0 + F_1 + F_2 + \cdots + F_n = \sum_{i=0}^{n} F_i = F_{n+2} - 1$. This is one of many
such properties of the Fibonacci numbers that were discovered by the French mathematician
François Lucas (1842-1891). Although we established the result by the Principle of Mathematical
Induction, we see that it is easy to develop this formula by adding the system of $n + 1$ equations

$$\begin{aligned}
F_0 &= F_2 - F_1 \\
F_1 &= F_3 - F_2 \\
    &\;\;\vdots \\
F_{n-1} &= F_{n+1} - F_n \\
F_n &= F_{n+2} - F_{n+1}.
\end{aligned}$$

Develop formulas for each of the following sums, and then check the general result by the
Principle of Mathematical Induction.
      a) $F_1 + F_3 + F_5 + \cdots + F_{2n-1}$, where $n \in \mathbf{Z}^+$
      b) $F_0 + F_2 + F_4 + \cdots + F_{2n}$, where $n \in \mathbf{Z}^+$

8. a) Prove that

$$\lim_{n \to \infty} \frac{F_{n+1}}{F_n} = \frac{1 + \sqrt{5}}{2}.$$

      (This limit has come to be known as the golden ratio and is often designated by $\alpha$, as we
      mentioned in Example 10.10.)
      b) Consider a regular pentagon ABCDE inscribed in a circle, as shown in Fig. 10.10.
            i) Use the law of sines and the double angle formula for the sine to show that
               $AC/AX = 2\cos 36°$.
           ii) As $\cos 18° = \sin 72° = 4\sin 18° \cos 18°(1 - 2\sin^2 18°)$ (Why?), show
               that $\sin 18°$ is a root of the polynomial equation $8x^3 - 4x + 1 = 0$, and deduce
               that $\sin 18° = (\sqrt{5} - 1)/4$.
      c) Verify that $AC/AX = (1 + \sqrt{5})/2$.

Figure 10.10 [regular pentagon ABCDE inscribed in a circle]

9. For $n \ge 0$, let $a_n$ count the number of ways a sequence of 1's and 2's will sum to $n$. For
example, $a_3 = 3$ because (1) 1, 1, 1; (2) 1, 2; and (3) 2, 1 sum to 3. Find and solve a
recurrence relation for $a_n$.

10. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$, where $A = \{00, 1\}$. For $n \ge 1$, let $a_n$ count the
number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$. (The reader
may wish to refer to Exercise 25 for Section 6.1.)

11. a) For $n \ge 1$, let $a_n$ count the number of binary strings of length $n$, where there are no
      consecutive 1's. Find and solve a recurrence relation for $a_n$.
      b) For $n \ge 1$, let $b_n$ count the number of binary strings of length $n$, where there are no
      consecutive 1's and the first and last bit of the string are not both 1. Find and solve a
      recurrence relation for $b_n$.

12. Suppose that poker chips come in four colors: red, white, green, and blue. Find and solve a
recurrence relation for the number of ways to stack $n$ of these poker chips so that there are
no consecutive blue chips.

13. An alphabet $\Sigma$ consists of the four numeric characters 1, 2, 3, 4, and the seven alphabetic
characters a, b, c, d, e, f, g. Find and solve a recurrence relation for the number of words of
length $n$ (in $\Sigma^*$), where there are no consecutive (identical or distinct) alphabetic characters.

14. An alphabet $\Sigma$ consists of seven numeric characters and $k$ alphabetic characters. For
$n \ge 0$, $a_n$ counts the number of strings (in $\Sigma^*$) of length $n$ that contain no consecutive
(identical or distinct) alphabetic characters. If $a_{n+2} = 7a_{n+1} + 63a_n$, $n \ge 0$, what is the
value of $k$?

15. Solve the recurrence relation $a_{n+2} = a_{n+1}a_n$, $n \ge 0$, $a_0 = 1$, $a_1 = 2$.

16. For $n \ge 1$, let $a_n$ be the number of ways to write $n$ as an ordered sum of positive integers,
where each summand is at least 2. (For example, $a_5 = 3$ because here we may represent 5 by 5,
by 2 + 3, and by 3 + 2.) Find and solve a recurrence relation for $a_n$.

17. a) For a fixed nonnegative integer $n$, how many compositions of $n + 3$ have no 1 as a
      summand?
      b) For the compositions in part (a), how many start with (i) 2; (ii) 3; (iii) $k$, where
      $2 \le k \le n + 1$?
      c) How many of the compositions in part (a) start with $n + 2$ or $n + 3$?
      d) How are the results in parts (a)-(c) related to the formula derived at the start of
      Exercise 7?

18. Determine the points of intersection of the parabola $y = x^2 - 1$ and the line $y = x$.

19. Find the points of intersection of the hyperbola $y = 1 + \frac{1}{x}$ and the line $y = x$.

20. a) For $\alpha = (1 + \sqrt{5})/2$, show that $\alpha^2 = \alpha + 1$.
      b) If $n \in \mathbf{Z}^+$, prove that $\alpha^n = \alpha F_n + F_{n-1}$.

21. Let $F_n$ denote the $n$th Fibonacci number, for $n \ge 0$, and let $\alpha = (1 + \sqrt{5})/2$. For
$n \ge 3$, prove that (a) $F_n > \alpha^{n-2}$ and (b) $F_n < \alpha^{n-1}$.

22. a) For $n \in \mathbf{Z}^+$, let $a_n$ count the number of palindromes of $2n$. Then $a_{n+1} = 2a_n$,
      $n \ge 1$, $a_1 = 2$. Solve this first-order recurrence relation for $a_n$.
      b) For $n \in \mathbf{Z}^+$, let $b_n$ count the number of palindromes of $2n - 1$. Set up and solve a
      first-order recurrence relation for $b_n$.
(You may want to compare your solutions here with those given in Examples 9.13 and 10.15.)

23. Consider ternary strings, that is, strings where 0, 1, 2 are the only symbols used. For
$n \ge 1$, let $a_n$ count the number of ternary strings of length $n$ where there are no consecutive 1's
and no consecutive 2's. Find and solve a recurrence relation for $a_n$.

24. For $n \ge 1$, let $a_n$ count the number of ways to tile a $2 \times n$ chessboard using horizontal ($1 \times 2$) dominoes [which can also be used as vertical ($2 \times 1$) dominoes] and square ($2 \times 2$) tiles. Find and solve a recurrence relation for $a_n$.

25. In how many ways can one tile a $2 \times 10$ chessboard using dominoes and square tiles (as in Exercise 24) if the dominoes come in four colors and the square tiles come in five colors?

26. Let $\Sigma = \{0, 1\}$ and $A = \{0, 01, 11\} \subseteq \Sigma^*$. For $n \ge 1$, let $a_n$ count the number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$.

27. Let $\Sigma = \{0, 1\}$ and $A = \{0, 01, 011, 111\} \subseteq \Sigma^*$. For $n \ge 1$, let $a_n$ count the number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$.

28. Let $\Sigma = \{0, 1\}$ and $A = \{0, 01, 011, 0111, 1111\} \subseteq \Sigma^*$. For $n \ge 1$, let $a_n$ count the number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$.

29. A particle moves horizontally to the right. For $n \in \mathbf{Z}^+$, the distance the particle travels in the $(n + 1)$st second is equal to twice the distance it travels during the $n$th second. If $x_n$, $n \ge 0$, denotes the position of the particle at the start of the $(n + 1)$st second, find and solve a recurrence relation for $x_n$, where $x_0 = 1$ and $x_1 = 5$.

30. For $n \ge 1$, let $D_n$ be the following $n \times n$ determinant.

$$D_n = \begin{vmatrix}
2 & 1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
1 & 2 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & 1 & 2 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & 0 & \cdots & 1 & 2 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & 1 & 2 & 1 \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 1 & 2
\end{vmatrix}$$

Find and solve a recurrence relation for the value of $D_n$.

31. Solve the recurrence relation $a_{n+2}^2 - 5a_{n+1}^2 + 4a_n^2 = 0$, where $n \ge 0$ and $a_0 = 4$, $a_1 = 13$.

32. Determine the constants $b$ and $c$ if $a_n = c_1 + c_2(7^n)$, $n \ge 0$, is the general solution of the relation $a_{n+2} + ba_{n+1} + ca_n = 0$, $n \ge 0$.

33. Prove that any two consecutive Fibonacci numbers are relatively prime.

34. Write a computer program (or develop an algorithm) to determine whether a given nonnegative integer is a Fibonacci number.

10.3 The Nonhomogeneous Recurrence Relation
We now turn to the recurrence relations

$$a_n + C_1 a_{n-1} = f(n), \qquad n \ge 1, \tag{1}$$

$$a_n + C_1 a_{n-1} + C_2 a_{n-2} = f(n), \qquad n \ge 2, \tag{2}$$

where $C_1$ and $C_2$ are constants, $C_1 \ne 0$ in Eq. (1), $C_2 \ne 0$, and $f(n)$ is not identically 0. Although there is no general method for solving all nonhomogeneous relations, for certain functions $f(n)$ we shall find a successful technique.

We start with the special case for Eq. (1), when $C_1 = -1$. For the nonhomogeneous relation $a_n - a_{n-1} = f(n)$, we have

$$\begin{aligned}
a_1 &= a_0 + f(1) \\
a_2 &= a_1 + f(2) = a_0 + f(1) + f(2) \\
a_3 &= a_2 + f(3) = a_0 + f(1) + f(2) + f(3) \\
&\ \ \vdots \\
a_n &= a_{n-1} + f(n) = a_0 + f(1) + \cdots + f(n) = a_0 + \sum_{i=1}^{n} f(i).
\end{aligned}$$

We can solve this type of relation in terms of $n$ if we can find a suitable summation formula for $\sum_{i=1}^{n} f(i)$.

EXAMPLE 10.25
Solve the recurrence relation $a_n - a_{n-1} = 3n^2$, where $n \ge 1$ and $a_0 = 7$.
   Here $f(n) = 3n^2$, so the unique solution is

$$a_n = a_0 + \sum_{i=1}^{n} f(i) = 7 + 3\sum_{i=1}^{n} i^2 = 7 + \frac{1}{2}(n)(n + 1)(2n + 1).$$
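As a quick numerical check of Example 10.25, the following Python sketch iterates the recurrence from $a_0 = 7$ and compares the result with the closed form. The function names are illustrative only and do not come from the text.

```python
def a_iterative(n: int) -> int:
    """Iterate a_k = a_(k-1) + 3*k**2, starting from a_0 = 7."""
    a = 7
    for k in range(1, n + 1):
        a += 3 * k * k
    return a


def a_closed(n: int) -> int:
    """Closed form 7 + n(n+1)(2n+1)/2; n(n+1) is even, so // is exact."""
    return 7 + n * (n + 1) * (2 * n + 1) // 2


# The two computations agree for every n tried here.
for n in range(50):
    assert a_iterative(n) == a_closed(n)
```

Agreement on a finite range is only a sanity check; the derivation above is what establishes the formula for all $n$.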

When a formula for the summation is not known, the following procedure will handle Eq. (1) for certain functions $f(n)$, regardless of the value of $C_1$ ($\ne 0$). It also works for the second-order nonhomogeneous relation in Eq. (2), again for certain functions $f(n)$. Known as the method of undetermined coefficients, it relies on the associated homogeneous relation obtained when $f(n)$ is replaced by 0.
   For either of Eq. (1) or Eq. (2), we let $a_n^{(h)}$ denote the general solution of the associated homogeneous relation, and we let $a_n^{(p)}$ be a solution of the given nonhomogeneous relation. The term $a_n^{(p)}$ is called a particular solution. Then $a_n = a_n^{(h)} + a_n^{(p)}$ is the general solution of the given relation. To determine $a_n^{(p)}$ we use the form of $f(n)$ to suggest a form for $a_n^{(p)}$.

EXAMPLE 10.26
Solve the recurrence relation $a_n - 3a_{n-1} = 5(7^n)$, where $n \ge 1$ and $a_0 = 2$.
   The solution of the associated homogeneous relation is $a_n^{(h)} = c(3^n)$. Since $f(n) = 5(7^n)$, we seek a particular solution $a_n^{(p)}$ of the form $A(7^n)$. As $a_n^{(p)}$ is to be a solution of the given nonhomogeneous relation, we place $a_n^{(p)} = A(7^n)$ into the given relation and find that $A(7^n) - 3A(7^{n-1}) = 5(7^n)$, $n \ge 1$. Dividing by $7^{n-1}$, we find that $7A - 3A = 5(7)$, so $A = 35/4$, and $a_n^{(p)} = (35/4)7^n = (5/4)7^{n+1}$, $n \ge 0$. The general solution is $a_n = c(3^n) + (5/4)7^{n+1}$. With $2 = a_0 = c + (5/4)(7)$, it follows that $c = -27/4$ and $a_n = (5/4)(7^{n+1}) - (1/4)(3^{n+3})$, $n \ge 0$.
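The solution of Example 10.26 can be checked with exact rational arithmetic. This is a minimal Python sketch; the names `a_by_recurrence` and `a_by_formula` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction


def a_by_recurrence(n: int) -> int:
    """Iterate a_k = 3*a_(k-1) + 5*7**k, starting from a_0 = 2."""
    a = 2
    for k in range(1, n + 1):
        a = 3 * a + 5 * 7**k
    return a


def a_by_formula(n: int) -> Fraction:
    """Closed form (5/4)*7**(n+1) - (1/4)*3**(n+3), kept exact."""
    return Fraction(5, 4) * 7 ** (n + 1) - Fraction(1, 4) * 3 ** (n + 3)


for n in range(30):
    assert a_by_formula(n) == a_by_recurrence(n)
```

Using `Fraction` avoids any rounding from the factors 5/4 and 1/4, so the comparison is exact.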

EXAMPLE 10.27
Solve the recurrence relation $a_n - 3a_{n-1} = 5(3^n)$, where $n \ge 1$ and $a_0 = 2$.
   As in Example 10.26, $a_n^{(h)} = c(3^n)$, but here $a_n^{(h)}$ and $f(n)$ are not linearly independent. As a result we consider a particular solution $a_n^{(p)}$ of the form $Bn(3^n)$. (What happens if we substitute $a_n^{(p)} = B(3^n)$ into the given relation?)
   Substituting $a_n^{(p)} = Bn3^n$ into the given relation yields

$$Bn(3^n) - 3B(n - 1)(3^{n-1}) = 5(3^n), \quad \text{or} \quad Bn - B(n - 1) = 5, \quad \text{so} \quad B = 5.$$

   Hence $a_n = a_n^{(h)} + a_n^{(p)} = (c + 5n)3^n$, $n \ge 0$. With $a_0 = 2$, the unique solution is $a_n = (2 + 5n)(3^n)$, $n \ge 0$.
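The parenthetical question in Example 10.27 can be settled by a one-line computation. The display below is a worked check added for illustration: the trial form $B(3^n)$ is annihilated by the left-hand side, so no choice of $B$ can produce $5(3^n)$.

```latex
\[
  B(3^n) - 3B(3^{n-1}) \;=\; B(3^n) - B(3^n) \;=\; 0 \;\neq\; 5(3^n)
  \qquad (n \ge 1).
\]
```

This is why the trial solution is multiplied by $n$ when $f(n)$ is itself a solution of the associated homogeneous relation.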

From the two preceding examples we generalize as follows.

Consider the nonhomogeneous first-order relation

$$a_n + C_1 a_{n-1} = kr^n,$$

where $k$ is a constant and $n \in \mathbf{Z}^+$. If $r^n$ is not a solution of the associated homogeneous relation

$$a_n + C_1 a_{n-1} = 0,$$

then $a_n^{(p)} = Ar^n$, where $A$ is a constant. When $r^n$ is a solution of the associated homogeneous relation, then $a_n^{(p)} = Bnr^n$, for $B$ a constant.

Now consider the case of the nonhomogeneous second-order relation

$$a_n + C_1 a_{n-1} + C_2 a_{n-2} = kr^n,$$

where $k$ is a constant. Here we find that

a) $a_n^{(p)} = Ar^n$, for $A$ a constant, if $r^n$ is not a solution of the associated homogeneous relation;
b) $a_n^{(p)} = Bnr^n$, where $B$ is a constant, if $a_n^{(h)} = c_1 r^n + c_2 r_1^n$, where $r_1 \ne r$; and
c) $a_n^{(p)} = Cn^2 r^n$, for $C$ a constant, when $a_n^{(h)} = (c_1 + c_2 n)r^n$.

EXAMPLE 10.28
The Towers of Hanoi. Consider n circular disks (having different diameters) with holes in
                             their centers. These disks can be stacked on any of the pegs shown in Fig. 10.11. In the
                             figure, n = 5 and the disks are stacked on peg 1 with no disk resting upon a smaller one.
                             The objective is to transfer the disks one at a time so that we end up with the original stack
                             on peg 3. Each of pegs 1, 2, and 3 may be used as a temporary location for any disk(s), but
                             at no time are we allowed to have a larger disk on top of a smaller one on any peg. What is
                             the minimum number of moves needed to do this for n disks?

Figure 10.11

For $n \ge 0$, let $a_n$ count the minimum number of moves it takes to transfer $n$ disks from peg 1 to peg 3 in the manner described. Then, for $n + 1$ disks we can do the following:

a) Transfer the top $n$ disks from peg 1 to peg 2 according to the directions that are given. This takes at least $a_n$ moves.
b) Transfer the largest disk from peg 1 to peg 3. This takes one move.
c) Finally, transfer the $n$ disks on peg 2 onto the largest disk, now on peg 3, once again following the specified directions. This also requires at least $a_n$ moves.

Consequently, at this point we know that $a_{n+1}$ is no more than $2a_n + 1$; that is, $a_{n+1} \le 2a_n + 1$. But could there be a method where we actually have $a_{n+1} < 2a_n + 1$? Alas, no! For at some point the largest disk (the one at the bottom of the original stack on peg 1) must be moved to peg 3. This move requires that peg 3 has no disks on it. So this largest disk may only be moved to peg 3 after the $n$ smaller disks have moved to peg 2 [where they are stacked in increasing size from the smallest (on the top) to the largest (on the bottom)]. Getting these $n$ smaller disks moved, accordingly, requires at least $a_n$ moves. The largest disk must be moved at least once to get it to peg 3. Then, to get the $n$ smaller disks on top of the largest disk (all on peg 3), according to the requirements, requires at least $a_n$ more steps. So $a_{n+1} \ge a_n + 1 + a_n = 2a_n + 1$.
   With $2a_n + 1 \le a_{n+1} \le 2a_n + 1$, we now obtain the relation $a_{n+1} = 2a_n + 1$, where $n \ge 0$ and $a_0 = 0$.
   For $a_{n+1} - 2a_n = 1$, we know that $a_n^{(h)} = c(2^n)$. Since $f(n) = 1 = (1)^n$ is not a solution of $a_{n+1} - 2a_n = 0$, we set $a_n^{(p)} = A(1)^n = A$ and find from the given relation that $A = 2A + 1$, so $A = -1$ and $a_n = c(2^n) - 1$. From $a_0 = 0 = c - 1$ it then follows that $c = 1$, so $a_n = 2^n - 1$, $n \ge 0$.
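The three-step strategy in Example 10.28 translates directly into a recursive routine. The Python sketch below (the function name `hanoi` and the peg-numbering parameters are assumptions made for illustration) records each move and confirms that the strategy uses $2^n - 1$ moves.

```python
def hanoi(n, source=1, target=3, spare=2, moves=None):
    """Return the list of (from_peg, to_peg) moves for n disks."""
    if moves is None:
        moves = []
    if n > 0:
        # Step (a): move the top n-1 disks out of the way, onto the spare peg.
        hanoi(n - 1, source, spare, target, moves)
        # Step (b): move the largest remaining disk to the target peg.
        moves.append((source, target))
        # Step (c): move the n-1 smaller disks on top of it.
        hanoi(n - 1, spare, target, source, moves)
    return moves


for n in range(11):
    assert len(hanoi(n)) == 2**n - 1
```

The code only shows that this particular strategy takes $2^n - 1$ moves; the lower-bound argument in the example is what shows that no strategy can do better.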

The next example arises from the mathematics of finance.

EXAMPLE 10.29
Pauline takes out a loan of $S$ dollars that is to be paid back in $T$ periods of time. If $r$ is the interest rate per period for the loan, what (constant) payment $P$ must she make at the end of each period?
   We let $a_n$ denote the amount still owed on the loan at the end of the $n$th period (following the $n$th payment). Then at the end of the $(n + 1)$st period, the amount Pauline still owes on her loan is $a_n$ (the amount she owed at the end of the $n$th period) $+\ ra_n$ (the interest that accrued during the $(n + 1)$st period) $-\ P$ (the payment she made at the end of the $(n + 1)$st period). This gives us the recurrence relation

$$a_{n+1} = a_n + ra_n - P, \qquad 0 \le n \le T - 1, \qquad a_0 = S, \qquad a_T = 0.$$

For this relation $a_n^{(h)} = c(1 + r)^n$, while $a_n^{(p)} = A$ since no constant is a solution of the associated homogeneous relation. With $a_n^{(p)} = A$ we find $A - (1 + r)A = -P$, so $A = P/r$. From $a_0 = S$, we obtain $a_n = (S - (P/r))(1 + r)^n + (P/r)$, $0 \le n \le T$.
   Since $0 = a_T = (S - (P/r))(1 + r)^T + (P/r)$, it follows that

$$(P/r) = ((P/r) - S)(1 + r)^T \qquad \text{and} \qquad P = (Sr)[1 - (1 + r)^{-T}]^{-1}.$$
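To see the payment formula of Example 10.29 at work, the sketch below computes $P$ and then replays the recurrence $a_{n+1} = a_n + ra_n - P$ to confirm that the balance reaches 0 after $T$ periods. The loan figures (10,000 dollars, 1% per period, 24 periods) are made-up sample values, and the function names are illustrative; the formula assumes $r > 0$.

```python
def payment(S: float, r: float, T: int) -> float:
    """Constant payment P = S*r / (1 - (1 + r)**(-T)); requires r > 0."""
    return S * r / (1 - (1 + r) ** (-T))


def final_balance(S: float, r: float, T: int, P: float) -> float:
    """Apply a_(n+1) = a_n + r*a_n - P for T periods, starting from a_0 = S."""
    a = S
    for _ in range(T):
        a = a + r * a - P
    return a


# Hypothetical loan: S = 10000, r = 0.01 per period, T = 24 periods.
P = payment(10_000.0, 0.01, 24)
assert abs(final_balance(10_000.0, 0.01, 24, P)) < 1e-6
```

Floating-point arithmetic leaves a tiny residual balance, which is why the check uses a tolerance rather than exact equality.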

We now consider a problem in the analysis of algorithms.

EXAMPLE 10.30
For $n \ge 1$, let $S$ be a set containing $2^n$ real numbers.
   The following procedure is used to determine the maximum and minimum elements of $S$. We wish to determine the number of comparisons made between pairs of elements in $S$ during the execution of this procedure.
   If $a_n$ denotes the number of needed comparisons, then $a_1 = 1$. When $n = 2$, $|S| = 2^2 = 4$, so $S = \{x_1, x_2, y_1, y_2\} = S_1 \cup S_2$ where $S_1 = \{x_1, x_2\}$, $S_2 = \{y_1, y_2\}$. Since $a_1 = 1$, it takes one comparison to determine the maximum and minimum elements in each of $S_1$, $S_2$. Comparing the minimum elements of $S_1$ and $S_2$ and then their maximum elements, we learn the maximum and minimum elements in $S$ and find that $a_2 = 4 = 2a_1 + 2$. In general, if $|S| = 2^{n+1}$, we write $S = S_1 \cup S_2$ where $|S_1| = |S_2| = 2^n$. To determine the maximum and minimum elements in each of $S_1$ and $S_2$ requires $a_n$ comparisons. Comparing the maximum (minimum) elements of $S_1$ and $S_2$ requires one more comparison; consequently, $a_{n+1} = 2a_n + 2$, $n \ge 1$.
   Here $a_n^{(h)} = c(2^n)$ and $a_n^{(p)} = A$, a constant. Substituting $a_n^{(p)}$ into the relation, we find that $A = 2A + 2$, or $A = -2$. So $a_n = c2^n - 2$, and with $a_1 = 1 = 2c - 2$, we obtain $c = 3/2$. Therefore $a_n = (3/2)(2^n) - 2$.
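The halving procedure of Example 10.30 can be written as a short recursive function that also counts comparisons. This is a sketch under the example's assumption that the input has $2^n$ elements with $n \ge 1$; the name `max_min` and the tuple it returns are choices made here for illustration.

```python
def max_min(values):
    """Return (maximum, minimum, comparisons) for a list of 2**n numbers, n >= 1."""
    if len(values) == 2:
        x, y = values
        # One comparison settles both the maximum and the minimum of a pair.
        return (x, y, 1) if x > y else (y, x, 1)
    half = len(values) // 2
    max1, min1, count1 = max_min(values[:half])
    max2, min2, count2 = max_min(values[half:])
    # One comparison between the two maxima and one between the two minima.
    largest = max1 if max1 > max2 else max2
    smallest = min1 if min1 < min2 else min2
    return largest, smallest, count1 + count2 + 2


for n in range(1, 10):
    data = [(7 * i) % 2**n for i in range(2**n)]  # a fixed permutation of 0..2**n - 1
    largest, smallest, count = max_min(data)
    assert (largest, smallest) == (2**n - 1, 0)
    assert count == 3 * 2 ** (n - 1) - 2  # (3/2)(2**n) - 2
```

The comparison count depends only on the size of the input, not on the values, which is what lets a recurrence in $n$ describe it.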

A note of caution! The existence of this procedure, which requires $(3/2)(2^n) - 2$ comparisons, does not exclude the possibility that we could achieve the same results via another remarkably clever method that requires fewer comparisons.

An example on counting certain strings of length 10, for the quaternary alphabet $\Sigma = \{0, 1, 2, 3\}$, provides a slight twist to what we've been doing so far.

EXAMPLE 10.31
For the alphabet $\Sigma = \{0, 1, 2, 3\}$, there are $4^{10} = 1{,}048{,}576$ strings of length 10 (in $\Sigma^{10}$, or $\Sigma^*$). Now we want to know how many of these more than 1 million strings contain an even number of 1's.
   Instead of being so specific about the length of the strings, we will start by letting $a_n$ count those strings among the $4^n$ strings in $\Sigma^n$ where there are an even number of 1's. To determine how the strings counted by $a_n$, for $n \ge 2$, are related to those counted by $a_{n-1}$, consider the $n$th symbol of one of these strings of length $n$ (where there is an even number of 1's). Two cases arise:

1) The $n$th symbol is 0, 2, or 3: Here the preceding $n - 1$ symbols provide one of the strings counted by $a_{n-1}$. So this case provides $3a_{n-1}$ of the strings counted by $a_n$.
2) The $n$th symbol is 1: In this case, there must be an odd number of 1's among the first $n - 1$ symbols. There are $4^{n-1}$ strings of length $n - 1$ and we want to avoid those that have an even number of 1's; there are $4^{n-1} - a_{n-1}$ such strings. Consequently, this second case gives us $4^{n-1} - a_{n-1}$ of the strings counted by $a_n$.

These two cases are exhaustive and mutually disjoint, so we may write

$$a_n = 3a_{n-1} + (4^{n-1} - a_{n-1}) = 2a_{n-1} + 4^{n-1}, \qquad n \ge 2.$$

Here $a_1 = 3$ (for the strings 0, 2, and 3). We find that $a_n^{(h)} = c(2^n)$ and $a_n^{(p)} = A(4^{n-1})$. Upon substituting $a_n^{(p)}$ into the above relation we have $A(4^{n-1}) = 2A(4^{n-2}) + 4^{n-1}$, so $4A = 2A + 4$ and $A = 2$. Hence, $a_n = c(2^n) + 2(4^{n-1})$, $n \ge 2$. From $3 = a_1 = 2c + 2$ it follows that $c = 1/2$, so $a_n = 2^{n-1} + 2(4^{n-1})$, $n \ge 1$.
   When $n = 10$, we learn that of the $4^{10} = 1{,}048{,}576$ strings in $\Sigma^{10}$, there are $2^9 + 2(4^9) = 524{,}800$ that contain an even number of 1's.
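For small $n$ the count in Example 10.31 can be confirmed by brute force. The following Python sketch enumerates all strings over $\{0, 1, 2, 3\}$ and compares the count with the closed form; the function names are illustrative only.

```python
from itertools import product


def count_even_ones(n: int) -> int:
    """Count the length-n strings over {0,1,2,3} with an even number of 1's."""
    return sum(1 for s in product("0123", repeat=n) if s.count("1") % 2 == 0)


def closed_form(n: int) -> int:
    """a_n = 2**(n-1) + 2*4**(n-1), valid for n >= 1."""
    return 2 ** (n - 1) + 2 * 4 ** (n - 1)


# Exhaustive enumeration is feasible only for small n (4**n strings).
for n in range(1, 8):
    assert count_even_ones(n) == closed_form(n)

assert closed_form(10) == 524_800
```

Enumeration grows like $4^n$, so the closed form is what makes the case $n = 10$ (and beyond) immediate.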
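   Before continuing we realize that the answer here for $a_n$ can be checked by using the exponential generating function $f(x) = \sum_{n=0}^{\infty} a_n \frac{x^n}{n!}$ (where $a_0 = 1$). From the techniques developed in Section 9.4 we have

$$\begin{aligned}
f(x) &= \left(1 + x + \frac{x^2}{2!} + \cdots\right)\left(1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots\right)\left(1 + x + \frac{x^2}{2!} + \cdots\right)\left(1 + x + \frac{x^2}{2!} + \cdots\right) \\
&= e^x \cdot \left(\frac{e^x + e^{-x}}{2}\right) \cdot e^x \cdot e^x \\
&= \left(\frac{1}{2}\right)e^{4x} + \left(\frac{1}{2}\right)e^{2x} \\
&= \left(\frac{1}{2}\right)\sum_{n=0}^{\infty} \frac{(4x)^n}{n!} + \left(\frac{1}{2}\right)\sum_{n=0}^{\infty} \frac{(2x)^n}{n!}.
\end{aligned}$$

Here $a_n$ = the coefficient of $\frac{x^n}{n!}$ in $f(x)$ = $\left(\frac{1}{2}\right)4^n + \left(\frac{1}{2}\right)2^n = 2^{n-1} + 2(4^{n-1})$, as above.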

EXAMPLE 10.32
In 1904, the Swedish mathematician Helge von Koch (1870-1924) created the intriguing curve now known as the Koch "snowflake" curve. The construction of this curve starts with an equilateral triangle, as shown in part (a) of Fig. 10.12, where the triangle has side 1, perimeter 3, and area $\sqrt{3}/4$. (Recall that an equilateral triangle of side $s$ has perimeter $3s$ and area $s^2\sqrt{3}/4$.) The triangle is then transformed into the Star of David in Fig. 10.12(b) by removing the middle one-third of each side (of the original equilateral triangle) and attaching a new equilateral triangle whose side has length 1/3. So as we go from part (a) to part (b) in the figure, each side of length 1 is transformed into 4 sides of length 1/3, and we get a 12-sided polygon of area $(\sqrt{3}/4) + (3)(\sqrt{3}/4)(1/3)^2 = \sqrt{3}/3$. Continuing the process, we transform the figure of part (b) into that of part (c) by removing the middle one-third of each of the 12 sides in the Star of David and attaching an equilateral triangle of side 1/9 ($= (1/3)^2$). Now we have [in Fig. 10.12(c)] a $4^2(3)$-sided polygon whose area is

$$(\sqrt{3}/3) + (4)(3)(\sqrt{3}/4)[(1/3)^2]^2 = 10\sqrt{3}/27.$$

Figure 10.12 [panels (a), (b), and (c)]

For $n \ge 0$, let $a_n$ denote the area of the polygon $P_n$ obtained from the original equilateral triangle after we apply $n$ transformations of the type described above [the first from $P_0$ in Fig. 10.12(a) to $P_1$ in Fig. 10.12(b) and the second from $P_1$ in Fig. 10.12(b) to $P_2$ in Fig. 10.12(c)]. As we go from $P_n$ (with $4^n(3)$ sides) to $P_{n+1}$ (with $4^{n+1}(3)$ sides), we find that

$$a_{n+1} = a_n + (4^n(3))(\sqrt{3}/4)(1/3^{n+1})^2 = a_n + (1/(4\sqrt{3}))(4/9)^n$$

because in transforming $P_n$ into $P_{n+1}$ we remove the middle one-third of each of the $4^n(3)$ sides of $P_n$ and attach an equilateral triangle of side $(1/3^{n+1})$.
   The homogeneous part of the solution for this first-order nonhomogeneous recurrence relation is $a_n^{(h)} = A(1)^n = A$. Since $(4/9)^n$ is not a solution of the associated homogeneous relation, the particular solution is given by $a_n^{(p)} = B(4/9)^n$, where $B$ is a constant. Substituting this into the recurrence relation $a_{n+1} = a_n + (1/(4\sqrt{3}))(4/9)^n$, we find that $B = (-9/5)(1/(4\sqrt{3}))$. Consequently,

$$a_n = A + (-9/5)(1/(4\sqrt{3}))(4/9)^n = A - (1/(5\sqrt{3}))(4/9)^{n-1}, \qquad n \ge 0.$$

Since $\sqrt{3}/4 = a_0 = A - (1/(5\sqrt{3}))(4/9)^{-1}$, it follows that $A = 6/(5\sqrt{3})$ and

$$a_n = (6/(5\sqrt{3})) - (1/(5\sqrt{3}))(4/9)^{n-1} = (1/(5\sqrt{3}))[6 - (4/9)^{n-1}], \qquad n \ge 0.$$
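The area formula of Example 10.32 can be compared numerically with the construction itself. The Python sketch below adds the area of the new triangles stage by stage and checks the total against the closed form; the function names are illustrative, and floating-point comparison with a tolerance stands in for exact arithmetic with $\sqrt{3}$.

```python
import math


def area_by_construction(n: int) -> float:
    """Start from area sqrt(3)/4 and add the triangles attached at each stage."""
    area = math.sqrt(3) / 4
    for k in range(n):
        sides = 3 * 4**k                 # number of sides of P_k
        new_side = 1 / 3 ** (k + 1)      # side length of each attached triangle
        area += sides * (math.sqrt(3) / 4) * new_side**2
    return area


def area_closed(n: int) -> float:
    """Closed form (1/(5*sqrt(3))) * (6 - (4/9)**(n-1))."""
    return (6 - (4 / 9) ** (n - 1)) / (5 * math.sqrt(3))


for n in range(20):
    assert math.isclose(area_by_construction(n), area_closed(n), rel_tol=1e-9)
```

For large $n$ both values settle near $6/(5\sqrt{3})$, the limiting area of the snowflake.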

[As $n$ grows larger, we find that $(4/9)^{n-1}$ tends to 0 and $a_n$ approaches $6/(5\sqrt{3})$. We can also obtain this value by continuing the calculations we had before we introduced our recurrence relation, thus noting that this limiting area is also given by

$$\begin{aligned}
&(\sqrt{3}/4) + (\sqrt{3}/4)(3)(1/3)^2 + (\sqrt{3}/4)(4)(3)(1/3^2)^2 + (\sqrt{3}/4)(4^2)(3)(1/3^3)^2 + \cdots \\
&\quad = (\sqrt{3}/4) + (\sqrt{3}/4)(3)\sum_{n=0}^{\infty} 4^n(1/3^{n+1})^2 = (\sqrt{3}/4) + (1/(4\sqrt{3}))\sum_{n=0}^{\infty} (4/9)^n \\
&\quad = (\sqrt{3}/4) + (1/(4\sqrt{3}))[1/(1 - (4/9))] = (\sqrt{3}/4) + (1/(4\sqrt{3}))(9/5) = 6/(5\sqrt{3})
\end{aligned}$$

by using the result for the sum of a geometric series from part (b) of Example 9.5.]

EXAMPLE 10.33
For $n \ge 1$, let $X_n = \{1, 2, 3, \ldots, n\}$; $\mathcal{P}(X_n)$ denotes the power set of $X_n$. We want to determine $a_n$, the number of edges in the Hasse diagram for the partial order $(\mathcal{P}(X_n), \subseteq)$. Here $a_1 = 1$ and $a_2 = 4$, and from Fig. 10.13 it follows that

$$a_3 = 2a_2 + 2^2.$$

Figure 10.13

This is because the Hasse diagram for $(\mathcal{P}(X_3), \subseteq)$ contains the $a_2$ edges in the Hasse diagram for $(\mathcal{P}(X_2), \subseteq)$ as well as the $a_2$ edges in the Hasse diagram for the partial order $(\{\{3\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}, \subseteq)$. [Note the identical structure shared by the partial orders $(\mathcal{P}(\{1, 2\}), \subseteq)$ and $(\{\{3\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}, \subseteq)$.] In addition, there are $2^2$ other (dashed) edges, one for each subset of $\{1, 2\}$. Now for $n \ge 1$, consider the Hasse diagrams for the partial orders $(\mathcal{P}(X_n), \subseteq)$ and $(\{T \cup \{n + 1\} \mid T \in \mathcal{P}(X_n)\}, \subseteq)$. For each $S \in \mathcal{P}(X_n)$, draw an edge from $S$ in $(\mathcal{P}(X_n), \subseteq)$ to $S \cup \{n + 1\}$ in $(\{T \cup \{n + 1\} \mid T \in \mathcal{P}(X_n)\}, \subseteq)$. The result is the Hasse diagram for $(\mathcal{P}(X_{n+1}), \subseteq)$. From the construction we see that

$$a_{n+1} = 2a_n + 2^n, \qquad n \ge 1, \qquad a_1 = 1.$$

The solution to this recurrence relation, with the given condition $a_1 = 1$, is $a_n = n2^{n-1}$, $n \ge 1$.
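The edge count of Example 10.33 can be verified directly for small $n$. In the Hasse diagram of the power set, each edge joins a subset $S$ to $S \cup \{x\}$ for some $x \notin S$, so the Python sketch below counts such pairs using bitmasks. The function name and the bitmask encoding are choices made here for illustration.

```python
def hasse_edge_count(n: int) -> int:
    """Count pairs (S, S ∪ {x}) with x not in S, over all subsets S of {1..n}."""
    edges = 0
    for subset in range(2**n):           # each integer encodes one subset as a bitmask
        for element in range(n):
            if not subset & (1 << element):
                edges += 1               # adding this element gives a covering subset
    return edges


for n in range(1, 12):
    assert hasse_edge_count(n) == n * 2 ** (n - 1)
    # The recurrence a_(n+1) = 2*a_n + 2**n holds as well.
    assert hasse_edge_count(n + 1) == 2 * hasse_edge_count(n) + 2**n
```

Both the closed form $n2^{n-1}$ and the recurrence are checked, since the example derives the former from the latter.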

Each of our next two examples deals with a second-order relation.

EXAMPLE 10.34
Solve the recurrence relation

$$a_{n+2} - 4a_{n+1} + 3a_n = -200, \qquad n \ge 0, \qquad a_0 = 3000, \qquad a_1 = 3300.$$

Here $a_n^{(h)} = c_1(3^n) + c_2(1^n) = c_1(3^n) + c_2$. Since $f(n) = -200 = -200(1^n)$ is a solution of the associated homogeneous relation, here $a_n^{(p)} = An$ for some constant $A$. This leads us to

$$A(n + 2) - 4A(n + 1) + 3An = -200, \qquad \text{so} \qquad -2A = -200, \qquad A = 100.$$

Hence $a_n = c_1(3^n) + c_2 + 100n$. With $a_0 = 3000$ and $a_1 = 3300$, we have $a_n = 100(3^n) + 2900 + 100n$, $n \ge 0$.
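The solution of Example 10.34 can be checked by iterating the second-order relation from its two initial values. This is a small Python sketch with illustrative function names.

```python
def a_second_order(n: int) -> int:
    """Iterate a_(k+2) = 4*a_(k+1) - 3*a_k - 200 with a_0 = 3000, a_1 = 3300."""
    previous, current = 3000, 3300
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, 4 * current - 3 * previous - 200
    return current


def a_solution(n: int) -> int:
    """Closed form 100*3**n + 2900 + 100*n."""
    return 100 * 3**n + 2900 + 100 * n


for n in range(30):
    assert a_second_order(n) == a_solution(n)
```

Because the relation is second order, the iteration carries two consecutive terms forward at each step.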

Before proceeding any further, a point needs to be made about the role of technology in
                      solving recurrence relations. When a computer algebra system is available, we are spared
                      much of the drudgery of computation. Consequently, all our effort can be directed to analyzing
                      the situation at hand and setting up the recurrence relation with its initial condition(s).
                      Once this is done our job is just about finished. A line or two of code will often do the trick!
                      For example, the Maple code in Fig. 10.14 shows how one can readily solve the recurrence
                      relations of Examples 10.33 and 10.34.

      > rsolve({a(n+1) = 2*a(n) + 2^n, a(1) = 1}, a(n));
                        [unsimplified output not legible in the source]
      > simplify(%);
                        2^(n-1) n
      > rsolve({a(n+2) - 4*a(n+1) + 3*a(n) = -200, a(0) = 3000, a(1) = 3300}, a(n));
                        100 3^n + 2900 + 100 n

Figure 10.14

EXAMPLE 10.35
In part (a) of Fig. 10.15 we have an iterative algorithm (written as a pseudocode procedure) for computing the $n$th Fibonacci number, for $n \ge 0$. Here the input is a nonnegative integer $n$ and the output is the Fibonacci number $F_n$. The variables i, fib, last, next_to_last, and temp are integer variables. In this algorithm we calculate $F_n$ (in this case for $n \ge 0$) by first assigning or computing all of the previous values $F_0, F_1, F_2, \ldots, F_{n-1}$. Here the number of additions needed to determine $F_n$ is 0 for $n = 0, 1$ and $n - 1$ (within the for loop) for $n \ge 2$.
   Part (b) of Fig. 10.15 provides a pseudocode procedure to implement a recursive algorithm for calculating $F_n$ for $n \in \mathbf{N}$. Here the variable fib is likewise an integer variable. For this procedure we wish to determine $a_n$, the number of additions performed in computing $F_n$, $n \ge 0$. We find that $a_0 = 0$, $a_1 = 0$, and from the shaded line in the procedure, namely,

      fib := FibNum2(n - 1) + FibNum2(n - 2)                  (*)

we obtain the nonhomogeneous recurrence relation

$$a_n = a_{n-1} + a_{n-2} + 1, \qquad n \ge 2,$$

where the summand of 1 is due to the addition in Eq. (*).

      procedure FibNum1(n: nonnegative integer)
      begin
         if n = 0 then
            fib := 0
         else if n = 1 then
            fib := 1
         else
            begin
               last := 1
               next_to_last := 0
               for i := 2 to n do
                  begin
                     temp := last
                     last := last + next_to_last
                     next_to_last := temp
                  end
               fib := last
            end
      end                                                     (a)

      procedure FibNum2(n: nonnegative integer)
      begin
         if n = 0 then
            fib := 0
         else if n = 1 then
            fib := 1
         else
            fib := FibNum2(n - 1) + FibNum2(n - 2)
      end                                                     (b)

Figure 10.15

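Here we find that $a_n^{(h)} = c_1\left(\frac{1+\sqrt{5}}{2}\right)^n + c_2\left(\frac{1-\sqrt{5}}{2}\right)^n$ and that $a_n^{(p)} = A$, a constant. Upon substituting $a_n^{(p)}$ into the nonhomogeneous recurrence relation we find that

\[ A = A + A + 1, \]

so $A = -1$ and $a_n = c_1\left(\frac{1+\sqrt{5}}{2}\right)^n + c_2\left(\frac{1-\sqrt{5}}{2}\right)^n - 1$. Since $a_0 = 0$ and $a_1 = 0$ it follows that

\[ c_1 + c_2 = 1 \quad\text{and}\quad c_1\left(\frac{1+\sqrt{5}}{2}\right) + c_2\left(\frac{1-\sqrt{5}}{2}\right) = 1. \]

From these equations we learn that $c_1 = (1+\sqrt{5})/(2\sqrt{5})$, $c_2 = (\sqrt{5}-1)/(2\sqrt{5})$. Therefore,

\[ a_n = \left(\frac{1+\sqrt{5}}{2\sqrt{5}}\right)\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2\sqrt{5}}\right)\left(\frac{1-\sqrt{5}}{2}\right)^n - 1 = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n+1} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n+1} - 1. \]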

As $n$ gets larger, $[(1-\sqrt{5})/2]^{n+1}$ approaches 0 since $|(1-\sqrt{5})/2| < 1$, and $a_n \approx (1/\sqrt{5})[(1+\sqrt{5})/2]^{n+1} = ((1+\sqrt{5})/(2\sqrt{5}))((1+\sqrt{5})/2)^n$.
    Consequently, we can see that, as the value of $n$ increases, the first procedure requires far less computation than the second one does.
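The comparison can be checked by counting additions directly. The following Python sketch is an illustration only: the function names are hypothetical, and it assumes that procedure (a) returns 0 and 1 for n = 0 and n = 1 without performing any addition, as procedure (b) does.

```python
def fib_iterative(n):
    """Return (F_n, additions), following the loop of procedure (a)."""
    if n == 0:
        return 0, 0
    if n == 1:
        return 1, 0
    last, next_to_last, additions = 1, 0, 0
    for _ in range(2, n + 1):
        # one addition per pass through the loop
        last, next_to_last = last + next_to_last, last
        additions += 1
    return last, additions


def fib_recursive(n):
    """Return (F_n, additions), following the recursion of procedure (b)."""
    if n == 0:
        return 0, 0
    if n == 1:
        return 1, 0
    f1, c1 = fib_recursive(n - 1)
    f2, c2 = fib_recursive(n - 2)
    # additions satisfy a_n = a_{n-1} + a_{n-2} + 1
    return f1 + f2, c1 + c2 + 1


for n in (5, 10, 20):
    value, loop_adds = fib_iterative(n)
    _, rec_adds = fib_recursive(n)
    print(n, value, loop_adds, rec_adds)
```

The loop uses n - 1 additions for n ≥ 2, while the recursive count grows like a constant multiple of ((1 + √5)/2)^n.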

We now summarize and extend the solution techniques already discussed in Examples 10.26 through 10.35.
   Given a linear nonhomogeneous recurrence relation (with constant coefficients) of the form $C_0 a_n + C_1 a_{n-1} + C_2 a_{n-2} + \cdots + C_k a_{n-k} = f(n)$, where $C_0 \ne 0$ and $C_k \ne 0$, let $a_n^{(h)}$ denote the homogeneous part of the solution $a_n$.

1) If $f(n)$ is a constant multiple of one of the forms in the first column of Table 10.2
          and is not a solution of the associated homogeneous relation, then $a_n^{(p)}$ has the form
          shown in the second column of Table 10.2. (Here $A, B, A_0, A_1, A_2, \ldots, A_{t-1}, A_t$
          are constants determined by substituting $a_n^{(p)}$ into the given relation; $t$, $r$, and $\theta$ are
          also constants.)

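Table 10.2

                     $f(n)$                        $a_n^{(p)}$

                     $c$, a constant               $A$, a constant
                     $n$                           $A_1 n + A_0$
                     $n^2$                         $A_2 n^2 + A_1 n + A_0$
                     $n^t$, $t \in Z^+$            $A_t n^t + A_{t-1} n^{t-1} + \cdots + A_1 n + A_0$
                     $r^n$, $r \in R$              $A r^n$
                     $\sin \theta n$               $A \sin \theta n + B \cos \theta n$
                     $\cos \theta n$               $A \sin \theta n + B \cos \theta n$
                     $n^t r^n$                     $r^n (A_t n^t + A_{t-1} n^{t-1} + \cdots + A_1 n + A_0)$
                     $r^n \sin \theta n$           $A r^n \sin \theta n + B r^n \cos \theta n$
                     $r^n \cos \theta n$           $A r^n \sin \theta n + B r^n \cos \theta n$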

2) When $f(n)$ comprises a sum of constant multiples of terms such as those in the
          first column of the table for item (1), and none of these terms is a solution of the
          associated homogeneous relation, then $a_n^{(p)}$ is made up of the sum of the corresponding
          terms in the column headed by $a_n^{(p)}$. For example, if $f(n) = n^2 + 3 \sin 2n$ and no
          summand of $f(n)$ is a solution of the associated homogeneous relation, then $a_n^{(p)} =
          (A_2 n^2 + A_1 n + A_0) + (A \sin 2n + B \cos 2n)$.
     3) Things get trickier if a summand $f_1(n)$ of $f(n)$ is a solution of the associated homo-
        geneous relation. This happens, for example, when $f(n)$ contains summands such as
        $c r^n$ or $(c_1 + c_2 n) r^n$ and $r$ is a characteristic root. If $f_1(n)$ causes this problem, we
          multiply the trial solution $(a_n^{(p)})_1$ corresponding to $f_1(n)$ by the smallest power of $n$,
          say $n^s$, for which no summand of $n^s f_1(n)$ is a solution of the associated homogeneous
          relation. Then $n^s (a_n^{(p)})_1$ is the corresponding part of $a_n^{(p)}$.
   In order to check some of our preceding remarks on particular solutions for nonhomo-
geneous recurrence relations, the next application provides us with a situation that can be
solved in more than one way.

EXAMPLE 10.36
                             For $n \ge 2$, suppose that there are $n$ people at a party and that each of these people shakes
                             hands (exactly one time) with all of the other people there (and no one shakes hands with
                             himself or herself). If $a_n$ counts the total number of handshakes, then

\[ a_{n+1} = a_n + n, \quad n \ge 2, \quad a_2 = 1, \tag{3} \]

                             because when the $(n+1)$st person arrives, he or she will shake hands with the $n$ other
                             people who have already arrived.
                                According to the results in Table 10.2, we might think that the trial (particular) solution
                             for Eq. (3) is $A_1 n + A_0$, for constants $A_0$ and $A_1$. But here the associated homogeneous
                             relation is $a_{n+1} = a_n$, or $a_{n+1} - a_n = 0$, for which $a_n^{(h)} = c(1^n) = c$, where $c$ denotes an
                             arbitrary constant. Therefore, the summand $A_0$ (in $A_1 n + A_0$) is a solution of the associated
                             homogeneous relation. Consequently, the third remark (given with Table 10.2) tells us that
                             we must multiply $A_1 n + A_0$ by the smallest power of $n$ for which we no longer have any
                             constant summand. This is accomplished by multiplying $A_1 n + A_0$ by $n^1$, and so we find
                             here that

\[ a_n^{(p)} = A_1 n^2 + A_0 n. \]

When we substitute this result into Eq. (3) we have

\[ A_1 (n+1)^2 + A_0 (n+1) = A_1 n^2 + A_0 n + n, \]
                             or $A_1 n^2 + (2A_1 + A_0) n + (A_1 + A_0) = A_1 n^2 + (A_0 + 1) n$.
                                By comparing the coefficients on like powers of $n$ we find that

                                ($n^2$):  $A_1 = A_1$;
                                ($n$):    $2A_1 + A_0 = A_0 + 1$; and
                                ($n^0$):  $A_1 + A_0 = 0$.

                             Consequently, $A_1 = 1/2$ and $A_0 = -1/2$, so $a_n^{(p)} = (1/2)n^2 + (-1/2)n$ and $a_n = a_n^{(h)} +
                             a_n^{(p)} = c + (1/2)(n)(n-1)$. Since $a_2 = 1$, it follows from $1 = a_2 = c + (1/2)(2)(1)$ that
                             $c = 0$, and $a_n = (1/2)(n)(n-1)$, for $n \ge 2$.
                                We can also obtain this result by considering the $n$ people in the room and realizing that
                             each possible handshake corresponds with a selection of size 2 from this set of size $n$, and
                             there are $\binom{n}{2} = (n!)/(2!(n-2)!) = (1/2)(n)(n-1)$ such selections. [Or we can consider
                             the $n$ people as vertices of an undirected graph (with no loops) where an edge corresponds
                             with a handshake. Our answer is then the number of edges in the complete graph $K_n$, and
                             there are $\binom{n}{2} = (1/2)(n)(n-1)$ such edges.]
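As a quick numerical check of this example, the sketch below iterates Eq. (3) starting from a_2 = 1 and compares the result with both (1/2)(n)(n - 1) and the binomial coefficient. The function name `handshakes` is a hypothetical helper introduced only for illustration.

```python
from math import comb


def handshakes(n):
    """Iterate a_{k+1} = a_k + k from a_2 = 1 up to a_n (n >= 2)."""
    a = 1  # a_2 = 1
    for k in range(2, n):
        a += k  # the (k+1)st arrival shakes hands with k people
    return a


for n in range(2, 12):
    # recurrence, closed form, and selection count should all agree
    assert handshakes(n) == n * (n - 1) // 2 == comb(n, 2)
    print(n, handshakes(n))
```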

Our last example further demonstrates how we may use the results in Table 10.2.

EXAMPLE 10.37
                               a) Consider the nonhomogeneous recurrence relation

\[ a_{n+2} - 10a_{n+1} + 21a_n = f(n), \quad n \ge 0. \]
                                  Here the homogeneous part of the solution is

\[ a_n^{(h)} = c_1(3^n) + c_2(7^n), \]
                                  for arbitrary constants $c_1$, $c_2$.
                                        In Table 10.3 we list the form for the particular solution for certain choices of $f(n)$.
                                  Here the values of the 11 constants $A_i$, for $0 \le i \le 10$, are determined by substituting
                                  $a_n^{(p)}$ into the given nonhomogeneous recurrence relation.

Table 10.3

                                  $f(n)$                           $a_n^{(p)}$

                                  $5$                              $A_0$
                                  $3n^2 - 2$                       $A_3 n^2 + A_2 n + A_1$
                                  $7(11^n)$                        $A_4(11^n)$
                                  $31(r^n)$, $r \ne 3, 7$          $A_5(r^n)$
                                  $6(3^n)$                         $A_6 n 3^n$
                                  $2(3^n) - 8(9^n)$                $A_7 n 3^n + A_8(9^n)$
                                  $4(3^n) + 3(7^n)$                $A_9 n 3^n + A_{10} n 7^n$

                               b) The homogeneous component of the solution for

\[ a_n + 4a_{n-1} + 4a_{n-2} = f(n), \quad n \ge 2 \]

                                  is

\[ a_n^{(h)} = c_1(-2)^n + c_2 n(-2)^n, \]
                                  where $c_1$, $c_2$ denote arbitrary constants. Consequently,

                                  1) if $f(n) = 5(-2)^n$, then $a_n^{(p)} = A n^2 (-2)^n$;
                                  2) if $f(n) = 7n(-2)^n$, then $a_n^{(p)} = n^2 (-2)^n (A_1 n + A_0)$; and
                                  3) if $f(n) = -11n^2(-2)^n$, then $a_n^{(p)} = n^2 (-2)^n (B_2 n^2 + B_1 n + B_0)$.
                                     (Here, the constants $A$, $A_0$, $A_1$, $B_0$, $B_1$, and $B_2$ are determined by substituting $a_n^{(p)}$
                                     into the given nonhomogeneous recurrence relation.)
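To see why the extra factor of n is needed when f(n) involves a characteristic root, the sketch below works with the relation of part (a) and f(n) = 6(3^n). The helper `lhs` is a hypothetical name. The value -1/2 for the constant is worked out here by substitution and is not quoted from the text.

```python
from fractions import Fraction


def lhs(a, n):
    """Left-hand side a_{n+2} - 10 a_{n+1} + 21 a_n for a sequence a(n)."""
    return a(n + 2) - 10 * a(n + 1) + 21 * a(n)


def naive(n):
    # Trial A * 3^n (with A = 1): 3 is a characteristic root,
    # so this is annihilated by the left-hand side.
    return Fraction(3) ** n


def corrected(n):
    # Trial A * n * 3^n with A = -1/2 (computed for this sketch).
    return Fraction(-1, 2) * n * 3 ** n


for n in range(6):
    assert lhs(naive, n) == 0               # can never equal 6 * 3^n
    assert lhs(corrected, n) == 6 * 3 ** n  # matches f(n) = 6(3^n)
```

Substituting A n 3^n gives 3^n A[9(n + 2) - 30(n + 1) + 21n] = -12A(3^n), which is where the value A = -1/2 comes from.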

EXERCISES 10.3

1. Solve each of the following recurrence relations.
    a) $a_{n+1} - a_n = 2n + 3$, $n \ge 0$, $a_0 = 1$
    b) $a_{n+1} - a_n = 3n^2 - n$, $n \ge 0$, $a_0 = 3$
    c) $a_{n+1} - 2a_n = 5$, $n \ge 0$, $a_0 = 1$
    d) $a_{n+1} - 2a_n = 2^n$, $n \ge 0$, $a_0 = 1$

2. Use a recurrence relation to derive the formula for $\sum_{i=0}^{n} i^2$.

3. a) Let $n$ lines be drawn in the plane such that each line intersects every other line but no three lines are ever coincident. For $n \ge 0$, let $a_n$ count the number of regions into which the plane is separated by the $n$ lines. Find and solve a recurrence relation for $a_n$.
    b) For the situation in part (a), let $b_n$ count the number of infinite regions that result. Find and solve a recurrence relation for $b_n$.

4. On the first day of a new year, Joseph deposits $1000 in an account that pays 6% interest compounded monthly. At the beginning of each month he adds $200 to his account. If he continues to do this for the next four years (so that he makes 47 additional deposits of $200), how much will his account be worth exactly four years after he opened it?

5. Solve the following recurrence relations.
    a) $a_{n+2} + 3a_{n+1} + 2a_n = 3^n$, $n \ge 0$, $a_0 = 0$, $a_1 = 1$
    b) $a_{n+2} + 4a_{n+1} + 4a_n = 7$, $n \ge 0$, $a_0 = 1$, $a_1 = 2$

6. Solve the recurrence relation $a_{n+2} - 6a_{n+1} + 9a_n = 3(2^n) + 7(3^n)$, where $n \ge 0$ and $a_0 = 1$, $a_1 = 4$.

7. Find the general solution for the recurrence relation $a_{n+3} - 3a_{n+2} + 3a_{n+1} - a_n = 3 + 5n$, $n \ge 0$.

8. Determine the number of n-digit quaternary (0, 1, 2, 3) sequences in which there is never a 3 anywhere to the right of a 0.

9. Meredith borrows $2500, at 12% compounded monthly, to buy a computer. If the loan is to be paid back over two years, what is his monthly payment?

10. The general solution of the recurrence relation $a_{n+2} + b_1 a_{n+1} + b_2 a_n = b_3 n + b_4$, $n \ge 0$, with $b_i$ constant for $1 \le i \le 4$, is $c_1 2^n + c_2 3^n + n - 7$. Find $b_i$ for each $1 \le i \le 4$.

11. Solve the following recurrence relations.
    a) $a_{n+2}^2 - 5a_{n+1}^2 + 6a_n^2 = 7n$, $n \ge 0$, $a_0 = a_1 = 1$
    b) $a_n^2 - 2a_{n-1} = 0$, $n \ge 1$, $a_0 = 2$ (Let $b_n = \log_2 a_n$, $n \ge 0$.)

12. Let $\Sigma = \{0, 1, 2, 3\}$. For $n \ge 1$, let $a_n$ count the number of strings in $\Sigma^n$ containing an odd number of 1's. Find and solve a recurrence relation for $a_n$.

Figure 10.16 (panels labeled (n = 1) and (n = 2))

13. a) For the binary string 001110, there are three runs: 00, 111, and 0. Meanwhile, the string 000111 has only two runs: 000 and 111; while the string 010101 determines the six runs: 0, 1, 0, 1, 0, 1. For $n = 1$, we consider two binary strings, namely, 0 and 1; these two strings (of length 1) determine a total of two runs. There are four binary strings of length $n = 2$ and these strings determine 1 (for 00) + 2 (for 01) + 2 (for 10) + 1 (for 11) = 6 runs. Find and solve a recurrence relation for $t_n$, the total number of runs determined by the $2^n$ binary strings of length $n$, where $n \ge 1$.
      b) Answer the question posed in part (a) for quaternary strings of length $n$. (Here the alphabet comprises 0, 1, 2, 3.)
      c) Generalize the results of parts (a) and (b).

14. a) For $n \ge 1$, the $n$th triangular number $t_n$ is defined by $t_n = 1 + 2 + \cdots + n = n(n+1)/2$. Find and solve a recurrence relation for $s_n$, $n \ge 1$, where $s_n = t_1 + t_2 + \cdots + t_n$, the sum of the first $n$ triangular numbers. [The reader may wish to compare the result obtained here with the formula given in Example 4.5 or with the result requested in part (b) of Exercise 8 of Section 9.5.]
      b) In an organic laboratory, Kelsey synthesizes a crystalline structure that is made up of 10,000,000 triangular layers of atoms. The first layer of the structure has one atom, the second layer has three atoms, and, in general, the $n$th layer has $1 + 2 + \cdots + n = t_n$ atoms. (Consider each layer, other than the last, as if it were placed upon the spaces that result among the neighboring atoms of the succeeding layer. See Fig. 10.16.) (i) How many atoms are there in one of these crystalline structures? (ii) How many atoms are packed (strictly) between the 10,000th and 100,000th layer?

15. Write a computer program (or develop an algorithm) to solve the problem of the Towers of Hanoi. For $n \in Z^+$, the program should provide the necessary steps for transferring the $n$ disks from peg 1 to peg 3 under the restrictions specified in Example 10.28.

10.4 The Method of Generating Functions
                                 With all the different cases we had to consider for the nonhomogeneous linear recurrence
                                 relation, we now get some assistance from the generating function. This technique will find
                                 both the homogeneous and the particular solutions for a,, and it will incorporate the given
                                 initial conditions as well. Furthermore, we’ll be able to do even more with this method.
                                     We demonstrate the method in the following examples.

EXAMPLE 10.38
                                     Solve the relation $a_n - 3a_{n-1} = n$, $n \ge 1$, $a_0 = 1$.
                                     This relation represents an infinite set of equations:

                                     ($n = 1$)   $a_1 - 3a_0 = 1$
                                     ($n = 2$)   $a_2 - 3a_1 = 2$
                                     ...

Multiplying the first of these equations by $x$, the second by $x^2$, and so on, we obtain

                                     ($n = 1$)   $a_1 x^1 - 3a_0 x^1 = 1x^1$
                                     ($n = 2$)   $a_2 x^2 - 3a_1 x^2 = 2x^2$
                                     ...

Adding this second set of equations, we find that

\[ \sum_{n=1}^{\infty} a_n x^n - 3 \sum_{n=1}^{\infty} a_{n-1} x^n = \sum_{n=1}^{\infty} n x^n. \tag{1} \]

We want to solve for $a_n$ in terms of $n$. To accomplish this, let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be
the (ordinary) generating function for the sequence $a_0, a_1, a_2, \ldots$. Then Eq. (1) can be rewritten as

\[ (f(x) - a_0) - 3x \sum_{n=1}^{\infty} a_{n-1} x^{n-1} = \sum_{n=1}^{\infty} n x^n \left(= \sum_{n=0}^{\infty} n x^n\right). \tag{2} \]

Since $\sum_{n=1}^{\infty} a_{n-1} x^{n-1} = \sum_{n=0}^{\infty} a_n x^n = f(x)$ and $a_0 = 1$, the left-hand side of Eq. (2)
becomes $(f(x) - 1) - 3x f(x)$.
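   Before we can proceed, we need the generating function for the sequence 0, 1, 2, 3, ....
Recall from part (c) of Example 9.5 that

\[ \frac{x}{(1-x)^2} = x + 2x^2 + 3x^3 + \cdots, \]

so

\[ (f(x) - 1) - 3x f(x) = \frac{x}{(1-x)^2} \quad\text{and}\quad f(x) = \frac{1}{(1-3x)} + \frac{x}{(1-x)^2 (1-3x)}. \]

Using a partial fraction decomposition, we find that

\[ \frac{x}{(1-x)^2 (1-3x)} = \frac{A}{(1-x)} + \frac{B}{(1-x)^2} + \frac{C}{(1-3x)}, \]
or
\[ x = A(1-x)(1-3x) + B(1-3x) + C(1-x)^2. \]
     From the following assignments for $x$, we get

                    ($x = 1$):     $1 = B(-2)$,               $B = -\frac{1}{2}$,
                    ($x = \frac{1}{3}$):   $\frac{1}{3} = C\left(\frac{2}{3}\right)^2$,     $C = \frac{3}{4}$,
                    ($x = 0$):     $0 = A + B + C$,           $A = -(B + C) = -\frac{1}{4}$.

Therefore,

\[ f(x) = \frac{1}{(1-3x)} + \frac{(-1/4)}{(1-x)} + \frac{(-1/2)}{(1-x)^2} + \frac{(3/4)}{(1-3x)} = \frac{(7/4)}{(1-3x)} + \frac{(-1/4)}{(1-x)} + \frac{(-1/2)}{(1-x)^2}. \]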
     We find $a_n$ by determining the coefficient of $x^n$ in each of the three summands.

                               a) $(7/4)/(1-3x) = (7/4)[1/(1-3x)] = (7/4)[1 + (3x) + (3x)^2 + (3x)^3 + \cdots]$, and the coefficient of
                                   $x^n$ is $(7/4)3^n$.
                               b) $(-1/4)/(1-x) = (-1/4)[1 + x + x^2 + \cdots]$, and the coefficient of $x^n$ here is
                                   $(-1/4)$.
                               c) $(-1/2)/(1-x)^2 = (-1/2)(1-x)^{-2} = (-1/2)\left[\binom{-2}{0} + \binom{-2}{1}(-x) + \binom{-2}{2}(-x)^2 + \binom{-2}{3}(-x)^3 + \cdots\right]$
                                   and the coefficient of $x^n$ is given by $(-1/2)\binom{-2}{n}(-1)^n = (-1/2)(-1)^n \binom{2+n-1}{n}(-1)^n = (-1/2)(n+1)$.
                                Therefore $a_n = (7/4)3^n - (1/2)n - (3/4)$, $n \ge 0$. (Note that there is no special concern
                             here with $a_n^{(p)}$. Also, the same answer is obtained by using the techniques of Section 10.3.)
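The closed form can be compared with the recurrence numerically. The Python sketch below is an informal check, not part of the method; the names `by_recurrence` and `closed_form` are hypothetical. It uses exact rational arithmetic so that the fractions 7/4, 1/2, and 3/4 introduce no rounding.

```python
from fractions import Fraction


def by_recurrence(count):
    """First `count` terms of a_n = 3 a_{n-1} + n with a_0 = 1."""
    terms = [1]
    for n in range(1, count):
        terms.append(3 * terms[-1] + n)
    return terms


def closed_form(n):
    """a_n = (7/4) 3^n - (1/2) n - 3/4, as obtained from f(x)."""
    return Fraction(7, 4) * 3 ** n - Fraction(1, 2) * n - Fraction(3, 4)


terms = by_recurrence(12)
assert all(closed_form(n) == terms[n] for n in range(12))
print(terms[:5])
```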

In our next example we extend what we learned in Example 10.38 to a second-order
                             relation. This time we present the solution within a list of instructions one can follow in
                             order to apply the generating-function method.

EXAMPLE 10.39
                             Consider the recurrence relation

\[ a_{n+2} - 5a_{n+1} + 6a_n = 2, \quad n \ge 0, \quad a_0 = 3, \quad a_1 = 7. \]

                                 1) We first multiply this given relation by $x^{n+2}$ because $n + 2$ is the largest subscript
                                    that appears. This gives us

\[ a_{n+2} x^{n+2} - 5a_{n+1} x^{n+2} + 6a_n x^{n+2} = 2x^{n+2}. \]

                                 2) Then we sum all of the equations represented by the result in step (1) and obtain

\[ \sum_{n=0}^{\infty} a_{n+2} x^{n+2} - 5 \sum_{n=0}^{\infty} a_{n+1} x^{n+2} + 6 \sum_{n=0}^{\infty} a_n x^{n+2} = 2 \sum_{n=0}^{\infty} x^{n+2}. \]

                                 3) In order to have each of the subscripts on $a$ match the corresponding exponent on $x$,
                                    we rewrite the equation in step (2) as

\[ \sum_{n=0}^{\infty} a_{n+2} x^{n+2} - 5x \sum_{n=0}^{\infty} a_{n+1} x^{n+1} + 6x^2 \sum_{n=0}^{\infty} a_n x^n = 2x^2 \sum_{n=0}^{\infty} x^n. \]

                                    Here we also rewrite the power series on the right-hand side of the equation in a form
                                    that will permit us to use what we learned in Section 2 of Chapter 9.
                                 4) Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the generating function for the solution. The equation in
                                    step (3) now takes the form

\[ (f(x) - a_0 - a_1 x) - 5x(f(x) - a_0) + 6x^2 f(x) = \frac{2x^2}{1-x}, \]

                                    or

\[ (f(x) - 3 - 7x) - 5x(f(x) - 3) + 6x^2 f(x) = \frac{2x^2}{1-x}. \]

                                 5) Solving for $f(x)$ we have

\[ (1 - 5x + 6x^2) f(x) = 3 - 8x + \frac{2x^2}{1-x} = \frac{3 - 11x + 10x^2}{1-x}, \]
                        from which it follows that
\[ f(x) = \frac{3 - 11x + 10x^2}{(1 - 5x + 6x^2)(1-x)} = \frac{(3-5x)(1-2x)}{(1-3x)(1-2x)(1-x)} = \frac{3-5x}{(1-3x)(1-x)}. \]
                        A partial-fraction decomposition (by hand, or via a computer algebra system)
                        gives us

\[ f(x) = \frac{2}{1-3x} + \frac{1}{1-x} = 2 \sum_{n=0}^{\infty} (3x)^n + \sum_{n=0}^{\infty} x^n. \]

Consequently, $a_n = 2(3^n) + 1$, $n \ge 0$.
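One way to double-check a generating function like the one in step (5) is to expand the rational function as a power series and read off the coefficients. The sketch below does this by long division; `series_coefficients` is a hypothetical helper written for this illustration, and it assumes the denominator has constant term 1.

```python
def series_coefficients(numerator, denominator, count):
    """First `count` power-series coefficients of numerator(x)/denominator(x).

    Polynomials are lists of integer coefficients, lowest power first.
    Assumes denominator[0] == 1, so no division is required.
    """
    coeffs = []
    for n in range(count):
        c = numerator[n] if n < len(numerator) else 0
        # subtract the contribution of earlier coefficients
        for k in range(1, min(n, len(denominator) - 1) + 1):
            c -= denominator[k] * coeffs[n - k]
        coeffs.append(c)
    return coeffs


# f(x) = (3 - 11x + 10x^2) / ((1 - 5x + 6x^2)(1 - x))
# with (1 - 5x + 6x^2)(1 - x) = 1 - 6x + 11x^2 - 6x^3
coeffs = series_coefficients([3, -11, 10], [1, -6, 11, -6], 10)
assert coeffs == [2 * 3 ** n + 1 for n in range(10)]
print(coeffs[:5])
```

The same helper can be pointed at the simplified form (3 - 5x)/((1 - 3x)(1 - x)), whose denominator expands to 1 - 4x + 3x^2.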

We consider a third example, which has a familiar result.

EXAMPLE 10.40
                  Let $n \in N$. For $r \ge 0$, let $a(n, r)$ = the number of ways we can select, with repetitions
                  allowed, $r$ objects from a set of $n$ distinct objects.
                     For $n \ge 1$, let $\{b_1, b_2, \ldots, b_n\}$ be the set of these objects, and consider object $b_1$. Exactly
                  one of two things can happen.

                    a) The object $b_1$ is never selected. Hence the $r$ objects are selected from $\{b_2, \ldots, b_n\}$.
                       This we can do in $a(n-1, r)$ ways.
                    b) The object $b_1$ is selected at least once. Then we must select $r - 1$ objects from
                       $\{b_1, b_2, \ldots, b_n\}$, so we can continue to select $b_1$ in addition to the one selection
                       of it we've already made. There are $a(n, r-1)$ ways to accomplish this.

                  Then $a(n, r) = a(n-1, r) + a(n, r-1)$ because these two cases cover all possibilities
                  and are mutually disjoint.
                      Let $f_n = \sum_{r=0}^{\infty} a(n, r) x^r$ be the generating function for the sequence $a(n, 0), a(n, 1),
                  a(n, 2), \ldots$. [Here $f_n$ is an abbreviation for $f_n(x)$.] From $a(n, r) = a(n-1, r) +
                  a(n, r-1)$, where $n \ge 1$ and $r \ge 1$, it follows that

\[ a(n, r) x^r = a(n-1, r) x^r + a(n, r-1) x^r \quad\text{and} \]

\[ \sum_{r=1}^{\infty} a(n, r) x^r = \sum_{r=1}^{\infty} a(n-1, r) x^r + \sum_{r=1}^{\infty} a(n, r-1) x^r. \]
                     Realizing that $a(n, 0) = 1$ for $n \ge 0$ and $a(0, r) = 0$ for $r > 0$, we write

\[ f_n - a(n, 0) = f_{n-1} - a(n-1, 0) + x \sum_{r=1}^{\infty} a(n, r-1) x^{r-1}, \]

                  so $f_n - 1 = f_{n-1} - 1 + x f_n$. Therefore, $f_n - x f_n = f_{n-1}$, or $f_n = f_{n-1}/(1-x)$.
                     If $n = 5$, for example, then

\[ f_5 = \frac{f_4}{(1-x)} = \frac{1}{(1-x)} \cdot \frac{f_3}{(1-x)} = \frac{f_3}{(1-x)^2} = \frac{f_2}{(1-x)^3} = \frac{f_1}{(1-x)^4} = \frac{f_0}{(1-x)^5} = \frac{1}{(1-x)^5}, \]
                  since $f_0 = a(0, 0) + a(0, 1)x + a(0, 2)x^2 + \cdots = 1 + 0 + 0 + \cdots$.

                                 In general, $f_n = 1/(1-x)^n = (1-x)^{-n}$, so $a(n, r)$ is the coefficient of $x^r$ in $(1-x)^{-n}$,
                             which is $\binom{-n}{r}(-1)^r = \binom{n+r-1}{r}$.
                                 [Here we dealt with a recurrence relation for $a(n, r)$, a discrete function of the two
                             (integer) variables $n, r \ge 0$.]
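The two-variable recurrence and its boundary values translate directly into a memoized function, which can then be compared with the binomial coefficient obtained from the generating function. This is a small sketch for checking purposes; the function name `a` simply mirrors the notation a(n, r).

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def a(n, r):
    """Selections of r objects, with repetition, from n distinct objects."""
    if r == 0:
        return 1  # a(n, 0) = 1 for n >= 0
    if n == 0:
        return 0  # a(0, r) = 0 for r > 0
    # b_1 never selected, or b_1 selected at least once
    return a(n - 1, r) + a(n, r - 1)


for n in range(1, 8):
    for r in range(0, 8):
        assert a(n, r) == comb(n + r - 1, r)
print(a(5, 3))
```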

Our last example shows how generating functions may be used to solve a system of
                             recurrence relations.

EXAMPLE 10.41
                             This example provides an approximate model for the propagation of high- and low-energy
                             neutrons as they strike the nuclei of fissionable material (such as uranium) and are absorbed.
                             Here we deal with a fast reactor where there is no moderator (such as water). (In reality,
                             all the neutrons have fairly high energy and there are not just two energy levels. There is a
                             continuous spectrum of energy levels, and those neutrons at the upper end of the spectrum
                             are called the high-energy neutrons. The higher-energy neutrons tend to produce more new
                             neutrons than the lower-energy ones.)
                                 Consider the reactor at time 0 and suppose one high-energy neutron is injected into the
                             system. During each time interval thereafter (about 1 microsecond, or $10^{-6}$ second) the
                             following events occur.

                               a) When a high-energy neutron interacts with a nucleus (of fissionable material), upon
                                  absorption this results (one microsecond later) in two new high-energy neutrons and
                                  one low-energy one.
                               b) For interactions involving a low-energy neutron, only one neutron of each energy level
                                  is produced.

                             Assuming that all free neutrons interact with nuclei one microsecond after their creation,
                             find functions of $n$ such that
                                                       $a_n$ = the number of high-energy neutrons,
                                                       $b_n$ = the number of low-energy neutrons,

                             in the reactor after $n$ microseconds, $n \ge 0$.
                                 Here we have $a_0 = 1$, $b_0 = 0$ and the system of recurrence relations

\[ a_{n+1} = 2a_n + b_n \tag{3} \]

\[ b_{n+1} = a_n + b_n. \tag{4} \]

Let f(x) = oy       anx", g(x) = yy                   bax” be the generating functions for the se-
                             quences {a,|n > 0}, {b,|n > 0}, respectively. From Eqs. (3) and (4), when n > 0
                                                              Ana x"!         _—   2a,x"!           +   b,x"?!                  By

byw x"t!    = apx"t! + byxtt},                                    (4)'

Summing Eq. (3)’ over all n > 0, we have
                                                        oO                              x                   oC

s- Any xr         =2x                 > Anx"” +x          > b,x".         (3)"
                                                       n=()                            n=(0                n=0

In similar fashion, Eq. (4)’ yields
                                                         x                             oC                  aC

- byyix"t! =x >                      nx” +x >           b,x".          (4)”
                                                        n=0                            n=0                n=0
                                                                                                             10.5 A Special Kind of Nonlinear Recurrence Relation (Optional)                                                       487

Introducing the generating functions at this point, we get

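$$f(x) - a_0 = 2x f(x) + x g(x) \qquad (3)''$$
$$g(x) - b_0 = x f(x) + x g(x), \qquad (4)''$$

a system of equations relating the generating functions. Solving this system, we find that

$$f(x) = \frac{1 - x}{x^2 - 3x + 1} = \left(\frac{5 + \sqrt{5}}{10}\right)\left(\frac{1}{\gamma - x}\right) + \left(\frac{5 - \sqrt{5}}{10}\right)\left(\frac{1}{\delta - x}\right) \quad \text{and}$$
$$g(x) = \frac{x}{x^2 - 3x + 1} = \left(\frac{-5 - 3\sqrt{5}}{10}\right)\left(\frac{1}{\gamma - x}\right) + \left(\frac{-5 + 3\sqrt{5}}{10}\right)\left(\frac{1}{\delta - x}\right),$$

where

$$\gamma = \frac{3 + \sqrt{5}}{2}, \qquad \delta = \frac{3 - \sqrt{5}}{2}.$$

Consequently,

$$a_n = \left(\frac{5 + \sqrt{5}}{10}\right)\left(\frac{3 - \sqrt{5}}{2}\right)^{n+1} + \left(\frac{5 - \sqrt{5}}{10}\right)\left(\frac{3 + \sqrt{5}}{2}\right)^{n+1} \quad \text{and}$$
$$b_n = \left(\frac{-5 - 3\sqrt{5}}{10}\right)\left(\frac{3 - \sqrt{5}}{2}\right)^{n+1} + \left(\frac{-5 + 3\sqrt{5}}{10}\right)\left(\frac{3 + \sqrt{5}}{2}\right)^{n+1}, \qquad n \ge 0.$$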
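As a numerical sanity check on a solution of this kind, the following minimal sketch iterates the system and compares it with the closed forms. It assumes the underlying recurrences are a_{n+1} = 2a_n + b_n and b_{n+1} = a_n + b_n (read off from the two generating-function equations) and that a_0 = 1, b_0 = 0; these initial values are an assumption, chosen because they agree with the partial-fraction coefficients. The function names are illustrative only.

```python
from math import sqrt

def iterate_system(n_terms):
    """Iterate a_{n+1} = 2a_n + b_n, b_{n+1} = a_n + b_n.

    The starting values a_0 = 1, b_0 = 0 are an assumption (see lead-in).
    Returns the list [(a_0, b_0), (a_1, b_1), ...] with n_terms entries.
    """
    a, b = 1, 0
    pairs = []
    for _ in range(n_terms):
        pairs.append((a, b))
        a, b = 2 * a + b, a + b
    return pairs

def closed_form(n):
    """Evaluate the closed forms for a_n and b_n in floating point."""
    gamma = (3 + sqrt(5)) / 2
    delta = (3 - sqrt(5)) / 2  # note that 1/gamma = delta
    a_n = (5 + sqrt(5)) / 10 * delta ** (n + 1) + (5 - sqrt(5)) / 10 * gamma ** (n + 1)
    b_n = ((-5 - 3 * sqrt(5)) / 10 * delta ** (n + 1)
           + (-5 + 3 * sqrt(5)) / 10 * gamma ** (n + 1))
    return a_n, b_n

if __name__ == "__main__":
    for n, (a, b) in enumerate(iterate_system(8)):
        print(n, a, b, closed_form(n))
```

Because the closed forms are evaluated in floating point, agreement should be judged up to rounding error rather than by exact equality.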

EXERCISES 10.4

1. Solve the following recurrence relations by the method of generating functions.
  a) $a_{n+1} - a_n = 3^n$, $n \ge 0$, $a_0 = 1$
  b) $a_{n+1} - a_n = n^2$, $n \ge 0$, $a_0 = 1$
  c) $a_{n+2} - 3a_{n+1} + 2a_n = 0$, $n \ge 0$, $a_0 = 1$, $a_1 = 6$
  d) $a_{n+2} - 2a_{n+1} + a_n = 2^n$, $n \ge 0$, $a_0 = 1$, $a_1 = 2$

2. For $n$ distinct objects, let $a(n, r)$ denote the number of ways we can select, without repetition, $r$ of the $n$ objects when $0 \le r \le n$. Here $a(n, r) = 0$ when $r > n$. Use the recurrence relation $a(n, r) = a(n - 1, r - 1) + a(n - 1, r)$, where $n \ge 1$ and $r \ge 1$, to show that $f(x) = (1 + x)^n$ generates $a(n, r)$, $r \ge 0$.

3. Solve the following systems of recurrence relations.
  a) $a_{n+1} = -2a_n - 4b_n$
     $b_{n+1} = 4a_n + 6b_n$
     $n \ge 0$, $a_0 = 1$, $b_0 = 0$
  b) $a_{n+1} = 2a_n - b_n + 2$
     $b_{n+1} = -a_n + 2b_n - 1$
     $n \ge 0$, $a_0 = 0$, $b_0 = 1$

10.5
              A Special Kind of Nonlinear
         Recurrence Relation (Optional)
                                                  Thus far our study of recurrence relations has dealt with linear relations with constant
                                                  coefficients. The study of nonlinear recurrence relations and of relations with variable
                                                  coefficients is not a topic we shall pursue except for one special nonlinear relation that
                                                  lends itself to the method of generating functions.
   We shall develop the method in a counting problem on data structures. Before doing so, however, we first observe that if $f(x) = \sum_{i=0}^{\infty} a_i x^i$ is the generating function for $a_0, a_1, a_2, \ldots$, then $[f(x)]^2$ generates $a_0a_0$, $a_0a_1 + a_1a_0$, $a_0a_2 + a_1a_1 + a_2a_0$, $\ldots$,
488            Chapter 10 Recurrence Relations

$a_0a_n + a_1a_{n-1} + a_2a_{n-2} + \cdots + a_{n-1}a_1 + a_na_0$, $\ldots$, the convolution of the sequence $a_0, a_1, a_2, \ldots$, with itself.
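The convolution of a sequence with itself can be illustrated on a finite prefix. The sketch below (function names are illustrative, not from the text) computes the terms $a_0a_n + a_1a_{n-1} + \cdots + a_na_0$ directly and, separately, multiplies the truncated polynomial by itself, so the two results can be compared.

```python
def self_convolution(seq):
    """Return [c_0, ..., c_{N-1}] where c_n = sum of seq[i] * seq[n - i] for 0 <= i <= n."""
    return [sum(seq[i] * seq[n - i] for i in range(n + 1)) for n in range(len(seq))]

def poly_square(coeffs, n_terms):
    """Coefficients of [f(x)]^2 up to degree n_terms - 1, obtained by multiplying out.

    coeffs[i] is the coefficient of x^i in the truncated series f(x).
    """
    out = [0] * n_terms
    for i, ci in enumerate(coeffs):
        for j, cj in enumerate(coeffs):
            if i + j < n_terms:
                out[i + j] += ci * cj
    return out

if __name__ == "__main__":
    prefix = [1, 2, 3, 4]  # a_0, a_1, a_2, a_3 of some sequence
    print(self_convolution(prefix))
    print(poly_square(prefix, len(prefix)))
```

Only the first N coefficients of the square are trustworthy when the series is truncated to N terms, which is why `poly_square` discards the higher-degree products.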

EXAMPLE 10.42
In Sections 3.4 and 5.1, we encountered the idea of a tree diagram. In general, a tree is an undirected graph that is connected and has no loops or cycles. Here we examine rooted binary trees.
    In Fig. 10.17 we see two such trees, where the circled vertex denotes the root. These trees are called binary because from each vertex there are at most two edges (called branches) descending (since a rooted tree is a directed graph) from that vertex.
    In particular, these rooted binary trees are ordered in the sense that a left branch descending from a vertex is considered different from a right branch descending from that vertex. For the case of three vertices, the five possible ordered rooted binary trees are shown in Fig. 10.18. (If no attention were paid to order, then the last four rooted trees would be the same structure.)

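Figure 10.17 [two rooted binary trees; the circled vertex is the root]
Figure 10.18 [the five ordered rooted binary trees on three vertices, panels (1) through (5)]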

Our objective is to count, for $n \ge 0$, the number $b_n$ of rooted ordered binary trees on $n$ vertices. Assuming that we know the values of $b_i$ for $0 \le i \le n$, in order to obtain $b_{n+1}$ we select one vertex as the root and note, as in Fig. 10.19, that the substructures descending on the left and right of the root are smaller (rooted ordered binary) trees whose total number of vertices is $n$. These smaller trees are called subtrees of the given tree. Among these possible subtrees is the empty subtree, of which there is only 1 ($= b_0$).

Figure 10.19 [a root vertex with a left subtree and a right subtree descending from it]

Now consider how the n vertices in these two subtrees can be divided up.

   (1) 0 vertices on the left, $n$ vertices on the right. This results in $b_0 b_n$ overall substructures to be counted in $b_{n+1}$.
   (2) 1 vertex on the left, $n - 1$ vertices on the right, yielding $b_1 b_{n-1}$ rooted ordered binary trees on $n + 1$ vertices.
   ...
   ($i + 1$) $i$ vertices on the left, $n - i$ on the right, for a count of $b_i b_{n-i}$ toward $b_{n+1}$.
   ...
   ($n + 1$) $n$ vertices on the left and none on the right, contributing $b_n b_0$ of the trees.
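Hence, for all $n \ge 0$,

$$b_{n+1} = b_0 b_n + b_1 b_{n-1} + b_2 b_{n-2} + \cdots + b_{n-1} b_1 + b_n b_0,$$

and

$$\sum_{n=0}^{\infty} b_{n+1} x^{n+1} = \sum_{n=0}^{\infty} (b_0 b_n + b_1 b_{n-1} + \cdots + b_{n-1} b_1 + b_n b_0) x^{n+1}. \qquad (1)$$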

Now let $f(x) = \sum_{n=0}^{\infty} b_n x^n$ be the generating function for $b_0, b_1, b_2, \ldots$. We rewrite Eq. (1) as

$$(f(x) - b_0) = x \sum_{n=0}^{\infty} (b_0 b_n + b_1 b_{n-1} + \cdots + b_n b_0) x^n = x[f(x)]^2.$$
                                      n=0

This brings us to the quadratic [in $f(x)$]

$$x[f(x)]^2 - f(x) + 1 = 0, \qquad \text{so} \qquad f(x) = [1 \pm \sqrt{1 - 4x}]/(2x).$$

But $\sqrt{1 - 4x} = (1 - 4x)^{1/2} = \binom{1/2}{0} + \binom{1/2}{1}(-4x) + \binom{1/2}{2}(-4x)^2 + \cdots$, where the coefficient of $x^n$, $n \ge 1$, is

$$\binom{1/2}{n}(-4)^n = \frac{(1/2)((1/2) - 1)((1/2) - 2) \cdots ((1/2) - n + 1)}{n!}(-4)^n$$
$$= (-1)^{n-1}\,\frac{(1/2)(1/2)(3/2) \cdots ((2n - 3)/2)}{n!}(-4)^n$$
$$= \frac{(-1)2^n(1)(3) \cdots (2n - 3)}{n!}$$
$$= \frac{(-1)2^n(n!)(1)(3) \cdots (2n - 3)(2n - 1)}{(n!)(n!)(2n - 1)}$$
$$= \frac{(-1)(2)(4) \cdots (2n)(1)(3) \cdots (2n - 1)}{(2n - 1)(n!)(n!)} = \frac{(-1)}{(2n - 1)}\binom{2n}{n}.$$
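The chain of equalities above can be spot-checked with exact rational arithmetic. The sketch below is only a check, not part of the derivation: it evaluates the generalized binomial coefficient $\binom{1/2}{n}$ from its defining product and compares $\binom{1/2}{n}(-4)^n$ with $-\binom{2n}{n}/(2n-1)$. The helper names are illustrative.

```python
from fractions import Fraction
from math import comb

def generalized_binomial(alpha, n):
    """C(alpha, n) = alpha (alpha - 1) ... (alpha - n + 1) / n! for a rational alpha."""
    result = Fraction(1)
    for k in range(n):
        result *= (alpha - k)  # numerator factor (alpha - k)
        result /= (k + 1)      # builds up the n! in the denominator
    return result

def sqrt_coefficient(n):
    """Coefficient of x^n in the expansion of (1 - 4x)^(1/2)."""
    return generalized_binomial(Fraction(1, 2), n) * (-4) ** n

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, sqrt_coefficient(n), Fraction(-comb(2 * n, n), 2 * n - 1))
```

Using `Fraction` avoids any rounding, so the two columns printed can be compared for exact equality.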
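      In $f(x)$ we select the negative radical; otherwise, we would have negative values for the $b_n$'s. Then

$$f(x) = \frac{1}{2x}\left[1 - \left(1 - \sum_{n=1}^{\infty} \frac{1}{(2n - 1)}\binom{2n}{n} x^n\right)\right],$$

and $b_n$, the coefficient of $x^n$ in $f(x)$, is half the coefficient of $x^{n+1}$ in

$$\sum_{n=1}^{\infty} \frac{1}{(2n - 1)}\binom{2n}{n} x^n.$$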

So

$$b_n = \frac{1}{2}\left[\frac{1}{2(n + 1) - 1}\right]\binom{2(n + 1)}{n + 1} = \frac{(2n)!}{(n + 1)!\,(n!)} = \frac{1}{(n + 1)}\binom{2n}{n}.$$
The numbers $b_n$ are called the Catalan numbers, the same sequence of numbers we encountered in Section 1.5. As we mentioned earlier (following Example 1.42), these numbers are named after the Belgian mathematician Eugène Charles Catalan (1814-1894), who used them in determining the number of ways to parenthesize the expression $x_1x_2x_3 \cdots x_n$. The first nine Catalan numbers are $b_0 = 1$, $b_1 = 1$, $b_2 = 2$, $b_3 = 5$, $b_4 = 14$, $b_5 = 42$, $b_6 = 132$, $b_7 = 429$, and $b_8 = 1430$.
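The listed values can be reproduced in two independent ways: from the nonlinear recurrence $b_{n+1} = b_0b_n + b_1b_{n-1} + \cdots + b_nb_0$ with $b_0 = 1$, and from the closed form $\frac{1}{n+1}\binom{2n}{n}$. The following is a minimal sketch with illustrative function names.

```python
from math import comb

def catalan_by_recurrence(count):
    """Return [b_0, ..., b_{count-1}] using b_{n+1} = sum of b_i * b_{n-i} for 0 <= i <= n."""
    b = [1]  # b_0 = 1: the empty tree
    for n in range(count - 1):
        b.append(sum(b[i] * b[n - i] for i in range(n + 1)))
    return b

def catalan_closed_form(n):
    """b_n = C(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)

if __name__ == "__main__":
    print(catalan_by_recurrence(9))
    print([catalan_closed_form(n) for n in range(9)])
```

The recurrence costs a sum of length $n + 1$ for each new term, whereas the closed form gives any single term directly; for a short table like this one either is adequate.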

We continue now with a second application of the Catalan numbers. This is based on an
                             example given by Shimon Even. (See page 86 of reference [6].)

EXAMPLE 10.43
An important data structure that arises in computer science is the stack. This structure allows the storage of data items according to the following restrictions.
                                  1) All insertions take place at one end of the structure. This is called the top of the stack,
                                     and the insertion process is referred to as the push procedure.
                                  2) All deletions from the (nonempty) stack also take place from the top. We call the
                                     deletion process the pop procedure.
                                 Since the last item inserted in this structure is the first item that can then be popped out
                             of it, the stack is often referred to as a "last-in-first-out" (LIFO) structure.
                                 Intuitive models for this data structure include a pile of poker chips on a table, a stack
                             of trays in a cafeteria, and the discard pile used in playing certain card games. In all three
                             of these cases, we can only (1) insert a new entry at the top of the pile or stack or (2) take
                             (delete) the entry at the top of the (nonempty) pile or stack.
                                 Here we shall use this data structure, with its push and pop procedures, to help us permute the (ordered) list $1, 2, 3, \ldots, n$, for $n \in \mathbf{Z}^+$. The diagram in Fig. 10.20 shows how each integer of the input $1, 2, 3, \ldots, n$ must be pushed onto the top of the stack in the order given. However, we may pop an entry from the top of the (nonempty) stack at any time. But once an entry is popped from the stack, it may not be returned to either the top of the stack or the input left to be pushed onto the stack. The process continues until no entry is left in the stack. Thus the ordered sequence of elements popped from the stack determines a permutation of $1, 2, 3, \ldots, n$.

Figure 10.20 [the input 1, 2, 3, ..., n is pushed onto the stack; the entries popped from the stack form the output]

If n = 1, our input list consists of only the integer 1. We insert 1 at the top of the (empty)
stack and then pop it out. This results in the permutation 1.
    For n = 2, there are two permutations possible for 1, 2, and we can get both of them
using the stack.

1) To get 1, 2 we place 1 at the top of the (empty) stack and then pop it. Then 2 is placed
      at the top of the (empty) stack and it is popped.
   2) The permutation 2, 1 is obtained when 1 is placed at the top of the (empty) stack and
      2 is then pushed onto the top of this (nonempty) stack. Upon popping first 2 from the
      top of the stack, and then 1, we obtain 2, 1.

Turning to the case where n = 3, we find that we can obtain only five of the 3! = 6
possible permutations of 1, 2, 3 in this situation. For example, the permutation 2, 3, 1
results when we take the following steps.

   • Place 1 at the top of the (empty) stack.
   • Push 2 onto the top of the stack (on top of 1).
   • Pop 2 from the stack.
   • Push 3 onto the top of the stack (on top of 1).
   • Pop 3 from the stack.
   • Pop 1 from the stack, leaving it empty.

The reason we fail to obtain all six permutations of 1, 2, 3 is that we cannot generate
the permutation 3, 1, 2 using the stack. For in order to have 3 in the first position of the
permutation, we must build the stack by first pushing 1 onto the (empty) stack, then pushing
2 onto the top of the stack (on top of 1), and finally pushing 3 onto the stack (on top of 2).
After 3 is popped from the top of the stack, we get 3 as the first number in our permutation.
But with 2 now at the top of the stack, we cannot pop 1 until after 2 has been popped, so
the permutation 3, 1, 2 cannot be generated.
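The argument above can be turned into a small simulation. The sketch below (names are illustrative) tries to produce a given permutation by pushing the inputs 1, 2, ..., n in order and popping whenever the required entry is on top of the stack; if the required entry is buried under another one, the permutation cannot be generated.

```python
from itertools import permutations

def is_stack_permutation(perm):
    """Return True if perm (a permutation of 1..n) can be produced with the stack."""
    n = len(perm)
    stack = []
    next_input = 1
    for target in perm:
        # Push further inputs until the target is on top (or the input runs out).
        while (not stack or stack[-1] != target) and next_input <= n:
            stack.append(next_input)
            next_input += 1
        if not stack or stack[-1] != target:
            return False  # target is buried below another entry
        stack.pop()
    return True

def stack_permutations(n):
    """All permutations of 1..n that the stack can produce, in lexicographic order."""
    return [p for p in permutations(range(1, n + 1)) if is_stack_permutation(p)]

if __name__ == "__main__":
    print(is_stack_permutation((2, 3, 1)))  # the worked example for n = 3
    print(is_stack_permutation((3, 1, 2)))  # the permutation that fails
    print(len(stack_permutations(3)))
```

Testing every one of the n! permutations is only practical for small n; it is used here because it mirrors the case-by-case reasoning in the text.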
   When n = 4, there are 14 permutations of the (ordered) list 1, 2, 3, 4 that can be generated by the stack method. We list them in the four columns of Table 10.4 according to the location of 1 in the permutation.

Table 10.4

                    1, 2, 3, 4         2, 1, 3, 4         2, 3, 1, 4         2, 3, 4, 1
                    1, 2, 4, 3         2, 1, 4, 3         3, 2, 1, 4         2, 4, 3, 1
                    1, 3, 2, 4                                               3, 2, 4, 1
                    1, 3, 4, 2                                               3, 4, 2, 1
                    1, 4, 3, 2                                               4, 3, 2, 1

   1) There are five permutations with 1 in the first position, because after 1 is pushed onto
      and popped from the stack, there are five ways to permute 2, 3, 4 using the stack.
   2) When 1 is in the second position, 2 must be in the first position. This is because we
      pushed 1 onto the (empty) stack, then pushed 2 on top of it and then popped 2 and
      then 1. There are two permutations in column 2, because 3, 4 can be permuted in two
      ways on the stack.

   3) For column 3 we have 1 in position three. We note that the only numbers that can
      precede it are 2 and 3, which can be permuted on the stack (with 1 on the bottom) in
      two ways. Then 1 is popped, and we push 4 onto the (empty) stack and then pop it.
   4) In the last column we obtain five permutations: After we push 1 onto the top of the
      (empty) stack, there are five ways to permute 2, 3, 4 using the stack (with 1 on the
      bottom). Then 1 is popped from the stack to complete the permutation.

On the basis of these observations, for $1 \le i \le 4$, let $a_i$ count the number of ways to permute the integers $1, 2, 3, \ldots, i$ (or any list of $i$ consecutive integers) using the stack. Also, we define $a_0 = 1$ since there is only one way to permute nothing, using the stack. Then

$$a_4 = a_0a_3 + a_1a_2 + a_2a_1 + a_3a_0,$$

where

   a) Each summand $a_ja_k$ satisfies $j + k = 3$.
   b) The subscript $j$ tells us that there are $j$ integers to the left of 1 in the permutation; in particular, for $j \ge 1$, these are the integers from 2 to $j + 1$, inclusive.
   c) The subscript $k$ indicates that there are $k$ integers to the right of 1 in the permutation; for $k \ge 1$, these are the integers from $4 - (k - 1)$ to 4.

This permutation problem can now be generalized to any $n \in \mathbf{N}$, so that

$$a_{n+1} = a_0a_n + a_1a_{n-1} + a_2a_{n-2} + \cdots + a_{n-1}a_1 + a_na_0,$$

with $a_0 = 1$. From the result in Example 10.42 we know that

$$a_n = \frac{1}{(n + 1)}\binom{2n}{n}.$$
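The decomposition by the position of 1 can be checked by brute force. The sketch below (illustrative names; it enumerates every legal push/pop schedule rather than using the recurrence) lists the permutations obtainable for a given n and tallies them by the position that 1 occupies, so that the tallies can be compared with the products $a_ja_k$.

```python
from collections import Counter

def stack_permutations(n):
    """Every output order obtainable by pushing 1..n in order and popping at will."""
    results = []

    def explore(next_input, stack, output):
        if len(output) == n:
            results.append(tuple(output))
            return
        if next_input <= n:  # option 1: push the next input entry
            explore(next_input + 1, stack + [next_input], output)
        if stack:            # option 2: pop the entry on top of the stack
            explore(next_input, stack[:-1], output + [stack[-1]])

    explore(1, [], [])
    return results

def counts_by_position_of_one(n):
    """counts[p - 1] = number of obtainable permutations with 1 in position p."""
    counts = Counter(perm.index(1) + 1 for perm in stack_permutations(n))
    return [counts[pos] for pos in range(1, n + 1)]

if __name__ == "__main__":
    a = [1, 1, 2, 5]  # a_0, a_1, a_2, a_3 from the text
    print(counts_by_position_of_one(4))
    print([a[j] * a[3 - j] for j in range(4)])  # the summands a_j * a_k with j + k = 3
```

Each push/pop schedule yields a different permutation, so the list returned by `stack_permutations` contains no repeats.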

Now let us make one final observation about the permutations in Table 10.4. Consider, for example, the permutation 3, 2, 4, 1. How did this permutation come about? First 1 is pushed onto the empty stack. This is then followed by pushing 2 on top of 1 and then pushing 3 on top of 2. Now 3 is popped from the top of the stack, leaving 2 and 1; then 2 is popped from the top of the stack, leaving just 1. At this point 4 is pushed on top of 1 and then popped, leaving 1 on the stack. Finally, 1 is popped from the (top of the) stack, leaving the stack empty. So the permutation 3, 2, 4, 1 comes about from the following sequence of four pushes and four pops:

push, push, push, pop, pop, push, pop, pop.

Now replace each “push” with a “1” and each “pop” with a “0”. The result is the sequence

1, 1, 1, 0, 0, 1, 0, 0.

Similarly, the permutation 1, 3, 4, 2 is determined by the sequence

push, pop, push, push, pop, push, pop, pop

and this corresponds with the sequence

1, 0, 1, 1, 0, 1, 0, 0.

In fact, each permutation in Table 10.4 gives rise to a sequence of four 1's and four 0's. But there are 8!/(4! 4!) = 70 ways to list four 1's and four 0's. Do these 14 sequences have some special property? Yes! As we go from left to right in each of these sequences, the

number of 1's (pushes) is never exceeded by the number of 0's (pops) [just like in part (b) of Example 1.43, another situation counted by the Catalan numbers].
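This correspondence can be explored directly. The sketch below (illustrative names) generates all 70 lists of four 1's and four 0's, keeps those in which the 0's never outnumber the 1's in any initial segment, and replays a list as pushes and pops to recover the permutation it determines.

```python
from itertools import combinations

def never_more_pops_than_pushes(bits):
    """True if, reading left to right, the count of 0's never exceeds the count of 1's."""
    balance = 0
    for bit in bits:
        balance += 1 if bit == 1 else -1
        if balance < 0:
            return False
    return True

def sequences_with_equal_ones_and_zeros(n):
    """Yield every list consisting of n 1's and n 0's."""
    for ones_at in combinations(range(2 * n), n):
        seq = [0] * (2 * n)
        for idx in ones_at:
            seq[idx] = 1
        yield seq

def permutation_from_bits(bits):
    """Replay 1 as 'push the next input' and 0 as 'pop'; return the popped order.

    Assumes bits passes never_more_pops_than_pushes, so the stack is never popped while empty.
    """
    stack, output, next_input = [], [], 1
    for bit in bits:
        if bit == 1:
            stack.append(next_input)
            next_input += 1
        else:
            output.append(stack.pop())
    return output

if __name__ == "__main__":
    all_seqs = list(sequences_with_equal_ones_and_zeros(4))
    valid = [s for s in all_seqs if never_more_pops_than_pushes(s)]
    print(len(all_seqs), len(valid))
    print(permutation_from_bits([1, 1, 1, 0, 0, 1, 0, 0]))
```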

Our last example for this section is comparable to Example 10.17. Once again we see
                                that we must guard against trying to obtain a general result without a general argument — no
                                matter what a few special cases might suggest.

EXAMPLE 10.44
Here we start with $n$ distinct objects and, for $n \ge 1$, we distribute them among at most $n$ identical containers, but we do not allow more than three objects in any container, and we are not concerned about how the objects are arranged within any one container. We let $a_n$ count the number of these distributions, and from Fig. 10.21 we see that

$$a_0 = 1, \quad a_1 = 1, \quad a_2 = 2, \quad a_3 = 5, \quad \text{and} \quad a_4 = 14.$$

It appears that we might have the first five terms in the sequence of Catalan numbers. Unfortunately, the pattern breaks down and we find, for example, that

$$a_5 = 46 \ne 42 \text{ (the sixth Catalan number)} \quad \text{and}$$
$$a_6 = 166 \ne 132 \text{ (the seventh Catalan number)}.$$

(The distributions in this example were studied by F. L. Miksa, L. Moser, and M. Wyman
                                in reference [22].)
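The counts in this example can be reproduced by direct enumeration. The sketch below is one possible way to do so, not the method of the cited reference: it places the objects one at a time, each either into a container already in use that holds fewer than three objects or into a new container, which produces every distribution into identical containers exactly once. The function names and the `capacity` parameter are illustrative.

```python
from math import comb

def count_distributions(n, capacity=3):
    """Count the ways to split n distinct objects into identical nonempty containers,
    with at most `capacity` objects in any container."""
    def place(remaining, sizes):
        if remaining == 0:
            return 1
        # The next object starts a new container ...
        total = place(remaining - 1, sizes + [1])
        # ... or joins any container in use that still has room.
        for k, size in enumerate(sizes):
            if size < capacity:
                total += place(remaining - 1, sizes[:k] + [size + 1] + sizes[k + 1:])
        return total

    return place(n, [])

def catalan(n):
    """The nth Catalan number, C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

if __name__ == "__main__":
    for n in range(7):
        print(n, count_distributions(n), catalan(n))
```

The two columns agree for n from 0 to 4 and then part ways, which is the point of the example: agreement on a few initial cases does not establish a general result.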

Figure 10.21 [the distributions counted by $a_n$ for n = 0, 1, 2, 3, and 4, drawn with the objects A, B, C, D]

Other examples that involve the Catalan numbers can be found in the chapter references.

EXERCISES 10.5

1. For the rooted ordered binary trees of Example 10.42, calculate $b_4$ and draw all of these four-vertex structures.

2. Verify that for all $n \ge 0$,
$$\frac{1}{2}\left(\frac{1}{2n + 1}\right)\binom{2n + 2}{n + 1} = \left(\frac{1}{n + 1}\right)\binom{2n}{n}.$$

3. Show that for all $n \ge 2$,
$$\binom{2n - 1}{n} - \binom{2n - 1}{n - 2} = \frac{1}{(n + 1)}\binom{2n}{n}.$$

4. Which of the following permutations of 1, 2, 3, 4, 5, 6, 7, 8 can be obtained using the stack (of Example 10.43)?
  a) 4, 2, 3, 1, 5, 6, 7, 8          b) 5, 4, 3, 6, 2, 1, 8, 7
  c) 4, 5, 3, 2, 1, 8, 6, 7          d) 3, 4, 2, 1, 7, 6, 8, 5

5. Suppose that the integers 1, 2, 3, 4, 5, 6, 7, 8 are permuted using the stack (of Example 10.43). (a) How many permutations are possible? (b) How many permutations have 1 in position 4 and 5 in position 8? (c) How many permutations have 1 in position 6? (d) How many permutations start with 321?

6. This exercise deals with a problem that was first proposed by Leonhard Euler. The problem examines a given convex polygon of $n$ ($\ge 3$) sides, that is, a polygon of $n$ sides that satisfies the property: For all points $P_1$, $P_2$ within the interior of the polygon, the line segment joining $P_1$ and $P_2$ also lies within the interior of the polygon. Given a convex polygon of $n$ sides, Euler wanted to count the number of ways the interior of the polygon could be triangulated (subdivided into triangles) by drawing diagonals that do not intersect.
     For a convex polygon of $n \ge 3$ sides, let $t_n$ count the number of ways the interior of the polygon can be triangulated by drawing nonintersecting diagonals.
  a) Define $t_2 = 1$ and verify that
$$t_{n+1} = t_2t_n + t_3t_{n-1} + \cdots + t_{n-1}t_3 + t_nt_2.$$
  b) Express $t_n$ as a function of $n$.

7. In Fig. 10.22 we have two of the five ways in which we can triangulate the interior of a convex pentagon with no intersecting diagonals. Here we have labeled four of the sides, with the letters a, b, c, d, as well as the five vertices. In part (i) we use the labels on sides a and b to give us the label ab on the diagonal connecting vertices 2 and 4. This is because this diagonal (labeled ab), together with the sides a and b, provides us with one of the interior triangles for this triangulation of the convex pentagon. Then the diagonal ab and the side c give rise to the label (ab)c on the diagonal determined by vertices 2 and 5, and the sides labeled ab, c, and (ab)c provide a second interior triangle for this triangulation. Continuing in this way, we label the base edge connecting vertices 1 and 2 with ((ab)c)d, one of the five ways we can introduce parentheses in order to obtain the three products (of two numbers at a time) needed to compute abcd. The triangulation in part (ii) of the figure corresponds with the parenthesized product (ab)(cd).
  a) Determine the parenthesized product involving a, b, c, d for the other three triangulations of the convex pentagon.
  b) Find the parenthesized product for each of the triangulated convex hexagons in parts (iii) and (iv) of Fig. 10.22.
[From part (a) we learn that there are five ways to parenthesize the expression abcd (and five ways to triangulate a convex pentagon). Part (b) shows us two of the 14 ways one can introduce parentheses for the expression abcde (and triangulate a convex hexagon). In general, there are $\frac{1}{n+1}\binom{2n}{n}$ ways to parenthesize the expression $x_1x_2x_3 \cdots x_{n-1}x_nx_{n+1}$. It was in solving this problem that Eugène Charles Catalan discovered the sequence that now bears his name.]

Figure 10.22 [triangulated convex polygons: pentagons (i), with base edge labeled ((ab)c)d, and (ii), with base edge labeled (ab)(cd); hexagons (iii) and (iv)]

8. For $n \ge 0$,
$$b_n = \left(\frac{1}{n + 1}\right)\binom{2n}{n}$$
is the $n$th Catalan number.
  a) Show that for all $n \ge 0$,
$$b_{n+1} = \frac{2(2n + 1)}{(n + 2)}\,b_n.$$
  b) Use the result of part (a) to write a computer program (or develop an algorithm) that calculates the first 15 Catalan numbers.

9. For $n \ge 0$, evenly distribute $2n$ points on the circumference of a circle, and label these points cyclically with the integers $1, 2, 3, \ldots, 2n$. Let $a_n$ be the number of ways in which these $2n$ points can be paired off as $n$ chords where no two chords intersect. (The case for $n = 3$ is shown in Fig. 10.23.) Find and solve a recurrence relation for $a_n$, $n \ge 0$.

Figure 10.23 [the ways to pair off six points, labeled 1 through 6 on a circle, as three nonintersecting chords]

10. For $n \in \mathbf{N}$, consider all paths from $(0, 0)$ to $(2n, 0)$ using the moves N: $(x, y) \to (x + 1, y + 1)$ and S: $(x, y) \to (x + 1, y - 1)$, where any such path can never fall below the $x$-axis. The five paths (generally called mountain ranges) for $n = 3$ are shown in Fig. 10.24. How many mountain ranges are there for each $n \in \mathbf{N}$? (Verify your claim!)

Figure 10.24 [the five mountain ranges for n = 3, each drawn for x from 1 to 6 and y from 1 to 3]

11. For $n \in \mathbf{Z}^+$, let $f: \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$, where $f$ is monotone increasing [that is, $1 \le i < j \le n \Rightarrow f(i) \le f(j)$] and $f(i) \ge i$ for all $1 \le i \le n$. (a) Determine the five monotone increasing functions $f: \{1, 2, 3\} \to \{1, 2, 3\}$, where $f(i) \ge i$ for all $1 \le i \le 3$. (b) Use the graphs of the functions from part (a) to set up a one-to-one correspondence with the paths from $(0, 0)$ to $(3, 3)$ using the moves R: $(x, y) \to (x + 1, y)$, U: $(x, y) \to (x, y + 1)$, where each such path never falls below the line $y = x$. (The reader may wish to check Exercise 3 for Section 1.5.) (c) If the paths in part (b) are rotated clockwise through 45°, what results do we find? (d) How many monotone increasing functions $f$ have domain and codomain equal to $\{1, 2, 3, \ldots, n\}$, for $n \in \mathbf{Z}^+$, and satisfy $f(i) \ge i$ for all $1 \le i \le n$?

$\le n$.] (c) How many functions $g$ have domain and codomain equal to $\{1, 2, 3, \ldots, n\}$, for $n \in \mathbf{Z}^+$, and satisfy $g(i) \le i$ for all $1 \le i \le n$?

13. For $n \in \mathbf{N}$, consider the arrangements of pennies built on a contiguous row of $n$ pennies. Each penny that is not in the bottom row (of $n$ pennies) rests upon the two pennies below it, and there is no concern about whether heads or tails appears. The situation for $n = 3$ is shown in Fig. 10.25. How many such arrangements are there for a contiguous row of $n$ pennies, $n \in \mathbf{N}$?

14. For $n \in \mathbf{N}$, let $s_n$ count the number of ways one can travel from $(0, 0)$ to $(n, n)$ using the moves R: $(x, y) \to (x + 1, y)$, U: $(x, y) \to (x, y + 1)$, D: $(x, y) \to (x + 1, y + 1)$, where the path can never rise above the line $y = x$. (a) Determine $s_2$. (b) How is $s_2$ related to the Catalan numbers $b_0$, $b_1$, $b_2$? (c) How is $s_3$ related to $b_0$, $b_1$, $b_2$, $b_3$? What is $s_3$? (d) For $n \in \mathbf{N}$, how is $s_n$ related to $b_0, b_1, b_2, \ldots, b_n$? (The numbers $s_0, s_1, s_2, \ldots$ are known as the Schröder numbers.)

15. A one-to-one function $f: \{1, 2, 3, \ldots, n\} \to \{1, 2, 3, \ldots, n\}$ is often called a permutation. Such a permutation is termed a rise/fall permutation when $f(1) < f(2)$, $f(2) > f(3)$, $f(3) < f(4)$, $\ldots$. For example, if $n = 4$ the five permutations 1324 (where $f(1) = 1$, $f(2) = 3$, $f(3) = 2$, $f(4) = 4$), 1423, 2314, 2413, and 3412 are the rise/fall permutations (for 1, 2, 3, 4). This we denote by writing $E_4 = 5$, where, in general, $E_n$ counts the number of rise/fall permutations for $1, 2, 3, \ldots, n$. The numbers $E_0, E_1, E_2, E_3, \ldots$ are called the Euler numbers
                                                                                     (not to be confused with the Eulerian numbers in Example 4.21).
12. For ne Z*, let g:{1,2,...,n}— {1,2,....n}, where                                 We define Ey = 1 and find that £) = 1, Ey = 1.
efi) <i for all 1<i<n. (a) Determine the five func-
                                                                                         a) Find the rise/fall permutations for 1, 2, 3. What is £3?
tions g: {1, 2, 3} > {1, 2, 3} where g(f) <i for all 1 <7 <3.
(b) Set up a one-to-one correspondence between the functions                             b) Find the rise/fall permutations for 1, 2, 3, 4, 5. What
in part (a) here and those in part (a) of the previous exer-                             is Es?

cise. [You want a one-to-one correspondence that will gener-                             c) Explain       why    in each   rise/fall   permutation   of   1, 2,
alize when you examine the functions f, g: {1,2,...,n}—>                                 3,...n,       we find n at position 27 for some      | <i   < |[n/2],
{1,2,...,n},n © Z*, where f(i) >i and g(i) <i forall 1 <i                                ifn >    1.

Figure 10.25

    d) For $n \ge 2$, show that
       \[ E_n = \sum_{i=1}^{\lfloor n/2 \rfloor} \binom{n-1}{2i-1} E_{2i-1} E_{n-2i}, \qquad E_0 = E_1 = 1. \]

    e) Where do we find 1 in a rise/fall permutation of $1, 2, 3, \ldots, n$?

    f) For $n \ge 1$, show that
       \[ E_n = \sum_{i=0}^{\lfloor (n-1)/2 \rfloor} \binom{n-1}{2i} E_{2i} E_{n-2i-1}, \qquad E_0 = 1. \]

    g) Prove that for $n \ge 2$,
       \[ E_n = \frac{1}{2} \sum_{i=0}^{n-1} \binom{n-1}{i} E_i E_{n-i-1}, \qquad E_0 = E_1 = 1. \]

    h) Use the result in part (g) to find $E_6$ and $E_7$.

    i) Find the Maclaurin series expansion for $f(x) = \sec x + \tan x$. Conjecture (no proof required) the sequence for which this is the exponential generating function.

10.6 Divide-and-Conquer Algorithms (Optional)*
One of the most important and widely applicable types of efficient algorithms is based on a divide-and-conquer approach. Here the strategy, in general, is to solve a given problem of size $n$ ($n \in \mathbf{Z}^+$) by

   1) Solving the problem for a small value of $n$ directly (this provides an initial condition for the resulting recurrence relation).
   2) Breaking the general problem of size $n$ into $a$ smaller problems of the same type and (approximately) the same size, either $\lceil n/b \rceil$ or $\lfloor n/b \rfloor$,† where $a, b \in \mathbf{Z}^+$ with $1 \le a < n$ and $1 < b < n$.

Then we solve the a smaller problems and use their solutions to construct a solution for the
                                     original problem of size n. We shall be especially interested in cases where n is a power of
                                     b, and b = 2.
                                        We shall study those divide-and-conquer algorithms where

1) The time to solve the initial problem of size n = 1 is a constant c > 0, and
                                          2) The time to break the given problem of size n into a smaller (similar) problems,
                                             together with the time to combine the solutions of these smaller problems to get a
                                             solution for the given problem, is h(n), a function of n.

Our concern here will actually be with the time-complexity function $f(n)$ for these algorithms. Consequently, we shall use the notation $f(n)$ here, instead of the subscripted notation $a_n$ that we used in the earlier sections of this chapter.
   The conditions that have now been stated lead to the following recurrence relation.
\[ f(1) = c, \]
\[ f(n) = a f(n/b) + h(n), \qquad \text{for } n = b^k,\ k \ge 1. \]
We note that the domain of $f$ is $\{1, b, b^2, b^3, \ldots\} = \{b^i \mid i \in \mathbf{N}\} \subset \mathbf{Z}^+$.

*The material in this section may be skipped with no loss of continuity. It will be used in Section 12.3 to determine the time-complexity function for the merge sort algorithm. However, the result there will also be obtained for a special case of the merge sort by another method that does not use the material developed in this section.
†For each $x \in \mathbf{R}$, recall that $\lceil x \rceil$ denotes the ceiling of $x$ and $\lfloor x \rfloor$ the floor of $x$, or greatest integer in $x$, where
   a) $\lfloor x \rfloor = \lceil x \rceil = x$, for $x \in \mathbf{Z}$.
   b) $\lfloor x \rfloor$ = the integer directly to the left of $x$, for $x \in \mathbf{R} - \mathbf{Z}$.
   c) $\lceil x \rceil$ = the integer directly to the right of $x$, for $x \in \mathbf{R} - \mathbf{Z}$.

In our first result, the solution of this recurrence relation is derived for the case where
               h(n) is the constant c.

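THEOREM 10.1   Let $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, and let $f\colon \mathbf{Z}^+ \to \mathbf{R}$. If
\[ f(1) = c, \quad \text{and} \]
\[ f(n) = a f(n/b) + c, \qquad \text{for } n = b^k,\ k \ge 1, \]
then for all $n = 1, b, b^2, b^3, \ldots$,
   1) $f(n) = c(\log_b n + 1)$, when $a = 1$, and
   2) $f(n) = \dfrac{c(a n^{\log_b a} - 1)}{a - 1}$, when $a \ge 2$.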

Proof: For $k \ge 1$ and $n = b^k$, we write the following system of $k$ equations. [Starting with the second equation, we obtain each of these equations from its immediate predecessor by (i) replacing each occurrence of $n$ in the prior equation by $n/b$ and (ii) multiplying the resulting equation in (i) by $a$.]
\[
\begin{aligned}
f(n) &= a f(n/b) + c \\
a f(n/b) &= a^2 f(n/b^2) + ac \\
a^2 f(n/b^2) &= a^3 f(n/b^3) + a^2 c \\
&\;\;\vdots \\
a^{k-2} f(n/b^{k-2}) &= a^{k-1} f(n/b^{k-1}) + a^{k-2} c \\
a^{k-1} f(n/b^{k-1}) &= a^k f(n/b^k) + a^{k-1} c
\end{aligned}
\]
We see that each of the terms $a f(n/b), a^2 f(n/b^2), \ldots, a^{k-1} f(n/b^{k-1})$ occurs one time as a summand on both the left-hand and right-hand sides of these equations. Therefore, upon adding both sides of the $k$ equations and canceling these common summands, we obtain
\[ f(n) = a^k f(n/b^k) + [c + ac + a^2 c + \cdots + a^{k-1} c]. \]
Since $n = b^k$ and $f(1) = c$, we have
\[
\begin{aligned}
f(n) &= a^k f(1) + c[1 + a + a^2 + \cdots + a^{k-1}] \\
     &= c[1 + a + a^2 + \cdots + a^{k-1} + a^k].
\end{aligned}
\]
   1) If $a = 1$, then $f(n) = c(k + 1)$. But $n = b^k \iff \log_b n = k$, so $f(n) = c(\log_b n + 1)$, for $n \in \{b^i \mid i \in \mathbf{N}\}$.
   2) When $a \ge 2$, then
      \[ f(n) = \frac{c(1 - a^{k+1})}{1 - a} = \frac{c(a^{k+1} - 1)}{a - 1}, \]
      from identity 4 of Table 9.2. Now $n = b^k \iff \log_b n = k$, so
      \[ a^k = a^{\log_b n} = (b^{\log_b a})^{\log_b n} = (b^{\log_b n})^{\log_b a} = n^{\log_b a}, \]
      and
      \[ f(n) = \frac{c(a n^{\log_b a} - 1)}{a - 1}, \qquad \text{for } n \in \{b^i \mid i \in \mathbf{N}\}. \]
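The closed forms in Theorem 10.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function names `f_recursive` and `f_closed` are hypothetical, and the closed form is written with $k = \log_b n$ (so that $a^k = n^{\log_b a}$) in order to stay in exact integer arithmetic.

```python
def f_recursive(n, a, b, c):
    """Evaluate f(1) = c, f(n) = a*f(n/b) + c directly; n must be a power of b."""
    if n == 1:
        return c
    if n % b != 0:
        raise ValueError("n must be a power of b")
    return a * f_recursive(n // b, a, b, c) + c


def f_closed(n, a, b, c):
    """Closed form of Theorem 10.1, expressed through k = log_b n."""
    # Recover k by repeated division, avoiding floating-point logarithms.
    k, m = 0, n
    while m > 1:
        m //= b
        k += 1
    if a == 1:
        return c * (k + 1)              # c(log_b n + 1)
    # a^(k+1) = a * n^(log_b a); the quotient below is always an integer.
    return c * (a ** (k + 1) - 1) // (a - 1)
```

For instance, with $a = 4$, $b = 3$, $c = 7$ and $n = 9$, both functions give 147, which agrees with unrolling the recurrence by hand: $f(3) = 4 \cdot 7 + 7 = 35$ and $f(9) = 4 \cdot 35 + 7 = 147$.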
EXAMPLE 10.45   a) Let $f\colon \mathbf{Z}^+ \to \mathbf{R}$, where
\[ f(1) = 3, \quad \text{and} \]
\[ f(n) = f(n/2) + 3, \qquad \text{for } n = 2^k,\ k \in \mathbf{Z}^+. \]
So by part (1) of Theorem 10.1, with $c = 3$, $b = 2$, and $a = 1$, it follows that $f(n) = 3(\log_2 n + 1)$ for $n \in \{1, 2, 4, 8, 16, \ldots\}$.
   b) Suppose that $g\colon \mathbf{Z}^+ \to \mathbf{R}$ with
\[ g(1) = 7, \quad \text{and} \]
\[ g(n) = 4g(n/3) + 7, \qquad \text{for } n = 3^k,\ k \in \mathbf{Z}^+. \]
      Then with $c = 7$, $b = 3$, and $a = 4$, part (2) of Theorem 10.1 implies that $g(n) = (7/3)(4n^{\log_3 4} - 1)$, when $n \in \{1, 3, 9, 27, 81, \ldots\}$.
   c) Finally, consider $h\colon \mathbf{Z}^+ \to \mathbf{R}$, where
\[ h(1) = 5, \quad \text{and} \]
\[ h(n) = 7h(n/7) + 5, \qquad \text{for } n = 7^k,\ k \in \mathbf{Z}^+. \]
      Once again we use part (2) of Theorem 10.1, this time with $a = b = 7$ and $c = 5$. Here we learn that $h(n) = (5/6)(7n^{\log_7 7} - 1) = (5/6)(7n - 1)$ for $n \in \{1, 7, 49, 343, \ldots\}$.

Considering Theorem 10.1, we must unfortunately realize that although we know about $f$ for $n \in \{1, b, b^2, \ldots\}$, we cannot say anything about the value of $f$ for the integers in $\mathbf{Z}^+ - \{1, b, b^2, \ldots\}$. So at this time we are unable to deal with the concept of $f$ as a time-complexity function. To overcome this, we now generalize Definition 5.23, wherein the idea of function dominance was first introduced.

Definition 10.1   Let $f, g\colon \mathbf{Z}^+ \to \mathbf{R}$ with $S$ an infinite subset of $\mathbf{Z}^+$. We say that $g$ dominates $f$ on $S$ (or $f$ is dominated by $g$ on $S$) if there exist constants $m \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ such that $|f(n)| \le m|g(n)|$ for all $n \in S$, where $n \ge k$.
   Under these conditions we also say that $f \in O(g)$ on $S$.

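EXAMPLE 10.46   Let $f\colon \mathbf{Z}^+ \to \mathbf{R}$ be defined so that
\[ f(n) = n, \qquad \text{for } n \in \{1, 3, 5, 7, \ldots\} = S_1, \]
\[ f(n) = n^2, \qquad \text{for } n \in \{2, 4, 6, 8, \ldots\} = S_2. \]
Then $f \in O(n)$ on $S_1$ and $f \in O(n^2)$ on $S_2$. However, we cannot conclude that $f \in O(n)$.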
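Definition 10.1 can be explored on a finite range of integers. The sketch below is only an illustration under stated assumptions: the helper `dominated_on` and the cutoff of 2000 are hypothetical choices, and a finite check of this kind suggests, but does not prove, a dominance claim.

```python
def f(n):
    """f(n) = n on the odd positive integers S_1 and n^2 on the even ones S_2."""
    return n if n % 2 == 1 else n * n


def dominated_on(subset, g, m, k):
    """Test |f(n)| <= m*|g(n)| for every n in subset with n >= k."""
    return all(abs(f(n)) <= m * abs(g(n)) for n in subset if n >= k)


S1 = range(1, 2001, 2)   # finite stand-in for the odd positive integers
S2 = range(2, 2001, 2)   # finite stand-in for the even positive integers
```

With $m = 1$ and $k = 1$ the inequality holds for $g(n) = n$ on `S1` and for $g(n) = n^2$ on `S2`. On `S2` with $g(n) = n$, any fixed $m$ eventually fails, since $f(n)/n = n$ grows without bound there.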

EXAMPLE 10.47   From Example 10.45, it now follows from Definition 10.1 that
   a) $f \in O(\log_2 n)$ on $\{2^k \mid k \in \mathbf{N}\}$
   b) $g \in O(n^{\log_3 4})$ on $\{3^k \mid k \in \mathbf{N}\}$
   c) $h \in O(n)$ on $\{7^k \mid k \in \mathbf{N}\}$.

Using Definition 10.1, we now consider the following corollaries for Theorem           10.1. The
                 first is a generalization of the first two results given in Example 10.47.

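COROLLARY 10.1   Let $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, and let $f\colon \mathbf{Z}^+ \to \mathbf{R}$. If
\[ f(1) = c, \quad \text{and} \]
\[ f(n) = a f(n/b) + c, \qquad \text{for } n = b^k,\ k \ge 1, \]
then
   1) $f \in O(\log_b n)$ on $\{b^k \mid k \in \mathbf{N}\}$, when $a = 1$, and
   2) $f \in O(n^{\log_b a})$ on $\{b^k \mid k \in \mathbf{N}\}$, when $a \ge 2$.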

Proof: This proof is left as an exercise for the reader.

This second corollary changes the equal signs of Theorem 10.1 to inequalities. As a result, the codomain of $f$ must be restricted from $\mathbf{R}$ to $\mathbf{R}^+ \cup \{0\}$.

COROLLARY 10.2   For $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, let $f\colon \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$. If
\[ f(1) \le c, \quad \text{and} \]
\[ f(n) \le a f(n/b) + c, \qquad \text{for } n = b^k,\ k \ge 1, \]
then for all $n = 1, b, b^2, b^3, \ldots$,
   1) $f \in O(\log_b n)$, when $a = 1$, and
   2) $f \in O(n^{\log_b a})$, when $a \ge 2$.

Proof: Consider the function $g\colon \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$, where
\[ g(1) = c, \quad \text{and} \]
\[ g(n) = a g(n/b) + c, \qquad \text{for } n \in \{1, b, b^2, \ldots\}. \]
By Corollary 10.1,
\[ g \in O(\log_b n) \text{ on } \{b^k \mid k \in \mathbf{N}\}, \text{ when } a = 1, \quad \text{and} \]
\[ g \in O(n^{\log_b a}) \text{ on } \{b^k \mid k \in \mathbf{N}\}, \text{ when } a \ge 2. \]
   We claim that $f(n) \le g(n)$ for all $n \in \{1, b, b^2, \ldots\}$. To prove our claim, we induct on $k$ where $n = b^k$. If $k = 0$, then $n = b^0 = 1$ and $f(1) \le c = g(1)$, so the result is true for this first case. Assuming the result is true for some $t \in \mathbf{N}$, we have $f(n) = f(b^t) \le g(b^t) = g(n)$, for $n = b^t$. Then for $k = t + 1$ and $n = b^k = b^{t+1}$, we find that
\[ f(n) = f(b^{t+1}) \le a f(b^{t+1}/b) + c = a f(b^t) + c \le a g(b^t) + c = g(b^{t+1}) = g(n). \]
Therefore, it follows by the Principle of Mathematical Induction that $f(n) \le g(n)$ for all $n \in \{1, b, b^2, \ldots\}$. Consequently, $f \in O(g)$ on $\{b^k \mid k \in \mathbf{N}\}$, and the corollary follows because of our earlier statement about $g$.

Up to this point, our study of divide-and-conquer algorithms has been predominantly
                 theoretical. It is high time we gave an example in which these ideas can be applied. The
                 following result will confirm one of our earlier examples.

EXAMPLE 10.48   For $n = 1, 2, 4, 8, 16, \ldots$, let $f(n)$ count the number of comparisons needed to find the maximum and minimum elements in a set $S \subset \mathbf{R}$, where $|S| = n$ and the procedure in Example 10.30 is used.
   If $n = 1$, then the maximum and minimum elements are the same element. Therefore, no comparisons are necessary and $f(1) = 0$.
   If $n > 1$, then $n = 2^k$ for some $k \in \mathbf{Z}^+$, and we partition $S$ as $S_1 \cup S_2$ where $|S_1| = |S_2| = n/2 = 2^{k-1}$. It takes $f(n/2)$ comparisons to find the maximum $M_i$ and the minimum $m_i$ for each set $S_i$, $i = 1, 2$. For $n \ge 4$, knowing $m_1$, $M_1$, $m_2$, and $M_2$, we then compare $m_1$ with $m_2$ and $M_1$ with $M_2$ to determine the minimum and maximum elements in $S$. Therefore,
\[ f(n) = 2f(n/2) + 1, \qquad \text{when } n = 2, \quad \text{and} \]
\[ f(n) = 2f(n/2) + 2, \qquad \text{when } n = 4, 8, 16, \ldots. \]
   Unfortunately, these results do not provide the hypotheses of Theorem 10.1. However, if we change our equations into the inequalities
\[ f(1) \le 2 \]
\[ f(n) \le 2f(n/2) + 2, \qquad \text{for } n = 2^k,\ k \ge 1, \]
then by Corollary 10.2 the time-complexity function $f(n)$, measured by the number of comparisons made in this recursive procedure, satisfies $f \in O(n^{\log_2 2}) = O(n)$, for all $n = 1, 2, 4, 8, \ldots$.
   We can examine the relationship between this example and Example 10.30 even further. From that earlier result, we know that if $|S| = n = 2^k$, $k \ge 1$, then the number of comparisons $f(n)$ we need (in the given procedure) to find the maximum and minimum elements in $S$ is $(3/2)(2^k) - 2$. (Note: Our statement here replaces the variable $n$ of Example 10.30 by the variable $k$.)
   Since $n = 2^k$, we find that we can now write
\[ f(1) = 0 \]
\[ f(n) = f(2^k) = (3/2)(2^k) - 2 = (3/2)n - 2, \qquad \text{for } n = 2, 4, 8, 16, \ldots. \]
   Hence $f \in O(n)$ for $n \in \{2^k \mid k \in \mathbf{N}\}$, just as we obtained above using Corollary 10.2.
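The comparison count in Example 10.48 can be reproduced with a short recursive routine. This is a sketch written from the description above, not the text of Example 10.30 itself; the name `min_max` is hypothetical, the input length is assumed to be a power of 2, and each call to the built-in `min` or `max` on two values is counted as one comparison.

```python
def min_max(values):
    """Return (minimum, maximum, comparisons) for a list whose length is a power of 2."""
    n = len(values)
    if n == 1:
        # A single element is both minimum and maximum: no comparison needed.
        return values[0], values[0], 0
    if n == 2:
        # One comparison settles both the minimum and the maximum.
        x, y = values
        return (x, y, 1) if x < y else (y, x, 1)
    half = n // 2
    lo1, hi1, count1 = min_max(values[:half])
    lo2, hi2, count2 = min_max(values[half:])
    # Two further comparisons: minimum against minimum, maximum against maximum.
    return min(lo1, lo2), max(hi1, hi2), count1 + count2 + 2
```

For lists of length $n = 2, 4, 8, \ldots$ the returned count satisfies the recurrence $f(n) = 2f(n/2) + 2$ with $f(2) = 1$, and so agrees with $(3/2)n - 2$.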

All of our results have required that $n = b^k$, for some $k \in \mathbf{N}$, so it is only natural to ask
                               whether we can do anything in the case where n is allowed to be an arbitrary positive integer.
                               To find out, we introduce the following idea.

Definition 10.2   A function $f\colon \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ is called monotone increasing if for all $m, n \in \mathbf{Z}^+$, $m < n \implies f(m) \le f(n)$.

This permits us to consider results for all $n \in \mathbf{Z}^+$, under certain circumstances.

THEOREM 10.2   Let $f\colon \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ be monotone increasing, and let $g\colon \mathbf{Z}^+ \to \mathbf{R}$. For $b \in \mathbf{Z}^+$, $b \ge 2$, suppose that $f \in O(g)$ for all $n \in S = \{b^k \mid k \in \mathbf{N}\}$. Under these conditions,

   a) If $g \in O(\log n)$, then $f \in O(\log n)$.
   b) If $g \in O(n \log n)$, then $f \in O(n \log n)$.
   c) If $g \in O(n^r)$, then $f \in O(n^r)$, for $r \in \mathbf{R}^+ \cup \{0\}$.

Proof: We shall prove part (a) and leave parts (b) and (c) for the Section Exercises. Before
                starting, we should note that the base for the logarithms in parts (a) and (b) is any positive
                real number greater than 1.

Since $f \in O(g)$ on $S$, and $g \in O(\log n)$, we at least have $f \in O(\log n)$ on $S$. Therefore, by Definition 10.1, there exist constants $m \in \mathbf{R}^+$ and $s \in \mathbf{Z}^+$ such that $f(n) = |f(n)| \le m|\log n| = m \log n$ for all $n \in S$, $n \ge s$. We need to find a constant $M \in \mathbf{R}^+$ such that $f(n) \le M \log n$ for all $n \ge s$, not just those $n \in S$.
   First let us agree to choose $s$ large enough so that $\log s \ge 1$. Now let $n \in \mathbf{Z}^+$, where $n \ge s$ but $n \notin S$. Then there exists $k \in \mathbf{Z}^+$ such that $s \le b^k < n < b^{k+1}$. Since $f$ is monotone increasing and positive, and since $\log(b^k) \ge \log s \ge 1$,
\[
\begin{aligned}
f(n) \le f(b^{k+1}) &\le m \log(b^{k+1}) = m[\log(b^k) + \log b] \\
&= m \log(b^k) + m \log b \\
&\le m \log(b^k) + m \log b \log(b^k) \\
&= m(1 + \log b) \log(b^k) \\
&< m(1 + \log b) \log n.
\end{aligned}
\]
So with $M = m(1 + \log b)$ we find that for all $n \in \mathbf{Z}^+ - S$, if $n \ge s$ then $f(n) < M \log n$. Hence $f(n) \le M \log n$ for all $n \in \mathbf{Z}^+$, where $n \ge s$, and $f \in O(\log n)$.

We shall now use the result of Theorem 10.2 in determining the time-complexity function
                f (n) for a searching algorithm known as binary search.
   In Example 5.70 we analyzed an algorithm wherein an array $a_1, a_2, a_3, \ldots, a_n$ of integers was searched for the presence of a particular integer called key. At that time the array entries were not given in any particular order, so we simply compared the value of key with those of the array elements $a_1, a_2, a_3, \ldots, a_n$. This would not be very efficient, however, if we knew that $a_1 < a_2 < a_3 < \cdots < a_n$. (After all, one does not search a telephone book
                for the telephone number of a particular person by starting at page 1 and examining every
                name in succession. The alphabetical ordering of the last names is used to speed up the
                searching process.) Let us look at a particular example.

EXAMPLE 10.49   Consider the array $a_1, a_2, a_3, \ldots, a_7$ of integers, where $a_1 = 2$, $a_2 = 4$, $a_3 = 5$, $a_4 = 7$, $a_5 = 10$, $a_6 = 17$, and $a_7 = 20$, and let key = 9. We search this array as follows:

   1) Compare key with the entry at the center of the array; here it is $a_4 = 7$. Since key > $a_4$, we now concentrate on the remaining elements in the subarray $a_5, a_6, a_7$.
   2) Now compare key with the center element $a_6$. Since key = 9 < 17 = $a_6$, we now turn to the subarray (of $a_5, a_6, a_7$) that consists of those elements smaller than $a_6$. Here this is only the element $a_5$.
   3) Comparing key with $a_5$, we find that key ≠ $a_5$, so key is not present in the given array $a_1, a_2, a_3, \ldots, a_7$.

From the results of Example 10.49, we make the following observations for a general (ordered) array of integers (or real numbers). Let $a_1, a_2, a_3, \ldots, a_n$ denote the given array, and let key denote the integer (or real number) for which we are searching. Unlike our array in Example 5.70, here
\[ a_1 < a_2 < a_3 < \cdots < a_n. \]

   1) First we compare the value of key with the array entry at or near the center. This entry is $a_{(n+1)/2}$ for $n$ odd or $a_{n/2}$ for $n$ even.
      Whether $n$ is even or odd, the array element subscripted by $c = \lfloor (n + 1)/2 \rfloor$ is the center, or near center, element. Note that at this point 1 is the value of the smallest subscript for the array subscripts, whereas $n$ is the value of the largest subscript.
   2) If key is $a_c$, we are finished. If not, then
      a) If key exceeds $a_c$, we search (with this dividing process) the subarray $a_{c+1}, a_{c+2}, \ldots, a_n$.
      b) If key is smaller than $a_c$, then the dividing process is applied in searching the subarray $a_1, a_2, \ldots, a_{c-1}$.

The preceding observations have been used in developing the pseudocode procedure in Fig. 10.26. Here the input is an ordered array $a_1, a_2, a_3, \ldots, a_n$ of integers, or real numbers, in ascending order, the positive integer $n$ (for the number of entries in the given array), and the value of the integer variable key. If the array elements are integers (real numbers), then key should be an integer (real number). The variables $s$ and $l$ are integer variables used for storing the smallest and largest subscripts for the subscripts of the array or subarray being searched. The integer variable $c$ stores the index for the array (subarray) element at, or near, the center of the array (subarray). In general, $c = \lfloor (s + l)/2 \rfloor$. The integer variable location stores the subscript of the array entry where key is located; the value of location is 0 when key is not present in the given array.

procedure BinarySearch(n: positive integer; key, a_1, a_2, a_3, ..., a_n: integers)
begin
   s := 1       {s is the smallest subscript of the subarray being searched}
   l := n       {l is the largest subscript of the subarray being searched}
   location := 0
   while s ≤ l do
      begin
         c := ⌊(s + l)/2⌋
         if key = a_c then
            begin
               location := c
               s := l + 1
            end
         else if key < a_c then
            l := c - 1
         else s := c + 1
      end
end

Figure 10.26
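The procedure of Fig. 10.26 translates almost line for line into Python. The sketch below is one possible rendering, not the book's own code: the names `low`, `high`, and `center` stand in for the variables s, l, and c, and the list is 0-indexed internally while the returned location keeps the 1-based convention of the figure (0 meaning "not found").

```python
def binary_search(a, key):
    """Return the 1-based subscript of key in the ascending list a, or 0 if absent."""
    low, high = 1, len(a)          # smallest and largest subscripts still in play
    location = 0
    while low <= high:
        center = (low + high) // 2  # floor((s + l)/2), as in the figure
        entry = a[center - 1]       # shift to Python's 0-based indexing
        if key == entry:
            location = center
            low = high + 1          # forces the loop to stop
        elif key < entry:
            high = center - 1       # continue in the left subarray
        else:
            low = center + 1        # continue in the right subarray
    return location
```

Applied to the array of Example 10.49, `binary_search([2, 4, 5, 7, 10, 17, 20], 9)` examines the entries 7, 17, and 10 in turn and returns 0, matching the three steps traced there.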

We want to measure the (worst-case) time complexity for the algorithm implemented in Fig. 10.26. Here $f(n)$ will count the maximum number of comparisons (between key and $a_c$) needed to determine whether the given number key appears in the ordered array $a_1, a_2, a_3, \ldots, a_n$.

   • For $n = 1$, key is compared to $a_1$ and $f(1) = 1$.
   • When $n = 2$, in the worst case key is compared to $a_1$ and then to $a_2$, so $f(2) = 2$.
   • In the case of $n = 3$, $f(3) = 2$ (in the worst case).
   • When $n = 4$, the worst case occurs when key is first compared to $a_2$ and then a binary search of $a_3, a_4$ follows. Searching $a_3, a_4$ requires (in the worst case) $f(2)$ comparisons. So $f(4) = 1 + f(2) = 3$.

   At this point we see that $f(1) \le f(2) \le f(3) \le f(4)$, and we conjecture that $f$ is a monotone increasing function. To verify this, we shall use the Principle of Mathematical Induction in its alternative form. Here we assume that for all $i, j \in \{1, 2, 3, \ldots, n\}$, $i < j \implies f(i) \le f(j)$. Now consider the integer $n + 1$. We have two cases to examine.

   1) $n + 1$ is odd: Here we write $n = 2k$ and $n + 1 = 2k + 1$, for some $k \in \mathbf{Z}^+$. In the worst case, $f(n + 1) = f(2k + 1) = 1 + f(k)$, where 1 counts the comparison of key with $a_{k+1}$, and $f(k)$ counts the (maximum) number of comparisons needed in a binary search of the subarray $a_1, a_2, \ldots, a_k$ or the subarray $a_{k+2}, a_{k+3}, \ldots, a_{2k+1}$.
      Now $f(n) = f(2k) = 1 + \max\{f(k - 1), f(k)\}$. Since $k - 1, k < n$, by the induction hypothesis we have $f(k - 1) \le f(k)$, so $f(n) = 1 + f(k) = f(n + 1)$.
   2) $n + 1$ is even: At this time we have $n + 1 = 2r$, for some $r \in \mathbf{Z}^+$, and in the worst case, $f(n + 1) = 1 + \max\{f(r - 1), f(r)\} = 1 + f(r)$, by the induction hypothesis. Therefore,
      \[ f(n) = f(2r - 1) = 1 + f(r - 1) \le 1 + f(r) = f(n + 1). \]
   Consequently, the function $f$ is monotone increasing.
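The worst-case counts used in this argument can be tabulated directly. The following sketch assumes, as in the discussion above, that one comparison with the center entry $a_c$, $c = \lfloor (n+1)/2 \rfloor$, leaves a subarray of size $c - 1$ on the left or $n - c$ on the right; the name `worst_case` and the convention `worst_case(0) = 0` for an empty subarray are additions made for the illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def worst_case(n):
    """Maximum number of comparisons binary search makes on an array of n entries."""
    if n == 0:
        return 0                    # empty subarray: nothing left to compare
    center = (n + 1) // 2
    left_size = center - 1          # entries a_1, ..., a_{c-1}
    right_size = n - center         # entries a_{c+1}, ..., a_n
    return 1 + max(worst_case(left_size), worst_case(right_size))
```

The first four values are 1, 2, 2, 3, as listed above, and over any range one cares to test the values never decrease. For $n = 2^k$ the function returns $k + 1$.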

Now it is time to determine the worst-case time complexity for the binary search algo-
rithm, using the function f(n). Since

\[ f(1) = 1, \quad \text{and} \]
\[ f(n) = f(n/2) + 1, \qquad \text{for } n = 2^k,\ k \ge 1, \]
it follows from Theorem 10.1 (with $a = 1$, $b = 2$, and $c = 1$) that
\[ f(n) = \log_2 n + 1, \quad \text{and} \quad f \in O(\log_2 n) \quad \text{for } n \in \{1, 2, 4, 8, \ldots\}. \]

But with $f$ monotone increasing, from Theorem 10.2 it now follows that $f \in O(\log_2 n)$ (for all $n \in \mathbf{Z}^+$). Consequently, binary search is an $O(\log_2 n)$ algorithm, whereas the searching algorithm of Example 5.70 is $O(n)$. Therefore, as the value of $n$ increases, binary search is the more efficient algorithm, but then it requires the additional condition that the array be ordered.

This section has introduced some of the basic ideas in the study of divide-and-conquer
algorithms. It also extends the material first introduced on computational complexity and
the analysis of algorithms in Sections 5.7 and 5.8.
   The Section Exercises include some extensions of the results developed in this section.
The reader who wants to pursue this topic further should find the chapter references both
helpful and interesting.

EXERCISES 10.6

1. In each of the following, $f: \mathbf{Z}^+ \to \mathbf{R}$. Solve for $f(n)$ relative to the given set $S$, and determine the appropriate "big-Oh" form for $f$ on $S$.
      a) $f(1) = 5$
         $f(n) = 4f(n/3) + 5$, $n = 3, 9, 27, \ldots$
         $S = \{3^i \mid i \in \mathbf{N}\}$
      b) $f(1) = 7$
         $f(n) = f(n/5) + 7$, $n = 5, 25, 125, \ldots$
         $S = \{5^i \mid i \in \mathbf{N}\}$

2. Let $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, and let $d \in \mathbf{N}$. Prove that the solution for the recurrence relation
         $f(1) = d$
         $f(n) = af(n/b) + c$, $n = b^k$, $k \ge 1$
satisfies
      a) $f(n) = d + c \log_b n$, for $n = b^k$, $k \in \mathbf{N}$, when $a = 1$.
      b) $f(n) = d n^{\log_b a} + (c/(a - 1))[n^{\log_b a} - 1]$, for $n = b^k$, $k \in \mathbf{N}$, when $a \ge 2$.

3. Determine the appropriate "big-Oh" forms for $f$ on $\{b^k \mid k \in \mathbf{N}\}$ in parts (a) and (b) of Exercise 2.

4. In each of the following, $f: \mathbf{Z}^+ \to \mathbf{R}$. Solve for $f(n)$ relative to the given set $S$, and determine the appropriate "big-Oh" form for $f$ on $S$.
      a) $f(1) = 0$
         $f(n) = 2f(n/5) + 3$, $n = 5, 25, 125, \ldots$
         $S = \{5^i \mid i \in \mathbf{N}\}$
      b) $f(1) = 1$
         $f(n) = f(n/2) + 2$, $n = 2, 4, 8, \ldots$
         $S = \{2^i \mid i \in \mathbf{N}\}$

5. Consider a tennis tournament for $n$ players, where $n = 2^k$, $k \in \mathbf{Z}^+$. In the first round $n/2$ matches are played, and the $n/2$ winners advance to round 2, where $n/4$ matches are played. This halving process continues until a winner is determined.
      a) For $n = 2^k$, $k \in \mathbf{Z}^+$, let $f(n)$ count the total number of matches played in the tournament. Find and solve a recurrence relation for $f(n)$ of the form
         $f(1) = d$
         $f(n) = af(n/2) + c$, $n = 2, 4, 8, \ldots$,
      where $a$, $c$, and $d$ are constants.
      b) Show that your answer in part (a) also solves the recurrence relation
         $f(1) = d$
         $f(n) = f(n/2) + (n/2)$, $n = 2, 4, 8, \ldots$.

6. Complete the proofs for Corollary 10.1 and parts (b) and (c) of Theorem 10.2.

7. What is the best-case time-complexity function for binary search?

8. a) Modify the procedure in Example 10.48 as follows: For any $S \subset \mathbf{R}$, where $|S| = n$, partition $S$ as $S_1 \cup S_2$, where $|S_1| = |S_2|$, for $n$ even, and $|S_1| = 1 + |S_2|$, for $n$ odd. Show that if $f(n)$ counts the number of comparisons needed (in this procedure) to find the maximum and minimum elements of $S$, then $f$ is a monotone increasing function.
      b) What is the appropriate "big-Oh" form for the function $f$ of part (a)?

9. In Corollary 10.2 we were concerned with finding the appropriate "big-Oh" form for a function $f: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ where
         $f(1) \le c$, for $c \in \mathbf{Z}^+$
         $f(n) \le af(n/b) + c$, for $a, b \in \mathbf{Z}^+$ with $b \ge 2$, and $n = b^k$, $k \in \mathbf{Z}^+$.
Here the constant $c$ in the second inequality is interpreted as the amount of time needed to break down the given problem of size $n$ into $a$ smaller (similar) problems of size $n/b$ and to combine the $a$ solutions of these smaller problems in order to get a solution for the original problem of size $n$. Now we shall examine a situation wherein this amount of time is no longer constant but depends on $n$.
      a) Let $a, b, c \in \mathbf{Z}^+$, with $b \ge 2$. Let $f: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ be a monotone increasing function, where
         $f(1) \le c$
         $f(n) \le af(n/b) + cn$, for $n = b^k$, $k \in \mathbf{Z}^+$.
      Use an argument similar to the one given (for equalities) in Theorem 10.1 to show that for all $n = 1, b, b^2, b^3, \ldots$,
         $f(n) \le cn \sum_{i=0}^{k} (a/b)^i$.
      b) Use the result of part (a) to show that $f \in O(n \log n)$, where $a = b$. (The base for the log function here is any real number greater than 1.)
      c) When $a \ne b$, show that part (a) implies that
         $f(n) \le \left(\frac{c}{a - b}\right)(a^{k+1} - b^{k+1})$.
      d) From part (c), prove that (i) $f \in O(n)$, when $a < b$; and (ii) $f \in O(n^{\log_b a})$, when $a > b$. [Note: The "big-Oh" form for $f$ here and in part (b) is for $f$ on $\mathbf{Z}^+$, not just $\{b^k \mid k \in \mathbf{N}\}$.]

10. In this exercise we briefly introduce the Master Theorem. (For more on this result, including a proof, we refer the reader to pp. 73-84 of reference [5] by T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein.)
    Consider the recurrence relation
         $f(1) = 1$,
         $f(n) = af(n/b) + h(n)$,
where $n \in \mathbf{Z}^+$, $n > 1$, $a \in \mathbf{Z}^+$, $a < n$, and $b \in \mathbf{R}^+$, $1 < b < n$. The function $h$ accounts for the time (or cost) of dividing the given problem of size $n$ into $a$ smaller (similar) problems of size approximately $n/b$ and then combining the results from

the $a$ smaller problems. Further, there exists $k \in \mathbf{Z}^+$ such that $h(n) > 0$ for all $n \ge k$. (Since $n/b$ need not be an integer, the recurrence relation is not properly defined. To get around this we need to replace $n/b$ by either $\lfloor n/b \rfloor$ or $\lceil n/b \rceil$. But as this does not affect the outcome of the result, for large values of $n$, we shall not concern ourselves with such details.)

    Under the above hypothesis we find the following [where $\Theta$ (big theta) and $\Omega$ (big omega) are as given in Exercises 11-16 for Section 5.7]:

      i)   If $h \in O(n^{\log_b a - \epsilon})$, for some fixed $\epsilon > 0$, then $f \in \Theta(n^{\log_b a})$;
     ii)   If $h \in \Theta(n^{\log_b a})$, then $f \in \Theta(n^{\log_b a} \log_2 n)$; and
    iii)   If $h \in \Omega(n^{\log_b a + \epsilon})$ for some fixed $\epsilon > 0$, and if $ah(n/b) \le c\,h(n)$, for some fixed $c$, where $0 < c < 1$, and for all sufficiently large $n$, then $f \in \Theta(h)$.

    In all three cases, the function $h$ is compared with $n^{\log_b a}$ and, roughly speaking, the Master Theorem then determines the complexity of the solution $f(n)$ as the larger of the two functions in cases (i) and (iii), while in case (ii) we find the added factor $\log_2 n$. However, it is important to realize that there are some recurrence relations of this type that do not fall under any of these three cases.
    For now we consider the following, where $f(1) = 1$ for all three examples.

    1) $f(n) = 16f(n/4) + n$
    Here $a = 16$, $b = 4$, $n^{\log_b a} = n^{\log_4 16} = n^2$, and $h(n) = n$. So $h \in O(n^{\log_4 16 - \epsilon})$ with $\epsilon = 1$. Consequently, $h$ falls under the hypothesis for case (i) and it follows that $f \in \Theta(n^2)$.

    2) $f(n) = f(3n/4) + 5$
    Now we have $a = 1$, $b = 4/3$, $n^{\log_b a} = n^{\log_{4/3} 1} = n^0 = 1$, and $h(n) = 5$. Consequently, $h \in \Theta(n^{\log_{4/3} 1})$ and from case (ii) we learn that $f \in \Theta(n^{\log_{4/3} 1} \log_2 n) = \Theta(\log_2 n)$.

    3) $f(n) = 7f(n/8) + n \log_2 n$
    For this recurrence relation we have $a = 7$, $b = 8$, $n^{\log_b a} = n^{\log_8 7} \approx n^{0.936}$, and $h(n) = n \log_2 n$. So $h \in \Omega(n^{\log_8 7 + \epsilon})$ where $\epsilon = 0.064 > 0$. Further, for all sufficiently large $n$, $ah(n/b) = 7(n/8)\log_2(n/8) = (7/8)n[\log_2 n - \log_2 8] \le (7/8)n \log_2 n = c\,h(n)$, for $0 < c = 7/8 < 1$. Thus, $h$ satisfies the hypotheses for case (iii) and we have $f \in \Theta(n \log_2 n)$.

    Use the Master Theorem to determine the complexity of $f$ in each of the following, where $f(1) = 1$:

      a) $f(n) = 9f(n/3) + n$
      b) $f(n) = 2f(n/2) + 1$
      c) $f(n) = f(n/3) + 1$
      d) $f(n) = 2f(n/3) + n$
      e) $f(n) = 4f(n/2) + n^2$
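
The case analysis used in worked examples 1) and 2) of Exercise 10 can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that is not part of the exercise: the driving function is taken to be $h(n) \in \Theta(n^p)$ for a known real exponent $p$, so Example 3 (with $h(n) = n \log_2 n$) is outside its scope. The helper names `master_case` and `f_example1` are hypothetical.

```python
import math


def master_case(a, b, p):
    """Classify f(n) = a*f(n/b) + h(n), assuming h(n) is Theta(n**p).

    Returns (case, exponent, log_factor): f is Theta(n**exponent),
    multiplied by log n when log_factor is True.
    """
    critical = math.log(a, b)  # the exponent log_b(a)
    if math.isclose(p, critical, abs_tol=1e-9):
        return 2, critical, True   # case (ii): extra logarithmic factor
    if p < critical:
        return 1, critical, False  # case (i), with epsilon = critical - p
    # Case (iii): for h(n) = n**p the regularity condition holds, because
    # a*(n/b)**p = (a / b**p) * n**p and a / b**p < 1 when p > log_b(a).
    return 3, p, False


def f_example1(k):
    """Exact value of f(4**k) for f(1) = 1, f(n) = 16*f(n/4) + n."""
    value = 1
    for i in range(1, k + 1):
        value = 16 * value + 4 ** i
    return value


print(master_case(16, 4, 1))      # Example 1: a = 16, b = 4, h(n) = n
print(master_case(1, 4 / 3, 0))   # Example 2: a = 1, b = 4/3, h(n) = 5
print(f_example1(10) / (4 ** 10) ** 2)  # ratio f(n)/n**2 stays bounded
```

The last line gives a numerical sanity check for Example 1: if $f \in \Theta(n^2)$, the ratio $f(n)/n^2$ should settle near a positive constant as $n = 4^k$ grows.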

10.7 Summary and Historical Review
                                In this chapter the recurrence relation has emerged as another tool for solving combinatorial
                                problems. In these problems we analyze a given situation and then express the result $a_n$ in
                                terms of the results for certain smaller nonnegative integers. Once the recurrence relation
                                is determined, we can solve for any value of $a_n$ (within reason). When we have access
                                to a computer, such relations are particularly valuable, especially if they cannot be solved
                                explicitly.
                                    The study of recurrence relations can be traced back to the Fibonacci relation $F_{n+2} =
                                F_{n+1} + F_n$, $n \ge 0$, $F_0 = 0$, $F_1 = 1$, which was given by Leonardo of Pisa (c. 1175-1250) in
                                1202. In his Liber Abaci, he deals with a problem concerning the number of pairs of rabbits
                                that result in one year if one starts with a single pair that produces another pair at the end
                                of each month. Each new pair starts to breed similarly one month after its birth, and we
                                assume that no rabbits die during the given year. Hence, at the end of the first month there
                                are two pairs of rabbits; three pairs after two months; five pairs after three months; and so
                                on. [As mentioned in the summary of Chapter 9, Abraham DeMoivre (1667-1754) obtained
                                this result by the method of generating functions in 1718.] This same sequence appears in
                                the work of the German mathematician Johannes Kepler (1571-1630), who used it in his
                                studies on how the leaves of a plant or flower are arranged about its stem. In 1844 the
                                French mathematician Gabriel Lamé (1795-1870) used the sequence in his analysis of the
                                efficiency of the Euclidean algorithm. Later, François Édouard Anatole Lucas (1842-1891),
                                who popularized the Towers of Hanoi puzzle, derived many properties of this sequence and
                                was the first to call these numbers the Fibonacci sequence.
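
The rabbit counts quoted above (two pairs after one month, three after two, five after three) can be reproduced by iterating the Fibonacci relation directly. The following is a minimal sketch; the function name `rabbit_pairs` is a hypothetical helper, and it assumes the model described in the text (one starting pair, no deaths).

```python
def rabbit_pairs(months):
    """Number of rabbit pairs at the end of each month, starting from one pair."""
    previous, current = 1, 1  # F_1 and F_2; `current` is the pair count at the start
    counts = []
    for _ in range(months):
        # Fibonacci relation: next term is the sum of the two preceding terms
        previous, current = current, previous + current
        counts.append(current)
    return counts


# Pair counts at the end of months 1 through 12
print(rabbit_pairs(12))
```

Under these assumptions the count at the end of month $m$ is $F_{m+2}$, so the one-year answer is the last entry of the printed list.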

Leonardo Fibonacci (c. 1175-1250)
                                                 Reproduced courtesy of The Granger Collection, New York

For an elementary coverage of examples and properties for the Fibonacci numbers one
                       should examine the book by T. H. Garland [10]. Even more can be learned from the texts by
                       V. E. Hoggatt, Jr. [14] and S. Vajda [29]. The UMAP article by R.V. Jean [16] gives many
                       applications of this sequence. Chapter 8 of the mathematical exposition by R. Honsberger
                       [15] provides an interesting account of the Fibonacci numbers and of the related sequence
                       called the Lucas numbers. The text by R. L. Graham, D. E. Knuth, and O. Patashnik [12]
                       also includes many interesting examples and properties of both the Fibonacci numbers
                       and the Catalan numbers. More counterexamples for the Fibonacci and Catalan numbers,
                       like those found in Examples 10.17 and 10.44, respectively, can be found in the article by
                       R. K. Guy [13]. Additional material on the role of the golden ratio in such areas as geometry,
                       probability, and fractals is given in the book by H. Walser [30]. The book by T. Koshy [19]
                       provides a definitive history and extensive analysis of the Fibonacci and Lucas numbers,
                       together with a wide variety of applications, examples, and exercises.
                           Comparable coverage of the material presented in this chapter can be found in Chapter 3
                       of C. L. Liu [21]. For more on the theoretical development of linear recurrence relations
                       with constant coefficients, examine Chapter 9 of N. Finizio and G. Ladas [8].
                          Applications in probability theory dealing with recurrent events, random walks, and ruin
                       problems can be found in Chapters XIII and XIV of the classic text by W. Feller [7]. The
                       UMAP module by D. R. Sherbert [24] introduces difference equations and includes an
                       application in economics known as the Cobweb Theorem. The text by S. Goldberg [11] has
                       more on applications in the social sciences.
                           Recursive techniques in the generation of permutations and combinations are developed
                       in Chapter 4 of R. A. Brualdi [3]. The algorithm presented in Section 10.1 for the
                       permutations of {1, 2, 3, ..., n} first appeared in the work of H. D. Steinhaus [27] and is often
                       referred to as the adjacent mark ordering algorithm. This result was rediscovered later,
                       independently by H. F. Trotter [28] and S. M. Johnson [17]. Efficient sorting methods for
                       permutations and other combinatorial structures are analyzed in the text by D. E. Knuth
                       [18]. The work of E. M. Reingold, J. Nievergelt, and N. Deo [23] also deals with such
                       algorithms.
                           For those who enjoyed the rooted ordered binary trees in Section 10.5, Chapter 3 of
                       A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] should prove interesting. The basis for the

example on stacks is given on page 86 of the text by S. Even [6]. The article by M. Gardner [9]
     provides other examples where the Catalan numbers arise. Computational considerations in
     determining Catalan numbers are examined in the article by D. M. Campbell [4]. Much more
     about the Catalan numbers can be found in the text by R. P. Stanley [26]; in particular, 66
     situations where these numbers arise are provided on pp. 219-229.
         Finally, the coverage on divide-and-conquer algorithms in Section 10.6 is modeled after
     D. F. Stanat and D. F. McAllister’s presentation in Section 5.3 of [25]. Chapter 10 of the
     text by A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] provides some further information
     on this topic. An application of this method in a matrix multiplication algorithm appears in
     Chapter 10 of the text by C. L. Liu [20]. Additional coverage and a proof for the Master
     Theorem are given in Chapter 4 of the text by T. H. Cormen, C. E. Leiserson, R. L. Rivest,
     and C. Stein [5].

REFERENCES
          1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
             Reading, Mass.: Addison-Wesley, 1983.
         2. Auluck, F. C. “On Some New Types of Partitions Associated with Generalized Ferrers Graphs.”
             Proceedings of the Cambridge Philosophical Society 47 (1951): pp. 679-685.
         3. Brualdi, Richard A. Introductory Combinatorics, 3rd ed. Upper Saddle River, N.J.: Prentice-
             Hall, 1999.
          4. Campbell, Douglas M. “The Computation of Catalan Numbers.” Mathematics Magazine 57,
             no. 4 (September 1984): pp. 195-208.
          5. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction
             to Algorithms, 2nd ed. Boston, Mass.: McGraw-Hill, 2001.
          6. Even, Shimon. Graph Algorithms. Rockville, Md.: Computer Science Press, 1979.
          7. Feller, William. An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed.
             New York: Wiley, 1968.
          8. Finizio, N., and Ladas, G. An Introduction to Differential Equations. Belmont, Calif.:
             Wadsworth Publishing Company, 1982.
          9. Gardner, Martin. “Mathematical Games, Catalan Numbers: An Integer Sequence that Materi-
             alizes in Unexpected Places.” Scientific American 234, no. 6 (June 1976): pp. 120-125.
        10. Garland, Trudi Hammel. Fascinating Fibonaccis. Palo Alto, Calif.: Dale Seymour Publica-
            tions, 1987.
        11. Goldberg, Samuel. Introduction to Difference Equations. New York: Wiley, 1958.
        12. Graham, Ronald Lewis, Knuth, Donald Ervin, and Patashnik, Oren. Concrete Mathematics,
            2nd ed. Reading, Mass.: Addison-Wesley, 1994.
        13. Guy, Richard K. “The Second Strong Law of Small Numbers.” Mathematics Magazine 63,
             no. 1 (February 1990): pp. 3-20.
        14. Hoggatt, Verner E., Jr. Fibonacci and Lucas Numbers. Boston, Mass.: Houghton Mifflin, 1969.
        15. Honsberger, Ross. Mathematical Gems II (The Dolciani Mathematical Expositions, Number
             Nine). Washington, D.C.: The Mathematical Association of America, 1985.
        16. Jean, Roger V. “The Fibonacci Sequence.” The UMAP Journal 5, no. 1 (1984): pp. 23-47.
        17. Johnson, Selmer M. “Generation of Permutations by Adjacent Transposition.” Mathematics
             of Computation 17 (1963): pp. 282-285.
        18. Knuth, Donald E. The Art of Computer Programming/Volume 3 Sorting and Searching.
             Reading, Mass.: Addison-Wesley, 1973.
         19. Koshy, Thomas. Fibonacci and Lucas Numbers with Applications. New York: Wiley, 2001.
        20. Liu, C. L. Elements of Discrete Mathematics, 2nd ed. New York: McGraw-Hill, 1985.
        21. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
        22. Miksa, F. L., Moser, L., and Wyman, M. “Restricted Partitions of Finite Sets.” Canadian
             Mathematics Bulletin 1 (1958): pp. 87-96.

23. Reingold, E. M., Nievergelt, J., and Deo, N. Combinatorial Algorithms: Theory and Practice.
                                           Englewood Cliffs, N.J.: Prentice-Hall, 1977.
                                       24. Sherbert, Donald R. Difference Equations with Applications, UMAP Module 322. Cambridge,
                                           Mass.: Birkhäuser Boston, 1980.
                                       25. Stanat, Donald F., and McAllister, David F. Discrete Mathematics in Computer Science. En-
                                           glewood Cliffs, N.J.: Prentice-Hall, 1977.
                                       26. Stanley, Richard P. Enumerative Combinatorics, Vol. 2. New York: Cambridge University
                                           Press, 1999.
                                       27. Steinhaus, Hugo D. One Hundred Problems in Elementary Mathematics. New York: Basic
                                            Books, 1964.
                                       28. Trotter, H. F. “ACM Algorithm 115 — Permutations.” Communications of the ACM 5 (1962):
                                           pp. 434-435.
                                       29. Vajda, S. Fibonacci & Lucas Numbers, and the Golden Section. New York: Halsted Press (a
                                           division of John Wiley & Sons), 1989.
                                       30. Walser, Hans. The Golden Section. Washington, D.C.: The Mathematical Association of Amer-
                                            ica, 2001.

SUPPLEMENTARY EXERCISES

1. For $n \in \mathbf{Z}^+$ and $n \ge k + 1 \ge 1$, verify algebraically the recursion formula
         $\binom{n}{k+1} = \left(\frac{n-k}{k+1}\right)\binom{n}{k}$.

2. a) For $n \ge 0$, let $B_n$ denote the number of partitions of $\{1, 2, 3, \ldots, n\}$. Set $B_0 = 1$ for the partitions of $\emptyset$. Verify that for all $n \ge 0$,
         $B_{n+1} = \sum_{i=0}^{n} \binom{n}{n-i} B_i = \sum_{i=0}^{n} \binom{n}{i} B_i$.
      [The numbers $B_i$, $i \ge 0$, are referred to as the Bell numbers after Eric Temple Bell (1883-1960).]
      b) How are the Bell numbers related to the Stirling numbers of the second kind?

3. Let $n, k \in \mathbf{Z}^+$, and define $p(n, k)$ to be the number of partitions of $n$ into exactly $k$ (positive-integer) summands. Prove that $p(n, k) = p(n - 1, k - 1) + p(n - k, k)$.

4. For $n \ge 1$, let $a_n$ count the number of ways to write $n$ as an ordered sum of odd positive integers. (For example, $a_4 = 3$ since $4 = 3 + 1 = 1 + 3 = 1 + 1 + 1 + 1$.) Find and solve a recurrence relation for $a_n$.

5. Let $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$.
      a) Compute $A^2$, $A^3$, and $A^4$.
      b) Conjecture a general formula for $A^n$, $n \in \mathbf{Z}^+$, and establish your conjecture by the Principle of Mathematical Induction.

6. Let $M = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$.
      a) Compute $M^2$, $M^3$, and $M^4$.
      b) Conjecture a general formula for $M^n$, $n \in \mathbf{Z}^+$, and establish your conjecture by the Principle of Mathematical Induction.

7. Determine the points of intersection of the parabola $y = x^2 - 1$ and the hyperbola $y = 1 + \frac{1}{x}$.

8. Let $\alpha = (1 + \sqrt{5})/2$ and $\beta = (1 - \sqrt{5})/2$.
      a) Verify that $\alpha^2 = \alpha + 1$ and $\beta^2 = \beta + 1$.
      b) Prove that for all $n \ge 0$, $\sum_{k=0}^{n} \binom{n}{k} F_k = F_{2n}$.
      c) Show that $\alpha^3 = 1 + 2\alpha$ and $\beta^3 = 1 + 2\beta$.
      d) Prove that for all $n \ge 0$, $\sum_{k=0}^{n} \binom{n}{k} 2^k F_k = F_{3n}$.

9. a) For $\alpha = (1 + \sqrt{5})/2$, verify that $\alpha^2 + 1 = 2 + \alpha$ and $(2 + \alpha)^2 = 5\alpha^2$.
      b) Show that for $\beta = (1 - \sqrt{5})/2$, $\beta^2 + 1 = 2 + \beta$ and $(2 + \beta)^2 = 5\beta^2$.
      c) If $n, m \in \mathbf{N}$ prove that
         $\sum_{k=0}^{2n} \binom{2n}{k} F_{2k+m} = 5^n F_{2n+m}$.

10. Renu wants to sell her laptop for $4000. Narmada offers to buy it for $3000. Renu then splits the difference and asks for $3500. Narmada likewise splits the difference and makes a new offer of $3250. (a) If the women continue this process (of asking prices and counteroffers), what will Narmada be willing to pay on her 5th offer? 10th offer? $k$th offer, $k \ge 1$? (b) If the women continue this process (providing many, many new asking prices and counteroffers), what price will they approach? (c) Suppose that Narmada was willing to buy the laptop for $3200. What should she have offered to pay Renu the first time?

11. Parts (a) and (b) of Fig. 10.27 provide the Hasse diagrams for two partial orders referred to as the fences $\mathcal{F}_5$, $\mathcal{F}_6$ [on 5, 6 (distinct) elements, respectively]. If, for instance, $R$ denotes the
partial order for the fence $\mathcal{F}_5$, then $a_1 R a_2$, $a_3 R a_2$, $a_3 R a_4$, and $a_5 R a_4$. For each such fence $\mathcal{F}_n$, $n \ge 1$, we follow the convention that an element with an odd subscript is minimal and one with an even subscript is maximal. Let $(\{1, 2\}, \le)$ denote the partial order where $\le$ denotes the usual "less than or equal to" relation. As in Exercise 26 of Section 7.3, a function $f: \mathcal{F}_n \to \{1, 2\}$ is called order-preserving when for all $x, y \in \mathcal{F}_n$, $x R y \Rightarrow f(x) \le f(y)$. Let $c_n$ count the number of such order-preserving functions. Find and solve a recurrence relation for $c_n$.

[Figure 10.27: Hasse diagrams for (a) the fence $\mathcal{F}_5$ on $a_1, \ldots, a_5$ and (b) the fence $\mathcal{F}_6$ on $b_1, \ldots, b_6$.]

12. For $n \ge 0$, let $m = \lfloor (n + 1)/2 \rfloor$. Prove that $F_{n+2} = \sum_{k=0}^{m} \binom{n-k+1}{k}$. (You may want to look back at Examples 9.17 and 10.11.)

13. a) For $n \in \mathbf{Z}^+$, determine the number of ways one can tile a $1 \times n$ chessboard using $1 \times 1$ white (square) tiles and $1 \times 2$ blue (rectangular) tiles.
    b) How many of the tilings in part (a) use (i) no blue tiles; (ii) exactly one blue tile; (iii) exactly two blue tiles; (iv) exactly three blue tiles; and (v) exactly $k$ blue tiles, where $0 \le k \le \lfloor n/2 \rfloor$?
    c) How are the results in parts (a) and (b) related?

14. Let $c = \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}}$. How is $c^2$ related to $c$? What is the value of $c$?

15. For $n \in \mathbf{Z}^+$, $d_n$ denotes the number of derangements of $\{1, 2, 3, \ldots, n\}$, as discussed in Section 8.3.
    a) If $n > 2$, show that $d_n$ satisfies the recurrence relation
         $d_n = (n - 1)(d_{n-1} + d_{n-2})$, $d_2 = 1$, $d_1 = 0$.
    b) How can we define $d_0$ so that the result in part (a) is valid for $n \ge 2$?
    c) Rewrite the result in part (a) as
         $d_n - nd_{n-1} = -[d_{n-1} - (n - 1)d_{n-2}]$.
    How can $d_n - nd_{n-1}$ be expressed in terms of $d_{n-2}$, $d_{n-3}$?
    d) Show that $d_n - nd_{n-1} = (-1)^n$.
    e) Let $f(x) = \sum_{n=0}^{\infty} (d_n x^n)/n!$. After multiplying both sides of the equation in part (d) by $x^n/n!$ and summing for $n \ge 2$, verify that $f(x) = (e^{-x})/(1 - x)$. Hence
         $d_n = n!\left[1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^n}{n!}\right]$.

16. For $n \ge 0$, draw $n$ ovals in the plane so that each oval intersects each of the others in exactly two points and no three ovals are coincident. If $a_n$ denotes the number of regions in the plane that results from these $n$ ovals, find and solve a recurrence relation for $a_n$.

17. For $n \ge 0$, let us toss a coin $2n$ times.
    a) If $a_n$ is the number of sequences of $2n$ tosses where $n$ heads and $n$ tails occur, find $a_n$ in terms of $n$.
    b) Find constants $r$, $s$, and $t$ so that $(r + sx)^t = f(x) = \sum_{n=0}^{\infty} a_n x^n$.
    c) Let $b_n$ denote the number of sequences of $2n$ tosses where the numbers of heads and tails are equal for the first time only after all $2n$ tosses have been made. (For example, if $n = 3$, then HHHTTT and HHTHTT are counted in $b_3$, but HTHHTT and HHTTHT are not.)
        Define $b_0 = 0$ and show that for all $n \ge 1$,
         $a_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_{n-1} b_1 + a_n b_0$.
    d) Let $g(x) = \sum_{n=0}^{\infty} b_n x^n$. Show that $g(x) = 1 - 1/f(x)$, and then solve for $b_n$, $n \ge 1$.

18. For $\alpha = (1 + \sqrt{5})/2$ and $\beta = (1 - \sqrt{5})/2$, show that $\sum_{k=0}^{\infty} \beta^k = -\beta = \alpha - 1$ and that $\sum_{k=0}^{\infty} |\beta|^k = \alpha^2$.

19. Let $a$, $b$, $c$ be fixed real numbers with $ab = 1$ and let $f: \mathbf{R} \times \mathbf{R} \to \mathbf{R}$ be the binary operation, where $f(x, y) = a + bxy + c(x + y)$. Determine the value(s) of $c$ for which $f$ will be associative.

20. a) For $\alpha = (1 + \sqrt{5})/2$ and $\beta = (1 - \sqrt{5})/2$, verify that $\alpha^2 - \alpha^{-2} = \alpha - \beta = \beta^{-2} - \beta^2$.
    b) Prove that $F_{2n} = F_{n+1}^2 - F_{n-1}^2$, $n \ge 1$.
    c) For $n \ge 1$, let $T$ be an isosceles trapezoid with bases of length $F_{n-1}$ and $F_{n+1}$, and sides of length $F_n$. Prove that the area of $T$ is $(\sqrt{3}/4)F_{2n}$. [Note that, when $n = 1$, the trapezoid degenerates into a triangle. However, the formula is still correct.]

21. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If $A$, $B$ are events from $\mathcal{S}$ with $A \cup B = \mathcal{S}$, $A \cap B = \emptyset$, $Pr(A) = p$, and $Pr(B) = p^2$, determine $p$.

22. De’Jzaun and Sandra toss a loaded coin, where $Pr(H) = p > 0$. The first to obtain a head is the winner. Sandra goes first but, if she tosses a tail, then De’Jzaun gets two chances. If he tosses two tails, then Sandra again tosses the coin and, if her toss is a tail, then De’Jzaun again goes twice (if his first toss is a tail). This continues until someone tosses a head. What value of $p$ makes this a fair game (that is, a game where both Sandra and De’Jzaun have probability $\frac{1}{2}$ of winning)?

23. For $n \ge 1$, let $a_n$ count the number of binary strings of length $n$, where there is no run of 1’s of odd length. Consequently,
when n = 6, for instance, we want to include the strings 110000 (which has a run of two 1's and a run of four 0's) and 011110 (which has two runs of one 0 and one run of four 1's), but we do not include either 100011 (which starts with a run of one 1) or 110111 (which ends with a run of three 1's). Find and solve a recurrence relation for a_n.
24. Let a, b be fixed nonzero real numbers. Determine x_n if x_n = x_{n−1} x_{n−2}, n ≥ 2, x_0 = a, x_1 = b.
25. a) Evaluate F_{n+1}² − F_n F_{n+1} − F_n² for n = 0, 1, 2, 3.
    b) From the results in part (a), conjecture a formula for F_{n+1}² − F_n F_{n+1} − F_n² for n ∈ N.
    c) Establish the conjecture in part (b) using the Principle of Mathematical Induction.
26. Let n ∈ Z⁺. On a 1 × n chessboard two kings are called nontaking, if they do not occupy adjacent squares. In how many ways can one place 0 or more nontaking kings on a 1 × n chessboard?
27. a) For 1 ≤ i ≤ 6, determine the rook polynomial r(C_i, x) for the chessboard C_i shown in Fig. 10.28.
    b) For each rook polynomial in part (a), find the sum of the coefficients of the powers of x; that is, determine r(C_i, 1) for 1 ≤ i ≤ 6.
28. (Gambler's Ruin) When Cathy and Jill play checkers, each has probability 1/2 of winning. There is never a tie, and the games are independent in the sense that no matter how many games the girls have played, each girl still has probability 1/2 of winning the next game. After each game the loser gives the winner a quarter. If Cathy has $2.00 to play with and Jill has $2.50 and they play until one of them is broke, what is the probability that Cathy gets wiped out?
29. For n, m ∈ Z⁺, let f(n, m) count the number of partitions of n where the summands form a nonincreasing sequence of positive integers and no summand exceeds m. With n = 4 and m = 2, for example, we find that f(4, 2) = 3 because here we are concerned with the three partitions
           4 = 2 + 2,   4 = 2 + 1 + 1,   4 = 1 + 1 + 1 + 1.
    a) Verify that for all n, m ∈ Z⁺,
           f(n, m) = f(n − m, m) + f(n, m − 1).
    b) Write a computer program (or develop an algorithm) to compute f(n, m) for n, m ∈ Z⁺.
    c) Write a computer program (or develop an algorithm) to compute p(n), the number of partitions of a given positive integer n.
30. Let A, B be sets with |A| = m ≥ n = |B|, and let a(m, n) count the number of onto functions from A to B. Show that a(m, 1) = 1 and
           a(m, n) = n^m − Σ_{i=1}^{n−1} C(n, i) a(m, i),   when m ≥ n > 1.
31. When one examines the units digit of each Fibonacci number F_n, n ≥ 0, one finds that these digits form a sequence that repeats after 60 terms. [This was first proved by Joseph-Louis Lagrange (1736-1813).] Write a computer program (or develop an algorithm) to calculate this sequence of 60 digits.

Figure 10.28 (the chessboards C_1 through C_6)
PART 3
GRAPH THEORY AND APPLICATIONS

11
An Introduction to Graph Theory

             With this chapter we start to develop another major topic of this text. Unlike other areas
             in mathematics, the theory of graphs has a definite starting place, a paper published
             in 1736 by the Swiss mathematician Leonhard Euler (1707-1783). The main idea behind
             this work grew out of a now-popular problem known as the seven bridges of Königsberg.
             We shall examine the solution of this problem, from which Euler developed some of the
             fundamental concepts for the theory of graphs.
                 Unlike the continuous graphs of early algebra courses, the graphs we examine here are
             finite in structure and can be used to analyze relationships and applications in many different
             settings. We have seen some examples of applications of graph theory in earlier
             chapters (3, 5-8, and 10). However, the development here is independent of these prior
             discussions.

11.1
Definitions and Examples
             When we use a road map, we are often concerned with seeing how to get from one town
             to another by means of the roads indicated on the map. Consequently, we are dealing with
             two distinct sets of objects: towns and roads. As we have seen many times before, such sets
             of objects can be used to define a relation. If V denotes the set of towns and E the set of
             roads, we can define a relation ℛ on V by a ℛ b if we can travel from a to b using only the
             roads in E. If the roads in E that take us from a to b are all two-way roads, then we also
             have b ℛ a. Should all the roads under consideration be two-way, we have a symmetric
             relation.
                 One way to represent a relation is by listing the ordered pairs that are its elements.
             Here, however, it is more convenient to use a picture, as shown in Fig. 11.1. This figure
             demonstrates the possible ways of traveling among six towns using the eight roads indicated.
             It shows that there is at least one set of roads connecting any two towns (identical or distinct).
             This pictorial representation is a lot easier to work with than the 36 ordered pairs of the
             relation ℛ.
                 At the same time, Fig. 11.1 would be appropriate for representing six communication
             centers, with the eight “roads” interpreted as communication links. If each link provides
             two-way communication, we should be quite concerned about the vulnerability of center a
             to such hazards as equipment breakdown or enemy attack. Without center a, neither b nor
             c can communicate with any of d, e, or f.
                  From these observations we consider the following concepts.


Figure 11.1                                                               Figure 11.2

Definition 11.1           Let V be a finite nonempty set, and let E ⊆ V × V. The pair (V, E) is then called a directed
                                graph (on V), or digraph* (on V), where V is the set of vertices, or nodes, and E is its set
                                of (directed) edges or arcs. We write G = (V, E) to denote such a graph.
                                    When there is no concern about the direction of any edge, we still write G = (V, E). But
                                now E is a set of unordered pairs of elements taken from V, and G is called an undirected
                                graph.
                                   Whether G = (V, E) is directed or undirected, we often call V the vertex set of G and
                                E the edge set of G.

Figure 11.2 provides an example of a directed graph on V = {a, b, c, d, e} with E =
                                {(a, a), (a, b), (a, d), (b, c)}. The direction of an edge is indicated by placing a directed
                                arrow on the edge, as shown here. For any edge, such as (b, c), we say that the edge is
                                incident with the vertices b, c; b is said to be adjacent to c, whereas c is adjacent from b.
                                In addition, vertex b is called the origin, or source, of the edge (b, c), and vertex c is the
                                terminus, or terminating vertex. The edge (a, a) is an example of a loop, and the vertex e
                                that has no incident edges is called an isolated vertex.
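
The terms just introduced can be checked mechanically on the graph of Fig. 11.2. The following is a minimal sketch in Python (not part of the text's own development); the function names `loops`, `isolated_vertices`, `adjacent_to`, and `adjacent_from` are illustrative choices, and the only data used are the sets V and E stated above.

```python
# Directed graph of Fig. 11.2, exactly as listed in the text.
V = {"a", "b", "c", "d", "e"}
E = {("a", "a"), ("a", "b"), ("a", "d"), ("b", "c")}

def loops(edges):
    """Edges of the form (x, x)."""
    return {(x, y) for (x, y) in edges if x == y}

def isolated_vertices(vertices, edges):
    """Vertices that are incident with no edge at all."""
    touched = {v for edge in edges for v in edge}
    return vertices - touched

def adjacent_to(x, edges):
    """All y with (x, y) an edge, that is, x is adjacent to y."""
    return {y for (u, y) in edges if u == x}

def adjacent_from(y, edges):
    """All x with (x, y) an edge, that is, y is adjacent from x."""
    return {x for (x, v) in edges if v == y}

# E must be a subset of V x V for (V, E) to be a directed graph on V.
assert E <= {(x, y) for x in V for y in V}
```

Here `loops(E)` picks out the loop (a, a), and `isolated_vertices(V, E)` returns the single vertex e, in agreement with the discussion above.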
                                    An undirected graph is shown in Fig. 11.3(a). This graph is a more compact way of
                                describing the directed graph given in Fig. 11.3(b). In an undirected graph, there are
                                undirected edges such as {a, b}, {b, c}, {a, c}, {c, d} in Fig. 11.3(a). An edge such as {a, b} stands
                                for {(a, b), (b, a)}. Although (a, b) = (b, a) only when a = b, we do have {a, b} = {b, a}
                                for any a, b. We can write {a, a} to denote a loop in an undirected graph, but {a, a} is
                                considered the same as (a, a).

                                Figure 11.3: (a) an undirected graph; (b) the corresponding directed graph

* Since the terminology of graph theory is not standard, the reader may find some differences between terms
                                used here and in other texts.
                       In general, if a graph G is not specified as directed or undirected, it is assumed to be
                   undirected. When it contains no loops it is called loop-free.

In the next two definitions we shall not concern ourselves with any loops that may be
                   present in the undirected graph G.

Definition 11.2    Let x, y be (not necessarily distinct) vertices in an undirected graph G = (V, E). An x-y
                   walk in G is a (loop-free) finite alternating sequence

                       x = x_0, e_1, x_1, e_2, x_2, e_3, ..., e_{n−1}, x_{n−1}, e_n, x_n = y

                   of vertices and edges from G, starting at vertex x and ending at vertex y and involving the
                   n edges e_i = {x_{i−1}, x_i}, where 1 ≤ i ≤ n.
                      The length of this walk is n, the number of edges in the walk. (When n = 0, there are no
                   edges, x = y, and the walk is called trivial. These walks are not considered very much in
                   our work.)
                      Any x-y walk where x = y (and n > 1) is called a closed walk. Otherwise the walk is
                   called open.

Note that a walk may repeat both vertices and edges.

EXAMPLE     11.1   For the graph in Fig. 11.4 we find, for example, the following three open walks. We can list
                   the edges only or the vertices only (if the other is clearly implied).

                      1) {a, b}, {b, d}, {d, c}, {c, e}, {e, d}, {d, b}: This is an a-b walk of length 6 in which
                         we find the vertices d and b repeated, as well as the edge {b, d} (= {d, b}).
                      2) b → c → d → e → c → f: Here we have a b-f walk where the length is 5 and the
                         vertex c is repeated, but no edge appears more than once.
                      3) {f, c}, {c, e}, {e, d}, {d, a}: In this case the given f-a walk has length 4 with no
                         repetition of either vertices or edges.

Figure 11.4

Since the graph of Fig. 11.4 is undirected, the a-b walk in part (1) is also a b-a walk
                   (we read the edges, if necessary, as {b, d}, {d, e}, {e, c}, {c, d}, {d, b}, and {b, a}). Similar
                   remarks hold for the walks in parts (2) and (3).
                      Finally, the edges {b, c}, {c, d}, and {d, b} provide a b-b (closed) walk. These edges
                   (ordered appropriately) also define (closed) c-c and d-d walks.

Now let us examine special types of walks.

Definition 11.3          Consider any x-y walk in an undirected graph G = (V, E).
                                 a) If no edge in the x-y walk is repeated, then the walk is called an x-y trail. A closed
                                    x-x trail is called a circuit.
                                 b) If no vertex of the x-y walk occurs more than once, then the walk is called an x-y
                                    path. When x = y, the term cycle is used to describe such a closed path.

Convention: In dealing with circuits, we shall always understand the presence of at least
                               one edge. When there is only one edge, then the circuit is a loop (and the graph is no longer
                               loop-free). Circuits with two edges arise in multigraphs, a concept we shall define shortly.
                                  The term cycle will always imply the presence of at least three distinct edges (from the
                               graph).

EXAMPLE 11.2                     a) The b-f walk in part (2) of Example 11.1 is a b-f trail, but it is not a b-f path because
                                    of the repetition of vertex c. However, the f-a walk in part (3) of that example is both
                                    an f-a trail (of length 4) and an f-a path (of length 4).
                                 b) In Fig. 11.4, the edges {a, b}, {b, d}, {d, c}, {c, e}, {e, d}, and {d, a} provide an a-a
                                    circuit. The vertex d is repeated, so the edges do not give us an a-a cycle.
                                 c) The edges {a, b}, {b, c}, {c, d}, and {d, a} provide an a-a cycle (of length 4) in
                                    Fig. 11.4. When ordered appropriately these same edges may also define a b-b, c-c, or
                                     d-d cycle. Each of these cycles is also a circuit.

For a directed graph we shall use the adjective directed, as in, for example, directed
                               walks, directed paths, and directed cycles.

Before continuing, we summarize (in Table 11.1) for future reference the results of
                               Definitions 11.2 and 11.3. Each occurrence of “Yes” in the first two columns here should
                               be interpreted as “Yes, possibly.” Table 11.1 reflects the fact that a path is a trail, which in
                               turn is an open walk. Furthermore, every cycle is a circuit, and every circuit (with at least
                               two edges) is a closed walk.

Table 11.1

                               Repeated Vertex    Repeated
                               (Vertices)         Edge(s)     Open    Closed    Name

                               Yes                Yes         Yes               Walk (open)
                               Yes                Yes                 Yes       Walk (closed)
                               Yes                No          Yes               Trail
                               Yes                No                  Yes       Circuit
                               No                 No          Yes               Path
                               No                 No                  Yes       Cycle
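
The classification summarized in Table 11.1 can be sketched as a short Python function. This is an illustration only, under stated assumptions: the walk is supplied as its vertex sequence x_0, x_1, ..., x_n, consecutive vertices are assumed to be joined by an edge of a loop-free undirected graph with no multiple edges, and the graph itself is never consulted. The name `classify_walk` is hypothetical. The function returns the most specific name that applies.

```python
def classify_walk(vertices):
    """Name a walk, given as its vertex sequence, following Table 11.1."""
    n = len(vertices) - 1                      # length = number of edges
    if n == 0:
        return "trivial walk"
    closed = vertices[0] == vertices[-1]
    # Undirected edges {x_{i-1}, x_i}; frozenset makes {b, d} equal to {d, b}.
    edges = [frozenset(pair) for pair in zip(vertices, vertices[1:])]
    repeated_edge = len(set(edges)) < len(edges)
    # In a closed walk the shared start/end vertex does not count as a repeat.
    inner = vertices[:-1] if closed else vertices
    repeated_vertex = len(set(inner)) < len(inner)
    if repeated_edge:
        return "walk (closed)" if closed else "walk (open)"
    if repeated_vertex:
        return "circuit" if closed else "trail"
    return "cycle" if closed else "path"

# The three open walks of Example 11.1, written as vertex sequences.
print(classify_walk(list("abdcedb")))   # edge {b, d} repeats: walk (open)
print(classify_walk(list("bcdecf")))    # vertex c repeats, no edge repeats: trail
print(classify_walk(list("fceda")))     # nothing repeats: path
```

Applied to the closed walks of Example 11.2, the sequence a, b, d, c, e, d, a is reported as a circuit and a, b, c, d, a as a cycle.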

Considering how many concepts we have introduced, it is time to prove a first result in
                               this new theory.

THEOREM 11.1         Let G = (V, E) be an undirected graph, with a, b ∈ V, a ≠ b. If there exists a trail (in G)
                     from a to b, then there is a path (in G) from a to b.
                     Proof: Since there is a trail from a to b, we select one of shortest length, say {a, x_1},
                     {x_1, x_2}, ..., {x_n, b}. If this trail is not a path, we have the situation {a, x_1}, {x_1, x_2}, ...,
                     {x_{k−1}, x_k}, {x_k, x_{k+1}}, {x_{k+1}, x_{k+2}}, ..., {x_{m−1}, x_m}, {x_m, x_{m+1}}, ..., {x_n, b}, where
                     k < m and x_k = x_m, possibly with k = 0 and a (= x_0) = x_m, or m = n + 1 and x_k =
                     b (= x_{n+1}). But then we have a contradiction because {a, x_1}, {x_1, x_2}, ..., {x_{k−1}, x_k},
                     {x_m, x_{m+1}}, ..., {x_n, b} is a shorter trail from a to b.
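
The proof argues by contradiction from a shortest trail, but the shortening step it uses (when x_k = x_m with k < m, delete the edges between the two occurrences) can also be applied repeatedly to produce a path directly. The sketch below is one possible constructive variant, not the argument of the proof itself; the name `path_from_walk` is hypothetical, and the input is assumed to be the vertex sequence of an a-b walk with a ≠ b.

```python
def path_from_walk(vertices):
    """Cut out every closed detour so that no vertex is repeated."""
    path = []
    position = {}                     # vertex -> its index in `path`
    for v in vertices:
        if v in position:
            # v = x_k was met before: drop x_{k+1}, ..., x_{m-1} and keep x_k.
            k = position[v]
            for dropped in path[k + 1:]:
                del position[dropped]
            del path[k + 1:]
        else:
            position[v] = len(path)
            path.append(v)
    return path

# The b-f trail of Example 11.1, part (2), repeats the vertex c.
print(path_from_walk(list("bcdecf")))   # ['b', 'c', 'f']
```

Each pair of consecutive vertices in the result is also consecutive somewhere in the original walk, so the result uses only edges of the walk.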

The notion of a path is needed in the following graph property.

Definition 11.4   Let G = (V, E) be an undirected graph. We call G connected if there is a path between
                     any two distinct vertices of G.
                         Let G = (V, E) be a directed graph. Its associated undirected graph is the graph obtained
                     from G by ignoring the directions on the edges. If more than one undirected edge results
                     for a pair of distinct vertices in G, then only one of these edges is drawn in the associated
                     undirected graph. When this associated graph is connected, we consider G connected.
                         A graph that is not connected is called disconnected.

The graphs in Figs. 11.1, 11.3, and 11.4 are connected. In Fig. 11.2 the graph is not
                     connected because, for example, there is no path from a to e.

EXAMPLE 11.3         In Fig. 11.5 we have an undirected graph on V = {a, b, c, d, e, f, g}. This graph is not
                     connected because, for example, there is no path from a to e. However, the graph is
                     composed of pieces (with vertex sets V_1 = {a, b, c, d}, V_2 = {e, f, g}, and edge sets E_1 =
                     {{a, b}, {a, c}, {a, d}, {b, d}}, E_2 = {{e, f}, {f, g}}) that are themselves connected, and
                     these pieces are called the (connected) components of the graph. Hence an undirected
                     graph G = (V, E) is disconnected if and only if V can be partitioned into at least two
                     subsets V_1, V_2 such that there is no edge in E of the form {x, y}, where x ∈ V_1 and y ∈ V_2.
                     A graph is connected if and only if it has only one component.

                                                    Figure 11.5

Definition 11.5   For any graph G = (V, E), the number of components of G is denoted by κ(G).

EXAMPLE 11.4         For the graphs in Figs. 11.1, 11.3, and 11.4, κ(G) = 1 because these graphs are connected;
                     κ(G) = 2 for the graphs in Figs. 11.2 and 11.5.
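
The number of components can be computed by a standard breadth-first search. The following Python sketch is offered as an illustration, with assumed names `components` and `kappa`; it ignores edge directions, so for a directed graph it works with the associated undirected graph of Definition 11.4. The data are the vertex and edge sets stated in the text for Figs. 11.5 and 11.2.

```python
from collections import deque

def components(vertices, edges):
    """Return the connected components as a list of vertex sets."""
    neighbors = {v: set() for v in vertices}
    for x, y in edges:                 # direction, if any, is ignored
        neighbors[x].add(y)
        neighbors[y].add(x)

    seen, result = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:                   # breadth-first search from `start`
            v = queue.popleft()
            for w in neighbors[v] - component:
                component.add(w)
                queue.append(w)
        seen |= component
        result.append(component)
    return result

def kappa(vertices, edges):
    """Number of components of the graph."""
    return len(components(vertices, edges))

fig_11_5 = (set("abcdefg"),
            [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("e", "f"), ("f", "g")])
fig_11_2 = (set("abcde"),
            [("a", "a"), ("a", "b"), ("a", "d"), ("b", "c")])

print(kappa(*fig_11_5), kappa(*fig_11_2))   # 2 2
```

For Fig. 11.5 the two components found are {a, b, c, d} and {e, f, g}; for Fig. 11.2 they are {a, b, c, d} and the isolated vertex {e}.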

Before closing this first section, we extend our concept of a graph. Thus far we have
                                  allowed at most one edge between two vertices; we now consider an extension.

Definition 11.6            Let V be a finite nonempty set. We say that the pair (V, E) determines a multigraph G with
                                  vertex set V and edge set E† if, for some x, y ∈ V, there are two or more edges in E of the
                                 form (a) (x, y) (for a directed multigraph), or (b) {x, y} (for an undirected multigraph). In
                                 either case, we write G = (V, E) to designate the multigraph, just as we did for graphs.

Figure 11.6 shows an example of a directed multigraph. There are three edges from a to
                                 b, so we say that the edge (a, b) has multiplicity 3. The edges (b, c) and (d, e) both have
                                 multiplicity 2. Also, the edge (e, d) and either one of the edges (d, e) form a (directed)
                                 circuit of length 2 in the multigraph.
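
One convenient way to model an edge set with repeated elements is a multiset. The Python sketch below uses `collections.Counter` for this purpose; it is an illustration under stated assumptions, not a full model of Fig. 11.6. Only the edges and multiplicities mentioned in the paragraph above are included (any other edges in the figure are omitted), and the names `is_multigraph` and `two_edge_circuits` are hypothetical.

```python
from collections import Counter

# Edge multiset: directed edge -> multiplicity (partial data from the text).
E = Counter({("a", "b"): 3, ("b", "c"): 2, ("d", "e"): 2, ("e", "d"): 1})

def is_multigraph(edges):
    """True when some edge occurs two or more times."""
    return any(m >= 2 for m in edges.values())

def two_edge_circuits(edges):
    """Count directed circuits x -> y -> x (x != y) for each pair x < y."""
    result = {}
    for (x, y), m in edges.items():
        back = edges.get((y, x), 0)
        if x < y and back > 0:
            # Any copy of (x, y) can be combined with any copy of (y, x).
            result[(x, y)] = m * back
    return result

print(E[("a", "b")])            # 3
print(two_edge_circuits(E))     # {('d', 'e'): 2}
```

The count of 2 for the pair d, e reflects the remark that the edge (e, d) together with either one of the two edges (d, e) forms a directed circuit of length 2.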

                                                                  Figure 11.6

We shall need the idea of a multigraph later in the chapter when we solve the problem
                                 of the seven bridges of Königsberg. (Note: Whenever we are dealing with a multigraph G,
                                 we shall state explicitly that G is a multigraph.)

† We now allow a set to have repeated elements in order to account for multiple edges. We realize that this is a
change from the way we dealt with sets in Chapter 3. To overcome this the term multiset is often used to describe
E in this case.

1. List three situations, different from those in this section, where a graph could prove useful.
2. For the graph in Fig. 11.7, determine (a) a walk from b to d that is not a trail; (b) a b-d trail that is not a path; (c) a path from b to d; (d) a closed walk from b to b that is not a circuit; (e) a circuit from b to b that is not a cycle; and (f) a cycle from b to b.
3. For the graph in Fig. 11.7, how many paths are there from b to f?

Figure 11.7

4. For n ≥ 2, let G = (V, E) be the loop-free undirected graph, where V is the set of binary n-tuples (of 0's and 1's) and E = {{v, w} | v, w ∈ V and v, w differ in (exactly) two positions}. Find κ(G).
5. Let G = (V, E) be the undirected graph in Fig. 11.8. How many paths are there in G from a to h? How many of these paths have length 5?

Figure 11.8

6. If a, b are distinct vertices in a connected undirected graph G, the distance from a to b is defined to be the length of a shortest path from a to b (when a = b the distance is defined to be 0). For the graph in Fig. 11.9, find the distances from d to (each of) the other vertices in G.

Figure 11.9

7. Seven towns a, b, c, d, e, f, and g are connected by a system of highways as follows: (1) I-22 goes from a to c, passing through b; (2) I-33 goes from c to d and then passes through b as it continues to f; (3) I-44 goes from d through e to a; (4) I-55 goes from f to b, passing through g; and (5) I-66 goes from g to d.
    a) Using vertices for towns and directed edges for segments of highways between towns, draw a directed graph that models this situation.
    b) List the paths from g to a.
    c) What is the smallest number of highway segments that would have to be closed down in order for travel from b to d to be disrupted?
    d) Is it possible to leave town c and return there, visiting each of the other towns only once?
    e) What is the answer to part (d) if we are not required to return to c?
    f) Is it possible to start at some town and drive over each of these highways exactly once? (You are allowed to visit a town more than once, and you need not return to the town from which you started.)
8. Figure 11.10 shows an undirected graph representing a section of a department store. The vertices indicate where cashiers are located; the edges denote unblocked aisles between cashiers. The department store wants to set up a security system where (plainclothes) guards are placed at certain cashier locations so that each cashier either has a guard at his or her location or is only one aisle away from a cashier who has a guard. What is the smallest number of guards needed?

Figure 11.10

9. Let G = (V, E) be a loop-free connected undirected graph, and let {a, b} be an edge of G. Prove that {a, b} is part of a cycle if and only if its removal (the vertices a and b are left) does not disconnect G.
10. Give an example of a connected graph G where removing any edge of G results in a disconnected graph.
11. Let G be a graph that satisfies the condition in Exercise 10. (a) Must G be loop-free? (b) Could G be a multigraph? (c) If G has n vertices, can we determine how many edges it has?
12. a) If G = (V, E) is an undirected graph with |V| = v, |E| = e, and no loops, prove that 2e ≤ v² − v.
    b) State the corresponding inequality for the case when G is directed.
13. Let G = (V, E) be an undirected graph. Define a relation ℛ on V by a ℛ b if a = b or if there is a path in G from a to b. Prove that ℛ is an equivalence relation. Describe the partition of V induced by ℛ.

Figure 11.11: (a) W_3; (b) W_4; (c) W_5

14. a) Consider the three connected undirected graphs in Fig. 11.11. The graph in part (a) of the figure consists of a cycle (on the vertices u_1, u_2, u_3) and a vertex u_4 with edges (spokes) drawn from u_4 to the other three vertices. This graph is called the wheel with three spokes and is denoted by W_3. In part (b) of the figure we find the graph
    W_4, the wheel with four spokes. The wheel W_5 with five spokes appears in Fig. 11.11(c). Determine how many cycles of length 4 there are in each of these graphs.
    b) In general, if n ∈ Z⁺ and n ≥ 3, then the wheel with n spokes is the graph made up of a cycle of length n together with an additional vertex that is adjacent to the n vertices of the cycle. The graph is denoted by W_n. (i) How many cycles of length 4 are there in W_n? (ii) How many cycles in W_n have length n?
15. For the undirected graph in Fig. 11.12, find and solve a recurrence relation for the number of closed v-v walks of length n ≥ 1, if we allow such a walk, in this case, to contain or consist of one or more loops.

Figure 11.12

16. Unit-Interval Graphs. For n ≥ 1, we start with n closed intervals of unit length and draw the corresponding unit-interval graph on n vertices, as shown in Fig. 11.13. In part (a) of the figure we have one unit interval. This corresponds to the single vertex u; both the interval and the unit-interval graph can be represented by the binary sequence 01. In parts (b), (c) of the figure we have the two unit-interval graphs determined by two unit intervals. When two unit intervals overlap [as in part (c)] an edge is drawn in the unit-interval graph joining the vertices corresponding to these unit intervals. Hence the unit-interval graph in part (b) consists of the two isolated vertices v_1, v_2 that correspond with the nonoverlapping unit intervals. In part (c) the unit intervals overlap so the corresponding unit-interval graph consists of a single edge joining the vertices v_1, v_2 (that correspond to the given unit intervals). A closer look at the unit intervals in part (c) reveals how we can represent the positioning of these intervals and the corresponding unit-interval graph by the binary sequence 0011. In parts (d)-(f) of the figure we have three of the unit-interval graphs for three unit intervals, together with their corresponding binary sequences.
    a) How many other unit-interval graphs are there for three unit intervals? What are the corresponding binary sequences for these graphs?
    b) How many unit-interval graphs are there for four unit intervals?
    c) For n ≥ 1, how many unit-interval graphs are there for n unit intervals?

[Figure 11.13: unit-interval graphs and their binary sequences. (a) one unit interval, 01; (b) two nonoverlapping unit intervals; (c) two overlapping unit intervals, 0011; (d) 000111; (e) 001101; (f) 010011]

Figure 11.13

11.2
              Subgraphs, Complements,
               and Graph Isomorphism
                                        In this section we shall focus on the following two ideas:

a) What types of substructures are present in a graph?
                                             b) Is it possible to draw two graphs that appear distinct but have the same underlying
                                                   structure?
                      To answer the question in part (a) we introduce the following definition.

Definition 11.7   If G = (V, E) is a graph (directed or undirected), then G1 = (V1, E1) is called a subgraph
                  of G if ∅ ≠ V1 ⊆ V and E1 ⊆ E, where each edge in E1 is incident with vertices in V1.

Figure 11.14(a) provides us with an undirected graph G and two of its subgraphs, G1 and
                  G2. The vertices a, b are isolated in subgraph G1. Part (b) of the figure provides a directed
                  example. Here vertex w is isolated in the subgraph G'.

[Figure 11.14: (a) an undirected graph G with its subgraphs G1 and G2; (b) a directed graph G with its subgraph G']

Figure 11.14

Certain special types of subgraphs arise as follows:

Definition 11.8   Given a (directed or undirected) graph G = (V, E), let G1 = (V1, E1) be a subgraph of G.
                  If V1 = V, then G1 is called a spanning subgraph of G.

In part (a) of Fig. 11.14 neither G1 nor G2 is a spanning subgraph of G. The subgraphs
                  G3 and G4, shown in part (a) of Fig. 11.15, are both spanning subgraphs of G. The
                  directed graph G' in part (b) of Fig. 11.14 is a subgraph, but not a spanning subgraph, of
                  the directed graph G given in that part of the figure. In part (b) of Fig. 11.15 the directed
                  graphs G'' and G''' are two of the 2^4 = 16 possible spanning subgraphs.
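The count 2^4 = 16 reflects the fact that a spanning subgraph keeps every vertex and is therefore determined by which edges it keeps. The following Python sketch enumerates the spanning subgraphs of a small directed graph. The edge list below is hypothetical (the edges of the graph in Fig. 11.14(b) are not reproduced here); only the number of edges, four, is taken from the text.

```python
from itertools import combinations

def spanning_subgraphs(vertices, edges):
    """Yield every spanning subgraph as (vertex set, chosen edge set)."""
    vertex_set = frozenset(vertices)
    edge_list = list(edges)
    for k in range(len(edge_list) + 1):
        for chosen in combinations(edge_list, k):
            # Every vertex is kept; only the edge set varies.
            yield vertex_set, frozenset(chosen)

# Hypothetical directed graph on five vertices with four edges.
V = {"s", "t", "u", "v", "w"}
E = [("s", "t"), ("t", "u"), ("u", "v"), ("v", "w")]

count = sum(1 for _ in spanning_subgraphs(V, E))
print(count)  # 2**4 = 16
```

In general a graph with |E| edges has 2^|E| spanning subgraphs, one for each subset of E.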

[Figure 11.15: (a) the spanning subgraphs G3 and G4 of the undirected graph G; (b) the spanning subgraphs G'' and G''' of the directed graph G]

Figure 11.15

Definition 11.9           Let G = (V, E) be a graph (directed or undirected). If ∅ ≠ U ⊆ V, the subgraph of G
                                induced by U is the subgraph whose vertex set is U and which contains all edges (from G)
                                of either the form (a) (x, y), for x, y ∈ U (when G is directed), or (b) {x, y}, for x, y ∈ U
                                (when G is undirected). We denote this subgraph by ⟨U⟩.
                                   A subgraph G' of a graph G = (V, E) is called an induced subgraph if there exists
                                ∅ ≠ U ⊆ V, where G' = ⟨U⟩.

For the subgraphs in Fig. 11.14(a), we find that G2 is an induced subgraph of G but the
                                subgraph G1 is not an induced subgraph because edge {a, d} is missing.

EXAMPLE 11.5
Let G = (V, E) denote the graph in Fig. 11.16(a). The subgraphs in parts (b) and (c) of the
                                figure are induced subgraphs of G. For the connected subgraph in part (b), G1 = ⟨U1⟩ for
                                U1 = {b, c, d, e}. In like manner, the disconnected subgraph in part (c) is G2 = ⟨U2⟩ for
                                U2 = {a, b, e, f}. Finally, G3 in part (d) of Fig. 11.16 is a subgraph of G. But it is not an
                                induced subgraph; the vertices c, e are in G3, but the edge {c, e} (of G) is not present.

[Figure 11.16: (a) the graph G; (b) the induced subgraph G1; (c) the induced subgraph G2; (d) the subgraph G3]

Figure 11.16
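Definition 11.9 translates directly into a test: a subgraph is induced exactly when its edge set equals the set of all edges of G whose endpoints both lie in its vertex set. The Python sketch below illustrates this for undirected graphs. The function names and the small example graph (a 4-cycle with one chord) are hypothetical and are not the graphs of Fig. 11.16.

```python
def edge(x, y):
    """An undirected edge {x, y}."""
    return frozenset((x, y))

def induced_subgraph(edges, U):
    """Edge set of the subgraph of G induced by the vertex set U."""
    U = set(U)
    return {e for e in edges if e <= U}  # keep edges with both endpoints in U

def is_induced(sub_vertices, sub_edges, edges):
    """True when the subgraph (sub_vertices, sub_edges) equals the subgraph induced by sub_vertices."""
    return set(sub_edges) == induced_subgraph(edges, sub_vertices)

# Hypothetical graph: the cycle a-b-c-d-a together with the chord {a, c}.
E = {edge("a", "b"), edge("b", "c"), edge("c", "d"), edge("d", "a"), edge("a", "c")}

U = {"a", "b", "c"}
print(induced_subgraph(E, U))                               # the three edges ab, bc, ac
print(is_induced(U, {edge("a", "b"), edge("b", "c")}, E))   # False: edge {a, c} is missing
```

The second call mirrors the reasoning used for G3 above: both endpoints of an edge of G are present, but the edge itself is not.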

Another special type of subgraph comes about when a certain vertex or edge is deleted
                                from the given graph. We formalize these ideas in the following definition.

Definition 11.10          Let v be a vertex in a directed or an undirected graph G = (V, E). The subgraph of G
                                denoted by G − v has the vertex set V1 = V − {v} and the edge set E1 ⊆ E, where E1
                                contains all the edges in E except for those that are incident with the vertex v. (Hence
                                G − v is the subgraph of G induced by V1.)
                                  In a similar way, if e is an edge of a directed or an undirected graph G = (V, E), we
                                obtain the subgraph G − e = (V1, E1) of G, where the set of edges E1 = E − {e}, and the
                                vertex set is unchanged (that is, V1 = V).

EXAMPLE 11.6
Let G = (V, E) be the undirected graph in Fig. 11.17(a). Part (b) of this figure is the
                                subgraph G1 (of G), where G1 = G − c. It is also the subgraph of G induced by the set
                                of vertices U1 = {a, b, d, f, g, h}, so G1 = ⟨V − {c}⟩ = ⟨U1⟩. In part (c) of Fig. 11.17
                                we find the subgraph G2 of G, where G2 = G − e for e the edge {c, d}. The result in
                                Fig. 11.17(d) shows how the ideas in Definition 11.10 can be extended to the deletion of
                                more than one vertex (edge). We may represent this subgraph of G as G3 = (G − b) − f =
                                (G − f) − b = G − {b, f} = ⟨U3⟩, for U3 = {a, c, d, g, h}.

[Figure 11.17: (a) the graph G; (b) G1 = G − c; (c) G2 = G − e, for e = {c, d}; (d) G3 = G − {b, f}]

Figure 11.17
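The deletions of Definition 11.10 can be sketched in a few lines of Python. The helper names and the example graph are hypothetical (the edges of the graph in Fig. 11.17 are not reproduced here). The sketch checks, on that small graph, that deleting two vertices in either order gives the subgraph induced by the remaining vertices.

```python
def edge(x, y):
    """An undirected edge {x, y}."""
    return frozenset((x, y))

def delete_vertex(vertices, edges, v):
    """G - v: remove v and every edge incident with v."""
    return vertices - {v}, {e for e in edges if v not in e}

def delete_edge(vertices, edges, e):
    """G - e: remove the edge e; the vertex set is unchanged."""
    return set(vertices), edges - {e}

# Hypothetical graph: the cycle a-b-c-d-a together with the chord {a, c}.
V = {"a", "b", "c", "d"}
E = {edge("a", "b"), edge("b", "c"), edge("c", "d"), edge("d", "a"), edge("a", "c")}

one = delete_vertex(*delete_vertex(V, E, "b"), "d")   # (G - b) - d
two = delete_vertex(*delete_vertex(V, E, "d"), "b")   # (G - d) - b
induced = (V - {"b", "d"}, {e for e in E if e <= V - {"b", "d"}})
print(one == two == induced)  # True
```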

The idea of a subgraph gives us a way to develop the complement of an undirected
                   loop-free graph. Before doing so, however, we define a type of graph that is maximal in
                   size for a given number of vertices.

Definition 11.11   Let V be a set of n vertices. The complete graph on V, denoted K_n, is a loop-free undirected
                   graph, where for all a, b ∈ V, a ≠ b, there is an edge {a, b}.

Figure 11.18 provides the complete graphs K_n, for 1 ≤ n ≤ 4. We shall realize, when we
                   examine the idea of graph isomorphism, that these are the only possible complete graphs
                   for the given number of vertices.

[Figure 11.18: the complete graphs K1, K2, K3, and K4]

Figure 11.18

In determining the complement of a set in Chapter 3, we needed to know the universal
                   set under consideration. The complete graph plays a role similar to a universal set.

Definition 11.12   Let G be a loop-free undirected graph on n vertices. The complement of G, denoted Ḡ, is
                   the subgraph of K_n consisting of the n vertices in G and all edges that are not in G. (If
                   G = K_n, Ḡ is a graph consisting of n vertices and no edges. Such a graph is called a null
                   graph.)

Figure 11.19(a) shows an undirected graph on four vertices. Its complement is shown in
                   part (b) of the figure. In the complement, vertex a is isolated.
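Definition 11.12 describes the complement as a set difference taken inside the complete graph, which the following Python sketch makes explicit. The function names and the four-vertex example are hypothetical; the example is merely chosen so that, as in the situation just described, vertex a is isolated in the complement.

```python
from itertools import combinations

def complete_graph_edges(vertices):
    """Edge set of the complete graph on the given vertices."""
    return {frozenset(pair) for pair in combinations(sorted(vertices), 2)}

def complement(vertices, edges):
    """Edge set of the complement: all edges of K_n that are not in G."""
    return complete_graph_edges(vertices) - set(edges)

V = {"a", "b", "c", "d"}
# Hypothetical graph in which a is adjacent to every other vertex.
E = {frozenset(("a", "b")), frozenset(("a", "c")), frozenset(("a", "d"))}

E_bar = complement(V, E)
print(E_bar)                              # the edges bc, bd, cd
print(any("a" in e for e in E_bar))       # False: a is isolated in the complement
print(len(complete_graph_edges(V)))       # 6 = 4 * 3 / 2 edges in K4
```

Taking the complement twice returns the original edge set, and the complement of K_n has no edges, which is the null graph mentioned in the definition.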

Once again we have reached a point where many new ideas have been defined. To
                   demonstrate why some of these ideas are important, we apply them now to the solution of
                   an interesting puzzle.

[Figure 11.19: (a) an undirected graph on four vertices; (b) its complement]

Figure 11.19

EXAMPLE 11.7
Instant Insanity. The game of Instant Insanity is played with four cubes. Each of the six
                             faces on a cube is painted with one of the colors red (R), white (W), blue (B), or yellow (Y).
                             The object of the game is to place the cubes in a column of four such that all four (different)
                             colors appear on each of the four sides of the column.
                                 Consider the cubes in Fig. 11.20 and number them as shown. (These cubes are only one
                             example of this game. Many others exist.) First we shall estimate the number of arrangements
                             that are possible here. If we wish to place cube 1 at the bottom of the column, there
                             are at most three different ways in which we can do this. In Fig. 11.20 cube 1 is unfolded,
                             and we see that it makes no difference whether we place the red face on the table or the
                             opposite white face on the table. We are concerned only with the other four faces at the
                             base of our column. With three pairs of opposite faces there will be at most three ways
                             to place the first cube for the base of the column. Now consider cube 2. Although some
                             colors are repeated, no pair of opposite faces has the same color. Hence we have six ways
                             to place the second cube on top of the first. We can then rotate the second cube without
                             changing either the face on the top of the first cube or the face on the bottom of the second
                             cube. With four possible rotations we may place the second cube on top of the first in as
                             many as 24 different ways. Continuing the argument, we find that there can be as many as
                             (3)(24) (24) (24) = 41,472 possibilities to consider. And there may not even be a solution!

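[Figure 11.20: the four cubes, unfolded. Each cube is listed as the face drawn above the row / the row of four faces / the face drawn below the row:
   (1)  Y / W, R, Y, W / B
   (2)  R / B, B, W, Y / Y
   (3)  R / R, B, Y, B / W
   (4)  W / W, R, B, Y / W]

Figure 11.20

[Figure 11.21: the labeled multigraph on the vertices R, W, B, and Y determined by the four cubes]

Figure 11.21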

In solving this puzzle we realize that it is difficult to keep track of (1) colors on opposite
                             faces of cubes and (2) columns of colors. A graph (actually a labeled multigraph) helps us
                             to visualize the situation. In Fig. 11.21 we have a graph on four vertices R, W, B, and Y.
                             As we consider each cube, we examine its three pairs of opposite faces. For example, cube

1 has a pair of opposite faces painted yellow and blue, so we draw an edge connecting Y
and B and label it 1 (for cube 1). The other two edges in the figure that are labeled with 1
account for the pairs of opposite faces that are white and yellow, and red and white. Doing
likewise for the other cubes, we arrive at the graph in the figure. A loop, such as the one at
B, with label 3, indicates a pair of opposite faces with the same color (for cube 3).
    In the graph we see a total of 12 edges falling into four sets of 3, according to the labels
for the cubes. At each vertex the number of edges incident to (or from) the vertex counts
the number of faces on the four cubes that have that color. (We count a loop twice.) Hence
Fig. 11.21 tells us that for our four cubes we have five red faces, seven white ones, six blue
ones, and six that are yellow.
    With the four cubes stacked in a column, we examine two opposite sides of the column.
This arrangement gives us four edges in the graph of Fig. 11.21, where each label appears
once. Since each color is to appear only once on a side of the column, each color must
appear twice as an endpoint of these four edges. If we can accomplish the same result for
the other two sides of the column, we have solved the puzzle. In Fig. 11.22(a) we see that
each side in one pair of opposite sides of our column has the four colors if the cubes are
arranged according to the information provided by the subgraph shown there. However, to
accomplish this for the other two sides of the column also, we need a second such subgraph
that doesn’t use any edge in part (a). In this case a second such subgraph does exist, as
shown in part (b) of the figure.

(a)                                        (b)
                  Figure 11.22

Figure 11.23 shows how to arrange the cubes as indicated by the subgraphs in Fig. 11.22.

[Figure 11.23: the four cubes (1), (2), (3), (4), unfolded and oriented as indicated by the subgraphs in Fig. 11.22]

Figure 11.23

In general, for any four cubes we construct a labeled multigraph and try to find two
subgraphs where (1) each subgraph contains all four vertices, and four edges, one for each
label; (2) in each subgraph, each vertex is incident with exactly two edges (a loop is counted
twice); and (3) no (labeled) edge of the labeled multigraph appears in both subgraphs.
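The three conditions above can be checked by exhaustive search, since there are only 3^4 = 81 ways to choose one edge per cube. The Python sketch below does this for the cubes of this example. The opposite-face pairs in CUBES are an assumption: they were read from the unfolded cubes in Fig. 11.20, taking the two faces above and below the row as one opposite pair and the first/third and second/fourth faces of the row as the other two. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

COLORS = "RWBY"

# Opposite-face colour pairs for each cube (assumed reading of Fig. 11.20).
CUBES = {
    1: [("Y", "B"), ("W", "Y"), ("R", "W")],
    2: [("R", "Y"), ("B", "W"), ("B", "Y")],
    3: [("R", "W"), ("R", "Y"), ("B", "B")],
    4: [("W", "W"), ("W", "B"), ("R", "Y")],
}

def face_counts(cubes):
    """Degree of each colour in the labeled multigraph (a loop counts twice)."""
    counts = Counter()
    for pairs in cubes.values():
        for x, y in pairs:
            counts[x] += 1
            counts[y] += 1
    return counts

def valid_selections(cubes):
    """Choices of one edge per cube in which every colour has degree 2."""
    labels = sorted(cubes)
    found = []
    for choice in product(range(3), repeat=len(labels)):
        degree = Counter()
        for label, index in zip(labels, choice):
            x, y = cubes[label][index]
            degree[x] += 1
            degree[y] += 1
        if all(degree[c] == 2 for c in COLORS):
            found.append(choice)
    return found

def solutions(cubes):
    """Pairs of valid selections that share no labeled edge."""
    return [
        (s, t)
        for s, t in combinations(valid_selections(cubes), 2)
        if all(i != j for i, j in zip(s, t))
    ]

print(dict(face_counts(CUBES)))
for s, t in solutions(CUBES):
    for label, i, j in zip(sorted(CUBES), s, t):
        print(label, CUBES[label][i], CUBES[label][j])
```

As a consistency check on the assumed data, the face counts come out as five red, seven white, six blue, and six yellow, in agreement with the totals read from Fig. 11.21. Each printed row gives, for one cube, the opposite-face pair used for one pair of sides of the column and the pair used for the other.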

Now we turn to the second question posed at the start of the section.

Parts (a) and (b) of Fig. 11.24 show two undirected graphs on four vertices. Since straight
                                edges and curved edges are considered the same here, each graph represents six adjacent
                                pairs of vertices. In fact, we probably feel that these graphs are both examples of the graph
                                K4. We make this feeling mathematically rigorous in the following definition.

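[Figure 11.24: four undirected graphs on four vertices. (a) vertices a, b, c, d; (b) vertices w, x, y, z; (c) vertices m, n, p, q; (d) vertices r, s, t, u]

Figure 11.24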

Definition 11.13          Let G1 = (V1, E1) and G2 = (V2, E2) be two undirected graphs. A function f: V1 → V2
                                is called a graph isomorphism if (a) f is one-to-one and onto, and (b) for all a, b ∈ V1,
                                {a, b} ∈ E1 if and only if {f(a), f(b)} ∈ E2. When such a function exists, G1 and G2 are
                                called isomorphic graphs.

The vertex correspondence of a graph isomorphism preserves adjacencies. Since which
                                pairs of vertices are adjacent and which are not is the only essential property of an undirected
                                graph, in this way the structure of the graphs is preserved.
                                    For the graphs in parts (a) and (b) of Fig. 11.24 the function f defined by

f(a) = w,   f(b) = x,   f(c) = y,   f(d) = z
                                provides an isomorphism. [In fact, any one-to-one correspondence between {a, b, c, d} and
                                {w, x, y, z} will be an isomorphism because both of the given graphs are complete graphs.
                                This would also be true if each of the given graphs had only four isolated vertices (and no
                                edges).] Consequently, as far as (graph) structure is concerned, these graphs are considered
                                the same  — each is (isomorphic to) the complete graph K4.
                                   For the graphs in parts (c) and (d) of Fig. 11.24 we need to be a little more careful. The
                                function g defined by

g(m) = r,   g(n) = s,   g(p) = t,   g(q) = u
                                is one-to-one and onto (for the given vertex sets). However, although {m, q} is an edge in the
                                graph of part (c), {g(m), g(q)} = {r, u} is not an edge in the graph of part (d). Consequently,
                                the function g does not define a graph isomorphism. To maintain the correspondence of
                                edges, we consider the one-to-one onto function h where

h(m) = s,   h(n) = r,   h(p) = u,   h(q) = t.

In this case we have the edge correspondences

{m, n} ↔ {h(m), h(n)} = {s, r},     {n, q} ↔ {h(n), h(q)} = {r, t},
{m, p} ↔ {h(m), h(p)} = {s, u},     {p, q} ↔ {h(p), h(q)} = {u, t},
{m, q} ↔ {h(m), h(q)} = {s, t},

so h is a graph isomorphism. [We also notice how, for example, the cycle m → n → q → m
               corresponds with the cycle s (= h(m)) → r (= h(n)) → t (= h(q)) → s (= h(m)).]
                   Finally, since the graph in part (a) of Fig. 11.24 has six edges and that in part (c) has
               only five edges, these two graphs cannot be isomorphic.
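Conditions (a) and (b) of Definition 11.13 can be checked mechanically for a proposed function. The Python sketch below does so for the functions g and h discussed above. The edge list for graph (c) is the one appearing in the edge correspondences; the edge list for graph (d) is taken to be the five image edges under h, which is an inference from the text rather than a reading of the figure. The function name is hypothetical.

```python
from itertools import combinations

def is_isomorphism(f, V1, E1, V2, E2):
    """Check Definition 11.13 for a function f given as a dict from V1 to V2."""
    # (a) f must be one-to-one and onto.
    if set(f) != set(V1) or len(set(f.values())) != len(V1) or set(f.values()) != set(V2):
        return False
    E1 = {frozenset(e) for e in E1}
    E2 = {frozenset(e) for e in E2}
    # (b) {a, b} is an edge of G1 if and only if {f(a), f(b)} is an edge of G2.
    return all(
        (frozenset((a, b)) in E1) == (frozenset((f[a], f[b])) in E2)
        for a, b in combinations(sorted(V1), 2)
    )

Vc, Ec = "mnpq", [("m", "n"), ("n", "q"), ("m", "p"), ("p", "q"), ("m", "q")]
Vd, Ed = "rstu", [("s", "r"), ("r", "t"), ("s", "u"), ("u", "t"), ("s", "t")]

g = {"m": "r", "n": "s", "p": "t", "q": "u"}
h = {"m": "s", "n": "r", "p": "u", "q": "t"}

print(is_isomorphism(g, Vc, Ec, Vd, Ed))  # False: {m, q} is an edge but {r, u} is not
print(is_isomorphism(h, Vc, Ec, Vd, Ed))  # True
```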

Now let us examine the idea of graph isomorphism in a more difficult situation.

EXAMPLE 11.8
In Fig. 11.25 we have two graphs, each on ten vertices. Unlike the graphs in Fig. 11.24, it
               is not immediately apparent whether or not these graphs are isomorphic.

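[Figure 11.25: (a) a graph on the vertices a, b, c, d, e, f, g, h, i, j; (b) a graph on the vertices q, r, s, t, u, v, w, x, y, z]

Figure 11.25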

One finds that the correspondence given by

a → q        c → u        e → r        g → x        i → z

b → v        d → y        f → w        h → t        j → s

preserves all adjacencies. For example, {f, h} is an edge in graph (a) with {w, t} the
               corresponding edge in graph (b). But how did we come up with the correspondence? The
               following discussion provides some clues.
                  We note that because an isomorphism preserves adjacencies, it preserves graph
               substructures such as paths and cycles. In graph (a) the edges {a, f}, {f, i}, {i, d}, {d, e},
               and {e, a} constitute a cycle of length 5. Hence we must preserve this as we try to find an
               isomorphism. One possibility for the corresponding edges in graph (b) is {q, w}, {w, z},
               {z, y}, {y, r}, and {r, q}, which also provides a cycle of length 5. (A second possible
               choice is given by the edges in the cycle y → r → s → t → u → y.) In addition, starting
               at vertex a in graph (a), we find a path that will "visit" each vertex only once. We
               express this path by a → f → h → c → b → g → j → e → d → i. For the graphs to be
               isomorphic there must be a corresponding path in graph (b). Here the path described by
               q → w → t → u → v → x → s → r → y → z is the counterpart.

These are some of the ideas we can use to try to develop an isomorphism and determine
               whether two graphs are isomorphic. Other considerations will be discussed throughout
               the chapter. However, there is no simple, foolproof method, especially when we are
               confronted with larger graphs G1 = (V1, E1) and G2 = (V2, E2), where |V1| = |V2| and
               |E1| = |E2|.
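For small graphs one can always fall back on brute force: compare the quick invariants first (number of vertices, number of edges, sorted list of degrees) and then try every one-to-one correspondence. The Python sketch below is one such search, not an efficient method; it examines up to n! correspondences for graphs on n vertices. The function name and the test graphs are hypothetical.

```python
from itertools import permutations

def find_isomorphism(V1, E1, V2, E2):
    """Return a dict describing an isomorphism of loop-free undirected graphs, or None."""
    V1, V2 = sorted(V1), sorted(V2)
    E1 = {frozenset(e) for e in E1}
    E2 = {frozenset(e) for e in E2}
    # Quick necessary conditions: equal vertex and edge counts.
    if len(V1) != len(V2) or len(E1) != len(E2):
        return None

    def degrees(V, E):
        return sorted(sum(1 for e in E if v in e) for v in V)

    # Isomorphic graphs have the same list of vertex degrees.
    if degrees(V1, E1) != degrees(V2, E2):
        return None
    # Exhaustive search over all one-to-one correspondences.
    for image in permutations(V2):
        f = dict(zip(V1, image))
        if {frozenset(f[x] for x in e) for e in E1} == E2:
            return f
    return None

# A cycle on six vertices versus two disjoint triangles: same counts and
# degrees, yet no correspondence preserves all adjacencies.
C6 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
TT = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
print(find_isomorphism(range(6), C6, range(6), TT))  # None

# The same cycle with its vertices renamed is found to be isomorphic.
C6_renamed = [("a", "c"), ("c", "e"), ("e", "b"), ("b", "d"), ("d", "f"), ("f", "a")]
print(find_isomorphism(range(6), C6, "abcdef", C6_renamed) is not None)  # True
```

The first pair of graphs shows why matching invariants are necessary but not sufficient: the cycle and the two triangles agree on every count used above, and only the exhaustive search rules out an isomorphism.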
                 We close this section with one more example involving graph isomorphism.

EXAMPLE 11.9
Each of the two graphs in Fig. 11.26 has six vertices and nine edges. Therefore it is reasonable
                                    to ask whether they are isomorphic.
                                        In graph (a), vertex a is adjacent to two other vertices of the graph. Consequently, if
                                    we try to construct an isomorphism between these graphs, we should associate vertex a
                                    with a comparable vertex in graph (b), say vertex u. A similar situation exists for vertex d
                                    and either vertex x or vertex z. But no matter which of the vertices x or z we use, there
                                    remains one vertex in graph (b) that is adjacent to two other vertices. And there is no other
                                    such vertex in graph (a) to continue our one-to-one structure preserving correspondence.
                                    Consequently, these graphs are not isomorphic.
                                        Furthermore, in graph (b) it is possible to start at any vertex and find a circuit that includes
                                    every edge of the graph. For example, if we start at vertex u, the circuit u → w → v →
                                    y → w → z → y → x → v → u exhibits this property. This does not happen in graph (a)
                                    where the only trails that include each edge start at either b or f and then terminate at f or
                                    b, respectively.

(a)                                (b)
                                                       Figure 11.26
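Both observations made about graph (b) in Example 11.9 can be recovered from the circuit quoted there. The Python sketch below assumes the circuit reads u → w → v → y → w → z → y → x → v → u. Because the circuit uses nine edges and the graph has nine, the circuit determines the whole edge set, and hence the degree of every vertex.

```python
from collections import Counter

# Vertex sequence of the circuit in graph (b), as assumed above.
circuit = ["u", "w", "v", "y", "w", "z", "y", "x", "v", "u"]

# Consecutive vertices of the circuit give the edges it traverses.
edges = [frozenset(pair) for pair in zip(circuit, circuit[1:])]
degree = Counter(v for e in edges for v in e)

print(len(set(edges)))                                # 9: no edge is repeated
print(sorted(degree.items()))
print(sorted(v for v in degree if degree[v] == 2))    # ['u', 'x', 'z']
print(all(d % 2 == 0 for d in degree.values()))       # True
```

Graph (b) therefore has three vertices adjacent to exactly two others (u, x, and z), whereas the example notes only two such vertices in graph (a). Every vertex of graph (b) also has even degree, which is consistent with the existence of a circuit through every edge.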

EXERCISES 11.2

1. Let G be the undirected graph in Fig. 11.27(a).
      a) How many connected subgraphs of G have four vertices
      and include a cycle?
      b) Describe the subgraph G1 (of G) in part (b) of the figure
      first, as an induced subgraph and second, in terms of
      deleting a vertex of G.
      c) Describe the subgraph G2 (of G) in part (c) of the figure
      first, as an induced subgraph and second, in terms of the
      deletion of vertices of G.
      d) Draw the subgraph of G induced by the set of vertices
      U = {b, c, d, f, i, j}.
      e) For the graph G, let the edge e = {c, f}. Draw the
      subgraph G − e.
2. a) Let G = (V, E) be an undirected graph, with G1 =
      (V1, E1) a subgraph of G. Under what condition(s) is G1
      not an induced subgraph of G?
      b) For the graph G in Fig. 11.27(a), find a subgraph that is
      not an induced subgraph.
3. a) How many spanning subgraphs are there for the graph
      G in Fig. 11.27(a)?

[Figure 11.27: (a) the graph G; (b) the subgraph G1; (c) the subgraph G2]

Figure 11.27

[Figure 11.28: labeled multigraphs on the vertices R, W, B, and Y for the game of Instant Insanity]

Figure 11.28

    b) How many connected spanning subgraphs are there in
    part (a)?
    c) How many of the spanning subgraphs in part (a) have
    vertex a as an isolated vertex?
4. If G = (V, E) is an undirected graph, how many spanning
subgraphs of G are also induced subgraphs?
5. Let G = (V, E) be an undirected graph, where |V| > 2. If
every induced subgraph of G is connected, can we identify the
graph G?
6. Find all (loop-free) nonisomorphic undirected graphs with
four vertices. How many of these graphs are connected?
7. Each of the labeled multigraphs in Fig. 11.28 arises in the
analysis of a set of four blocks for the game of Instant Insanity.
In each case determine a solution to the puzzle, if possible.
8. a) How many paths of length 4 are there in the complete
    graph $K_7$? (Remember that a path such as
    $v_1 \to v_2 \to v_3 \to v_4 \to v_5$ is considered to be the same as the path
    $v_5 \to v_4 \to v_3 \to v_2 \to v_1$.)
    b) Let $m, n \in \mathbf{Z}^+$ with $m < n$. How many paths of length
    m are there in the complete graph $K_n$?
9. For each pair of graphs in Fig. 11.29, determine whether or
not the graphs are isomorphic.

Figure 11.29   (a)   (b)
10. Let G be an undirected (loop-free) graph with v vertices
and e edges. How many edges are there in the complement $\overline{G}$?
11. a) If $G_1$, $G_2$ are (loop-free) undirected graphs, prove that
    $G_1$, $G_2$ are isomorphic if and only if $\overline{G_1}$, $\overline{G_2}$ are isomorphic.
    b) Determine whether the graphs in Fig. 11.30 are isomorphic.
12. a) Let G be an undirected graph with n vertices. If G is isomorphic
    to its own complement $\overline{G}$, how many edges must
    G have? (Such a graph is called self-complementary.)
    b) Find an example of a self-complementary graph on four
    vertices and one on five vertices.
    c) If G is a self-complementary graph on n vertices, where
    n > 1, prove that n = 4k or n = 4k + 1, for some $k \in \mathbf{Z}^+$.
13. Let G be a cycle on n vertices. Prove that G is self-complementary
if and only if n = 5.
14. a) Find a graph G where both G and $\overline{G}$ are connected.
    b) If G is a graph on n vertices, for $n \ge 2$, and G is not
    connected, prove that $\overline{G}$ is connected.

Figure 11.30

15. a) Extend Definition 11.13 to directed graphs.
    b) Determine whether the directed graphs in Fig. 11.31 are
    isomorphic.
16. a) How many subgraphs H = (V, E) of $K_6$ satisfy |V| = 3?
    (If two subgraphs are isomorphic but have different vertex
    sets, consider them distinct.)
    b) How many subgraphs H = (V, E) of $K_6$ satisfy |V| = 4?
    c) How many subgraphs does $K_6$ have?
    d) For $n \ge 3$, how many subgraphs does $K_n$ have?
17. Let v, w be two vertices in $K_n$, $n \ge 3$. How many walks of
length 3 are there from v to w?

Figure 11.31

11.3
    Vertex Degree: Euler Trails and Circuits
                               In Example 11.9 the number of edges incident with a vertex was used to show that two
                               undirected graphs were not isomorphic. We now find this idea even more helpful.

Definition 11.14         Let G be an undirected graph or multigraph. For each vertex v of G, the degree of v, written
                               deg(v), is the number of edges in G that are incident with v. Here a loop at a vertex v is
                               considered as two incident edges for v.

EXAMPLE 11.10            For the graph in Fig. 11.32, deg(b) = deg(d) = deg(f) = deg(g) = 2, deg(c) = 4,
                               deg(e) = 0, and deg(h) = 1. For vertex a we have deg(a) = 3 because we count a loop
                               twice. Since h has degree 1, it is called a pendant vertex.

Figure 11.32

Using the idea of vertex degree, we have the following result.

THEOREM 11.2                   If G = (V, E) is an undirected graph or multigraph, then $\sum_{v \in V} \deg(v) = 2|E|$.
                               Proof: As we consider each edge {a, b} in graph G, we find that the edge contributes a count
                               of 1 to each of deg(a), deg(b), and consequently a count of 2 to $\sum_{v \in V} \deg(v)$. Thus 2|E|
                               accounts for deg(v), for all $v \in V$, and $\sum_{v \in V} \deg(v) = 2|E|$.
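
Theorem 11.2 is easy to check numerically. The following Python sketch is not part of the text: the function name `degrees` and the edge list (a made-up multigraph with one loop, one doubled edge, and one pendant vertex) are assumptions chosen only for illustration. It counts degrees as in Definition 11.14 and confirms, on this one example, the degree sum of Theorem 11.2 and the parity statement of Corollary 11.1.

```python
from collections import Counter


def degrees(edges):
    """Map each vertex to its degree; a loop (a, a) adds 2 to deg(a)."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical multigraph: loop at a, doubled edge {a, b}, pendant vertex d.
edges = [("a", "a"), ("a", "b"), ("a", "b"), ("b", "c"), ("c", "d")]
deg = degrees(edges)

# Theorem 11.2: the degrees sum to twice the number of edges.
assert sum(deg.values()) == 2 * len(edges)

# Corollary 11.1: the number of odd-degree vertices is even.
odd = [v for v, d in deg.items() if d % 2 == 1]
assert len(odd) % 2 == 0
```

Isolated vertices never appear in an edge list, but they contribute 0 to the sum, so the check is unaffected by them.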

This theorem provides some insight into the number of odd-degree vertices that can exist
                     in a graph.

COROLLARY 11.1       For any undirected graph or multigraph, the number of vertices of odd degree must be even.
                     Proof: We leave the proof for the reader.

We apply Theorem 11.2 in the following example.

EXAMPLE 11.11        An undirected graph (or multigraph) where each vertex has the same degree is called a
                     regular graph. If deg(v) = k for all vertices v, then the graph is called k-regular. Is it
                     possible to have a 4-regular graph with 10 edges?
                        From Theorem 11.2, 2|E| = 20 = 4|V|, so we have five vertices of degree 4. Figure
                     11.33 provides two nonisomorphic examples that satisfy the requirements.

(a)                              (b)
                                     Figure 11.33

If we want each vertex to have degree 4, with 15 edges in the graph, we find that
                     2|E| = 30 = 4|V|, which has no integer solution for |V|. It follows that no such graph is possible.
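
The arithmetic in Example 11.11 can be packaged as a small check. This is a sketch under stated assumptions: the helper name `regular_vertex_count` is hypothetical, and the test it performs (that 2|E| is divisible by k) is only the necessary condition coming from Theorem 11.2, not a guarantee that a graph exists. The complete graph $K_5$ is used as one convenient 4-regular example with 10 edges; it is not claimed to be either graph of Fig. 11.33.

```python
from itertools import combinations


def regular_vertex_count(k, num_edges):
    """Return |V| forced by k * |V| = 2 * |E|, or None if no integer works."""
    total = 2 * num_edges
    if k <= 0 or total % k != 0:
        return None
    return total // k


print(regular_vertex_count(4, 10))   # 5
print(regular_vertex_count(4, 15))   # None

# K_5: every pair of the five vertices is joined by one edge.
k5_edges = list(combinations(range(5), 2))
degree = [sum(v in e for e in k5_edges) for v in range(5)]
assert len(k5_edges) == 10 and degree == [4, 4, 4, 4, 4]
```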

Our next example introduces a regular graph that arises in the study of computer archi-
                     tecture.

EXAMPLE 11.12        The Hypercube. In order to build a parallel computer one needs to have multiple CPUs
                     (central processing units), where each such processor works on part of a problem. But often
                     we cannot actually decompose a problem completely, so at some point the processors (each
                     with its own memory) have to be able to communicate with one another.
                         We envisage this situation as follows. The accumulated data for a given problem are
                     taken from a central storage location and divided up among the processors. The processors
                     go through a phase where each computes on its own for a certain period of time and then
                     some intercommunication takes place. Then the processors return to computing on their
                     own and continue back and forth between operating individually and communicating with
                     one another. This situation adequately describes how parallel algorithms work in practice.
                         To model the communication between the processors we use a loop-free connected
                     undirected graph where each processor is assigned a vertex. When two processors, say $p_1$,
                     $p_2$, are able to communicate directly with one another we draw the edge $\{p_1, p_2\}$ to represent
                     this (line of) possible communication. How can we decide on a model (that is, a graph) to
                     speed up the processing time? The complete graph (on all of our processors as vertices)
would be ideal, but prohibitively expensive because of all the necessary connections. On
                       the other hand, one can connect n processors along a path with n − 1 edges or on a cycle
                       with n edges. Another possible model is a grid (or mesh) graph, examples of which are
                       shown in Fig. 11.34.

(a) Two-by-four grid   (b) Three-by-three grid

Figure 11.34

But in these last three models the distances (as measured by the number of edges in
                       the shortest paths) between pairs of processors get longer and longer as the number of
                       processors increases. A compromise that weighs the number of edges (direct connections)
                       against the distance between pairs of vertices (processors) is embodied in the regular graph
                       called the hypercube.
                           For $n \in \mathbf{N}$, the n-dimensional hypercube (or n-cube) is denoted by $Q_n$. It is a loop-free
                       connected undirected graph with $2^n$ vertices. For $n \ge 1$, these vertices are labeled by the
                       $2^n$ n-bit sequences representing $0, 1, 2, \ldots, 2^n - 1$. For instance, $Q_3$ has eight vertices,
                       labeled 000, 001, 010, 011, 100, 101, 110, and 111. Two vertices $v_1$, $v_2$ of $Q_n$ are joined
                       by the edge $\{v_1, v_2\}$ when the binary labels for $v_1$, $v_2$ differ in exactly one position. Then
                       for any vertices u, w in $Q_n$ there is a shortest path of length d, where d is the number of
                       positions where the binary labels for u, w differ. [This ensures that $Q_n$ is connected.]
                           Figure 11.35 shows $Q_n$ for n = 0, 1, 2, 3. In general, for $n \ge 0$, $Q_{n+1}$ can be constructed
                       recursively from two copies of $Q_n$ as follows. Prefix the vertex labels of one copy of $Q_n$
                       with 0 (call the result $Q_{0,n}$) and those of the other copy with 1 (call this result $Q_{1,n}$). For x in
                       $Q_{0,n}$ and y in $Q_{1,n}$ draw the edge {x, y} if the (newly prefixed) binary labels for x, y differ
                       only in the first (newly prefixed) position. The case for n = 3 (so n + 1 = 4) is demonstrated
                       in Fig. 11.36. The blue edges are the new edges described above for constructing $Q_4$ from
                       two copies of $Q_3$.
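
Both descriptions of the hypercube (join labels that differ in one bit, or glue two copies of the previous cube) translate directly into code. The sketch below is an illustration, not the text's own construction: the function names are hypothetical, and vertices are represented by the integers whose binary expansions are the bit labels, so "prefixing a 1" becomes setting the highest bit. It checks that the two constructions agree and that the result is n-regular with $n2^{n-1}$ edges.

```python
def hypercube_edges(n):
    """Edges of Q_n; vertices are the integers 0 .. 2^n - 1 (their n-bit labels)."""
    edges = set()
    for u in range(2 ** n):
        for bit in range(n):
            w = u ^ (1 << bit)          # flip exactly one bit position
            edges.add((min(u, w), max(u, w)))
    return edges


def hypercube_recursive(n):
    """Edges of Q_n built from two copies of Q_(n-1), following the text."""
    if n == 0:
        return set()                    # Q_0 is a single vertex with no edges
    prev = hypercube_recursive(n - 1)
    high = 1 << (n - 1)                 # the newly prefixed bit position
    copy0 = prev                        # labels prefixed with 0 keep their value
    copy1 = {(u | high, w | high) for u, w in prev}   # labels prefixed with 1
    new_edges = {(x, x | high) for x in range(high)}  # differ only in the new bit
    return copy0 | copy1 | new_edges


for n in range(6):
    direct = hypercube_edges(n)
    assert direct == hypercube_recursive(n)
    assert len(direct) == n * 2 ** n // 2        # n * 2^(n-1) edges
    degree = {v: 0 for v in range(2 ** n)}
    for u, w in direct:
        degree[u] += 1
        degree[w] += 1
    assert all(d == n for d in degree.values())  # Q_n is n-regular
```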

Figure 11.35   $Q_0$, $Q_1$, $Q_2$, $Q_3$

Figure 11.36

In summary, we reiterate that for $n \in \mathbf{N}$, the hypercube $Q_n$ is an n-regular loop-free
                undirected graph with $2^n$ vertices. Further, it is connected with the distance between any
                two vertices at most n. From Theorem 11.2 it follows that $Q_n$ has $(1/2)n2^n = n2^{n-1}$ edges.
                [Referring back to Example 10.33, we find that $n2^{n-1}$ is likewise the number of edges for the
                Hasse diagram of the partial order $(\mathcal{P}(X_n), \subseteq)$, where $X_n = \{1, 2, 3, \ldots, n\}$ and $\mathcal{P}(X_n)$
                is the power set of $X_n$. This is no mere coincidence! If we use the Gray code of Example
                3.9 to label the vertices of this Hasse diagram, we find we have the hypercube $Q_n$.]
                    Finally, note that in $Q_4$ there are 16 vertices (processors) and the longest distance between
                vertices is 4. Contrast this with the grids in Fig. 11.34, where there are 15 vertices in part
                (a) and 16 in part (b), yet the longest distance is 6 in both grids.
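
The distance comparison can be reproduced with a breadth-first search from every vertex. In this sketch the helper names are hypothetical, and the grids are assumed to be 3 × 5 and 4 × 4 arrays of vertices, which is an inference from the vertex counts 15 and 16 quoted above rather than something read off Fig. 11.34.

```python
from collections import deque


def diameter(adj):
    """Longest shortest-path distance in a connected graph {vertex: neighbors}."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # breadth-first search from s
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best


def grid(rows, cols):
    """Grid graph on a rows-by-cols array of vertices."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(r, c): [(r + dr, c + dc) for dr, dc in steps
                     if 0 <= r + dr < rows and 0 <= c + dc < cols]
            for r in range(rows) for c in range(cols)}


def hypercube(n):
    """Q_n with integer vertex labels; neighbors differ in exactly one bit."""
    return {u: [u ^ (1 << b) for b in range(n)] for u in range(2 ** n)}


print(diameter(hypercube(4)))   # 4
print(diameter(grid(3, 5)))     # 6
print(diameter(grid(4, 4)))     # 6
```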

We turn now to the reason why Euler developed the idea of the degree of a vertex: to
                solve the problem dealing with the seven bridges of Königsberg.

EXAMPLE 11.13   The Seven Bridges of Königsberg. During the eighteenth century, the city of Königsberg
                (in East Prussia) was divided into four sections (including the island of Kneiphof) by the
                Pregel River. Seven bridges connected these regions, as shown in Fig. 11.37(a). It was said
                that residents spent their Sunday walks trying to find a way to walk about the city so as to
                cross each bridge exactly once and then return to the starting point.

Figure 11.37   (a)   (b)

In order to determine whether or not such a circuit existed, Euler represented the four
                sections of the city and the seven bridges by the multigraph shown in Fig. 11.37(b). Here
he found four vertices with deg(a) = deg(c) = deg(d) = 3 and deg(b) = 5. He also found
                               that the existence of such a circuit depended on the number of vertices of odd degree in the
                               graph.

Before proving the general result, we give the following definition.

Definition 11.15         Let G = (V, E) be an undirected graph or multigraph with no isolated vertices. Then G is
                               said to have an Euler circuit if there is a circuit in G that traverses every edge of the graph
                               exactly once. If there is an open trail from a to b in G and this trail traverses each edge in
                               G exactly once, the trail is called an Euler trail.

The problem of the seven bridges is now settled as we characterize the graphs that have
                               an Euler circuit.

THEOREM 11.3                   Let G = (V, E) be an undirected graph or multigraph with no isolated vertices. Then G
                               has an Euler circuit if and only if G is connected and every vertex in G has even degree.
                               Proof: If G has an Euler circuit, then for all $a, b \in V$ there is a trail from a to b, namely, that
                               part of the circuit that starts at a and terminates at b. Therefore, it follows from Theorem 11.1
                               that G is connected.
                                  Let s be the starting vertex of the Euler circuit. For any other vertex v of G, each time
                               the circuit comes to v it then departs from the vertex. Thus the circuit has traversed either
                               two (new) edges that are incident with v or a (new) loop at v. In either case a count of
                               2 is contributed to deg(v). Since v is not the starting point and each edge incident to v
                               is traversed only once, a count of 2 is obtained each time the circuit passes through v, so
                               deg(v) is even. As for the starting vertex s, the first edge of the circuit must be distinct from
                               the last edge, and because any other visit to s results in a count of 2 for deg(s), we have
                               deg(s) even.
                                   Conversely, let G be connected with every vertex of even degree. If the number of edges
                               in G is 1 or 2, then G must be as shown in Fig. 11.38. Euler circuits are immediate in these
                               cases. We proceed now by induction and assume the result true for all situations where there
                               are fewer than n edges. If G has n edges, select a vertex s in G as a starting point to build an
                               Euler circuit. The graph (or multigraph) G is connected and each vertex has even degree,
                               so we can at least construct a circuit C containing s. (Verify this by considering the longest
                               trail in G that starts at s.) Should the circuit contain every edge of G, we are finished. If
                               not, remove the edges of the circuit from G, making sure to remove any vertex that would
                               become isolated. The remaining subgraph K has all vertices of even degree, but it may not
                               be connected. However, each component of K is connected and will have an Euler circuit.
                               (Why?) In addition, each of these Euler circuits has a vertex that is on C. Consequently,
                               starting at s we travel on C until we arrive at a vertex $s_1$ that is on the Euler circuit of a
                               component $C_1$ of K. Then we traverse this Euler circuit and, returning to $s_1$, continue on
                               C until we reach a vertex $s_2$ that is on the Euler circuit of component $C_2$ of K. Since the
                               graph G is finite, as we continue this process we construct an Euler circuit for G.

Figure 11.38
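
The splicing idea in this proof (follow a circuit, then detour around smaller circuits that hang off it) has a standard iterative counterpart usually credited to Hierholzer. The sketch below uses that stack-based variant rather than the induction of the proof; the function name and the sample graph (two triangles sharing the vertex a) are assumptions for illustration. It presumes the hypotheses of Theorem 11.3 hold and does not verify them.

```python
def euler_circuit(edges, start):
    """Return an Euler circuit as a list of vertices (Hierholzer's method).

    Assumes a connected multigraph in which every vertex has even degree;
    `edges` is a list of (a, b) pairs and loops (a, a) are allowed.
    """
    adj = {}
    for i, (a, b) in enumerate(edges):
        adj.setdefault(a, []).append((b, i))
        adj.setdefault(b, []).append((a, i))
    used = [False] * len(edges)
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        # Discard edges at v that were already traversed from the other end.
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()
        if adj[v]:
            w, i = adj[v].pop()       # walk along an unused edge
            used[i] = True
            stack.append(w)
        else:
            circuit.append(stack.pop())   # v is finished; record it
    return circuit[::-1]


# Hypothetical example: two triangles that share the vertex a.
edges = [("a", "b"), ("b", "c"), ("c", "a"),
         ("a", "d"), ("d", "e"), ("e", "a")]
circuit = euler_circuit(edges, "a")

# The circuit is closed and uses each edge exactly once.
assert circuit[0] == circuit[-1] == "a"
assert len(circuit) == len(edges) + 1
walked = sorted(tuple(sorted(p)) for p in zip(circuit, circuit[1:]))
assert walked == sorted(tuple(sorted(e)) for e in edges)
```

Each edge is pushed and popped at most a constant number of times, so the running time is linear in the number of edges.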

Should G be connected and not have too many vertices of odd degree, we can at least
                      find an Euler trail in G.

COROLLARY 11.2        If G is an undirected graph or multigraph with no isolated vertices, then we can construct
                      an Euler trail in G if and only if G is connected and has exactly two vertices of odd degree.
                      Proof: If G is connected and a and b are the vertices of G that have odd degree, add an
                      additional edge {a, b} to G. We now have a graph $G_1$ that is connected and has every vertex
                      of even degree. Hence $G_1$ has an Euler circuit C, and when the edge {a, b} is removed from
                      C, we obtain an Euler trail for G. (Thus the Euler trail starts at one of the vertices of odd
                      degree and terminates at the other odd vertex.) We leave the details of the converse for the
                      reader.

Returning now to the seven bridges of Königsberg, we realize that Fig. 11.37(b) is a
                      connected multigraph, but it has four vertices of odd degree. Consequently, it has no Euler
                      trail or Euler circuit.
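
Theorem 11.3 and Corollary 11.2 reduce the question to counting odd-degree vertices, which the following sketch does. The function name `euler_status` is hypothetical, and the edge list is an assumed encoding of the multigraph of Fig. 11.37(b): it is the usual arrangement of the seven bridges and matches the degrees deg(a) = deg(c) = deg(d) = 3, deg(b) = 5 stated in the text, but the figure itself is not reproduced here. Connectivity is assumed rather than tested.

```python
from collections import Counter


def euler_status(edges):
    """Classify a connected multigraph with no isolated vertices.

    Returns "circuit" (Theorem 11.3), "trail" (Corollary 11.2), or "neither",
    based only on the number of odd-degree vertices.
    """
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    odd = sum(1 for d in deg.values() if d % 2 == 1)
    if odd == 0:
        return "circuit"
    if odd == 2:
        return "trail"
    return "neither"


# Assumed bridge list: two a-b bridges, two b-c bridges, and one bridge
# from each of a, b, c to d.
konigsberg = [("a", "b"), ("a", "b"), ("b", "c"), ("b", "c"),
              ("a", "d"), ("b", "d"), ("c", "d")]
print(euler_status(konigsberg))   # neither
```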

Now that we have seen how the solution of an eighteenth-century problem led to the
                      start of graph theory, is there a somewhat more contemporary context in which we might
                      be able to apply what we have learned?
                          To answer this question (in the affirmative), we shall state the directed version of Theo-
                      rem 11.3. But first we need to refine the concept of the degree of a vertex.

Definition 11.16   Let G = (V, E) be a directed graph or multigraph. For each v € V,

a) The incoming, or in, degree of v is the number of edges in G that are incident into v,
                           and this is denoted by id(v).
                        b) The outgoing, or out, degree of v is the number of edges in G that are incident from
                           v, and this is denoted by od(v).

For the case where the directed graph or multigraph contains one or more loops, each
                      loop at a given vertex v contributes a count of 1 to each of id(v) and od(v).

The concepts of the in degree and the out degree for vertices now lead us to the following
                      theorem.

THEOREM 11.4          Let G = (V, E) be a directed graph or multigraph with no isolated vertices. The graph G
                      has a directed Euler circuit if and only if G is connected and id(v) = od(v) for all $v \in V$.
                      Proof: The proof of this theorem is left for the reader.

At this time we consider an application of Theorem 11.4. This example is based on a
                      telecommunication problem given by C. L. Liu on pages 176-178 of reference [23].

EXAMPLE 11.14                In Fig. 11.39(a) we have the surface of a rotating drum that is divided into eight sectors of
                             equal area. In part (b) of the figure we have placed conducting (shaded sectors and inner circle)
                             and nonconducting (unshaded sectors) material on the drum. When the three terminals
                             (shown in the figure) make contact with the three designated sectors, the nonconducting
                             material results in no flow of current and a 1 appears on the display of a digital device.
                             For the sectors with the conducting material, a flow of current takes place and a 0 appears
                             on the display in each case. If the drum were rotated 45 degrees (clockwise), the screen
                             would read 110 (from top to bottom). So we can obtain at least two (namely, 100 and 110)
                             of the eight binary representations from 000 (for 0) to 111 (for 7). But can we represent all
                             eight of them as the drum continues to rotate? And could we extend the problem to the 16
                             four-bit binary representations from 0000 through 1111, and perhaps generalize the results
                             even further?

(a)                                (b)
                                            Figure 11.39

To answer the question for the problem in the figure, we construct a directed graph
                             G = (V, E), where V = {00, 01, 10, 11} and E is constructed as follows: If $b_1b_2, b_2b_3 \in V$,
                             draw the edge $(b_1b_2, b_2b_3)$. This results in the directed graph of Fig. 11.40(a), where |E| = 8.
                             We see that this graph is connected and that for all $v \in V$, id(v) = od(v). Consequently,
                             by Theorem 11.4, it has a directed Euler circuit. One such circuit is given by

$$10 \xrightarrow{100} 00 \xrightarrow{000} 00 \xrightarrow{001} 01 \xrightarrow{010} 10 \xrightarrow{101} 01 \xrightarrow{011} 11 \xrightarrow{111} 11 \xrightarrow{110} 10.$$

Here the label on each edge e = (a, c), as shown in part (b) of Fig. 11.40, is the three-bit
                             sequence $x_1x_2x_3$, where $a = x_1x_2$ and $c = x_2x_3$. Since the vertices of G are the four distinct
                             two-bit sequences 00, 01, 10, and 11, the labels on the eight edges of G determine the eight
                             distinct three-bit sequences. Also, any two consecutive edge labels in the Euler circuit are
                             of the form $y_1y_2y_3$ and $y_2y_3y_4$.
                                 Starting with the edge label 100, in order to get the next label, 000, we concatenate the
                             last bit in 000, namely 0, to the string 100. The resulting string 1000 then provides 100
                             (its first three bits) and 000 (its last three bits). The next edge label is 001, so we concatenate
                             the 1 (the last bit in 001) to our present string 1000 and get 10001, whose consecutive
                             three-bit windows are the three distinct sequences 100, 000, and 001. Continuing in this way, we arrive at
                             the eight-bit sequence 10001011 (where the last 1 is wrapped around), and these eight bits
                             are then arranged in the sectors of the rotating drum as in Fig. 11.41. It is from this figure
                             that the result in Fig. 11.39(b) is obtained. And as the drum in Fig. 11.39(b) rotates, all of
                             the eight three-bit sequences 100, 110, 111, 011, 101, 010, 001, and 000 are obtained.
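
The construction in Example 11.14 extends to n-bit words by using the (n − 1)-bit words as vertices, which also addresses the four-bit question raised at the start of the example. The sketch below is one way to carry this out and is not the text's procedure verbatim: the names `drum_sequence` and `windows` are hypothetical, the directed Euler circuit is found with a stack-based search, and the circuit it produces need not be the one displayed above, so the resulting bit string may differ from 10001011 while having the same property.

```python
from itertools import product


def drum_sequence(n):
    """Cyclic bit string of length 2^n containing every n-bit word once (n >= 2).

    Vertices are the (n-1)-bit words; the edge from x1...x(n-1) to x2...xn
    carries the n-bit label x1...xn, as in Example 11.14.
    """
    if n < 2:
        raise ValueError("this sketch assumes n >= 2")
    vertices = ["".join(bits) for bits in product("01", repeat=n - 1)]
    # Every vertex has id(v) = od(v) = 2, so Theorem 11.4 applies.
    out = {v: [v[1:] + b for b in "01"] for v in vertices}
    stack, circuit = [vertices[0]], []
    while stack:
        v = stack[-1]
        if out[v]:
            stack.append(out[v].pop())    # follow an unused outgoing edge
        else:
            circuit.append(stack.pop())   # no edges left at v; record it
    circuit.reverse()
    # Each edge of the circuit contributes the last bit of the vertex it enters.
    return "".join(v[-1] for v in circuit[1:])


def windows(seq, n):
    """All length-n windows of seq, read cyclically."""
    doubled = seq + seq[:n - 1]
    return [doubled[i:i + n] for i in range(len(seq))]


all_words = sorted("".join(p) for p in product("01", repeat=3))
assert sorted(windows("10001011", 3)) == all_words       # the text's sequence
assert sorted(windows(drum_sequence(3), 3)) == all_words
assert len(set(windows(drum_sequence(4), 4))) == 16      # the four-bit case
```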

Figure 11.40   (a)   (b)                                Figure 11.41

In closing this section, we wish to call the reader’s attention to reference [24] by Anthony
                               Ralston. This article is a good source for more ideas and generalizations related to the
                               problem discussed in Example 11.14.

1. Determine |V| for the following graphs or multigraphs G.
    a) G has nine edges and all vertices have degree 3.
    b) G is regular with 15 edges.
    c) G has 10 edges with two vertices of degree 4 and all
    others of degree 3.
2. If G = (V, E) is a connected graph with |E| = 17 and
$\deg(v) \ge 3$ for all $v \in V$, what is the maximum value for |V|?
3. Let G = (V, E) be a connected undirected graph.
    a) What is the largest possible value for |V| if |E| = 19
    and $\deg(v) \ge 4$ for all $v \in V$?
    b) Draw a graph to demonstrate each possible case in
    part (a).
4. a) Let G = (V, E) be a loop-free undirected graph, where
    |V| = 6 and deg(v) = 2 for all $v \in V$. Up to isomorphism
    how many such graphs G are there?
    b) Answer part (a) for |V| = 7.
    c) Let $G_1 = (V_1, E_1)$ be a loop-free undirected 3-regular
    graph with $|V_1| = 6$. Up to isomorphism how many such
    graphs $G_1$ are there?
    d) Answer part (c) for $|V_1| = 7$ and $G_1$ 4-regular.
    e) Generalize the results in parts (c) and (d).
5. Let $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ be the loop-free
undirected connected graphs in Fig. 11.42.
    a) Determine $|V_1|$, $|E_1|$, $|V_2|$, and $|E_2|$.
    b) Find the degree of each vertex in $V_1$. Do likewise for
    each vertex in $V_2$.
    c) Are the graphs $G_1$ and $G_2$ isomorphic?

Figure 11.42   $G_1 = (V_1, E_1)$   $G_2 = (V_2, E_2)$

6. Let V = {a, b, c, d, e, f}. Draw three nonisomorphic
loop-free undirected graphs $G_1 = (V, E_1)$, $G_2 = (V, E_2)$, and
$G_3 = (V, E_3)$, where, in all three graphs, we have deg(a) = 3,
deg(b) = deg(c) = 2, and deg(d) = deg(e) = deg(f) = 1.
7. a) How many different paths of length 2 are there in the
    undirected graph G in Fig. 11.43?
    b) Let G = (V, E) be a loop-free undirected graph, where
    $V = \{v_1, v_2, \ldots, v_n\}$ and $\deg(v_i) = d_i$, for all $1 \le i \le n$.
    How many different paths of length 2 are there in G?

Figure 11.43

8. a) Find the number of edges in $Q_8$.
    b) Find the maximum distance between pairs of vertices
    in $Q_8$. Give an example of one such pair that achieves this
    distance.
    c) Find the length of a longest path in $Q_8$.
9. a) What is the dimension of the hypercube with 524,288
    edges?
    b) How many vertices are there for a hypercube with
    4,980,736 edges?
10. For $n \in \mathbf{Z}^+$, how many distinct (though isomorphic) paths
of length 2 are there in the n-dimensional hypercube $Q_n$?
11. Let $n \in \mathbf{Z}^+$, with $n \ge 9$. Prove that if the edges of $K_n$ can
be partitioned into subgraphs isomorphic to cycles of length
4 (where any two such cycles share no common edge), then
n = 8k + 1 for some $k \in \mathbf{Z}^+$.
12. a) For $n \ge 2$, let V denote the vertices in $Q_n$. For $1 \le k < \ell \le n$,
    define the relation $\mathcal{R}$ on V as follows: If $w, x \in V$,
    then $w \mathrel{\mathcal{R}} x$ if w and x have the same bit (0, or 1) in position
    k and the same bit (0, or 1) in position $\ell$ of their binary labels.
    [For example, if n = 7 and k = 3, $\ell = 6$, then 1100010
    $\mathcal{R}$ 0000011.] Show that $\mathcal{R}$ is an equivalence relation. How
    many blocks are there for this equivalence relation? How
    many vertices are there in each block? Describe the subgraph
    of $Q_n$ induced by the vertices in each block.
    b) Generalize the results of part (a).
13. If G is an undirected graph with n vertices and e edges, let
$\delta = \min_{v \in V} \{\deg(v)\}$ and let $\Delta = \max_{v \in V} \{\deg(v)\}$. Prove that
$\delta \le 2(e/n) \le \Delta$.
14. Let G = (V, E), H = (V', E') be undirected graphs with
$f: V \to V'$ establishing an isomorphism between the graphs.
(a) Prove that $f^{-1}: V' \to V$ is also an isomorphism for G and
H. (b) If $a \in V$, prove that deg(a) (in G) = deg(f(a)) (in H).
15. For all $k \in \mathbf{Z}^+$ where $k \ge 2$, prove that there exists a loop-free
connected undirected graph G = (V, E), where |V| = 2k
and deg(v) = 3 for all $v \in V$.
16. Prove that for each $n \in \mathbf{Z}^+$ there exists a loop-free connected
undirected graph G = (V, E), where |V| = 2n and
which has two vertices of degree i for every $1 \le i \le n$.
17. Complete the proofs of Corollaries 11.1 and 11.2.
18. Let k be a fixed positive integer and let G = (V, E) be
a loop-free undirected graph, where $\deg(v) \ge k$ for all $v \in V$.
Prove that G contains a path of length k.
19. a) Explain why it is not possible to draw a loop-free connected
    undirected graph with eight vertices, where the degrees
    of the vertices are 1, 1, 1, 2, 3, 4, 5, and 7.
    b) Give an example of a loop-free connected undirected
    multigraph with eight vertices, where the degrees of the
    vertices are 1, 1, 1, 2, 3, 4, 5, and 7.
20. a) Find an Euler circuit for the graph in Fig. 11.44.
    b) If the edge {d, e} is removed from this graph, find an
    Euler trail for the resulting subgraph.

Figure 11.44

21. Determine the value(s) of n for which the complete graph
$K_n$ has an Euler circuit. For which n does $K_n$ have an Euler trail
but not an Euler circuit?
22. For the graph in Fig. 11.37(b), what is the smallest number
of bridges that must be removed so that the resulting subgraph
has an Euler trail but not an Euler circuit? Which bridge(s)
should we remove?
23. When visiting a chamber of horrors, Paul and David try to
figure out whether they can travel through the seven rooms and
surrounding corridor of the attraction without passing through
any door more than once. If they must start from the starred position
in the corridor shown in Fig. 11.45, can they accomplish
their goal?
24. Let G = (V, E) be a directed graph, where |V| = n and
|E| = e. What are the values for $\sum_{v \in V} id(v)$ and $\sum_{v \in V} od(v)$?
25. a) Find the maximum length of a trail in
        i) $K_6$        ii) $K_8$        iii) $K_{10}$        iv) $K_{2n}$, $n \in \mathbf{Z}^+$

Figure 11.45

    b) Find the maximum length of a circuit in
          i) K_6      ii) K_8      iii) K_10      iv) K_{2n}, n ∈ Z^+

26. a) Let G = (V, E) be a directed graph or multigraph with no isolated vertices. Prove
    that G has a directed Euler circuit if and only if G is connected and od(v) = id(v) for
    all v ∈ V.
    b) A directed graph is called strongly connected if there is a directed path from a to b
    for all vertices a, b, where a ≠ b. Prove that if a directed graph has a directed Euler
    circuit, then it is strongly connected. Is the converse true?

27. Let G be a directed graph on n vertices. If the associated undirected graph for G is
K_n, prove that Σ_{v∈V} [od(v)]^2 = Σ_{v∈V} [id(v)]^2.

28. If G = (V, E) is a directed graph or multigraph with no isolated vertices, prove that G
has a directed Euler trail if and only if (i) G is connected; (ii) od(v) = id(v) for all but
two vertices x, y in V; and (iii) od(x) = id(x) + 1, id(y) = od(y) + 1.

29. Let V = {000, 001, 010, ..., 110, 111}. For each four-bit sequence b_1 b_2 b_3 b_4 draw an
edge from the element b_1 b_2 b_3 to the element b_2 b_3 b_4 in V. (a) Draw the graph
G = (V, E) as described. (b) Find a directed Euler circuit for G. (c) Equally space eight 0's
and eight 1's around the edge of a rotating (clockwise) drum so that these 16 bits form a
circular sequence where the (consecutive) subsequences of length 4 provide the binary
representations of 0, 1, 2, ..., 14, 15 in some order.

30. Carolyn and Richard attended a party with three other married couples. At this party a
good deal of handshaking took place, but (1) no one shook hands with her or his spouse;
(2) no one shook hands with herself or himself; and (3) no one shook hands with anyone more
than once. Before leaving the party, Carolyn asked the other seven people how many hands
she or he had shaken. She received a different answer from each of the seven. How many
times did Carolyn shake hands at this party? How many times did Richard?

31. Let G = (V, E) be a loop-free connected undirected graph with |V| ≥ 2. Prove that G
contains two vertices v, w, where deg(v) = deg(w).

32. If G = (V, E) is an undirected graph with |V| = n and |E| = k, the following matrices
are used to represent G.
    Let V = {v_1, v_2, ..., v_n}. Define the adjacency matrix A = (a_{ij})_{n×n} where
a_{ij} = 1 if {v_i, v_j} ∈ E, otherwise a_{ij} = 0.
    If E = {e_1, e_2, ..., e_k}, the incidence matrix I is the n × k matrix (b_{ij})_{n×k}
where b_{ij} = 1 if v_i is a vertex on the edge e_j, otherwise b_{ij} = 0.
    a) Find the adjacency and incidence matrices associated with the graph in Fig. 11.46.
    b) Calculating A^2 and using the Boolean operations where 0 + 0 = 0,
    0 + 1 = 1 + 0 = 1 + 1 = 1, and 0 · 0 = 0 · 1 = 1 · 0 = 0, 1 · 1 = 1, prove that the entry
    in row i and column j of A^2 is 1 if and only if there is a walk of length 2 between the
    ith and jth vertices of V.
    c) If we calculate A^2 using ordinary addition and multiplication, what do the entries in
    the matrix reveal about G?
    d) What is the column sum for each column of A? Why?
    e) What is the column sum for each column of I? Why?

Figure 11.46

33. Determine whether or not the loop-free undirected graphs with the following adjacency
matrices are isomorphic.

    a)  [0 0 1]      [0 1 1]
        [0 0 1]  ,   [1 0 0]
        [1 1 0]      [1 0 0]

    b)  [0 1 0 1]      [0 1 1 1]
        [1 0 1 1]  ,   [1 0 1 0]
        [0 1 0 1]      [1 1 0 1]
        [1 1 1 0]      [1 0 1 0]

    c)  [0 1 1 1]      [0 1 0 1]
        [1 0 1 0]  ,   [1 0 1 0]
        [1 1 0 0]      [0 1 0 1]
        [1 0 0 0]      [1 0 1 0]

34. Determine whether or not the loop-free undirected graphs with the following incidence
matrices are isomorphic.

    a)  [1 0 1]      [0 1 1]
        [0 1 1]  ,   [1 1 0]
        [1 1 0]      [1 0 1]

    b)  [1 0 1 1]      [1 0 0 1]
        [1 1 0 0]  ,   [1 1 0 0]
        [0 1 1 0]      [0 1 1 0]
        [0 0 0 1]      [0 0 1 1]

    c)  [0 0 0 1]      [1 1 0 0]
        [1 1 0 1]  ,   [0 1 1 0]
        [1 0 1 0]      [0 0 0 1]
        [0 1 1 0]      [1 0 1 1]

35. There are 15 people at a party. Is it possible for each of these people to shake hands
with (exactly) three others?

36. Consider the two-by-four grid in Fig. 11.34. Assign the partial Gray code
A = {00, 01, 11} to the three horizontal levels: top (00), middle (01), and bottom (11). Now
assign the partial Gray code B = {000, 001, 011, 010, 110} to the five vertical levels: left,
or first (000), second (001), third (011), fourth (010), and right, or fifth (110). Use the
elements of A × B to label the 15 processors of this grid; for example, p_1 is labeled
(00, 000), p_2 is labeled (00, 001), p_8 is labeled (01, 011), p_14 is labeled (11, 010), and
p_15 is labeled (11, 110). Show that the two-by-four grid is isomorphic to a subgraph of the
hypercube Q_5. (Thus we can consider the two-by-four grid to be embedded in the hypercube
Q_5.)

37. Prove that the three-by-three grid of Fig. 11.34 is isomorphic to a subgraph of the
hypercube Q_4.

11.4 Planar Graphs
                                         On a road map the lines indicating the roads and highways usually intersect only at junctions
                                         or towns. But sometimes roads seem to intersect when one road is located above another,
                                         as in the case of an overpass. In this case the two roads are at different levels, or planes.
                                         This type of situation leads us to the following definition.

Definition 11.17                  A graph (or multigraph) G is called planar if G can be drawn in the plane with its edges
                                         intersecting only at vertices of G. Such a drawing of G is called an embedding of G in the
                                         plane.

EXAMPLE 11.15      The graphs in Fig. 11.47 are planar. The first is a 3-regular graph, because each vertex has
                   degree 3; it is planar because no edges intersect except at the vertices. In graph (b) it appears
                   that we have a nonplanar graph; the edges {x, z} and {w, y} overlap at a point other than a
                   vertex. However, we can redraw this graph as shown in part (c) of the figure. Consequently,
                   K_4 is planar.

                   Figure 11.47 (parts (a), (b), and (c))

EXAMPLE 11.16      Just as K_4 is planar, so are the graphs K_1, K_2, and K_3.
                      An attempt to embed K_5 in the plane is shown in Fig. 11.48. If K_5 were planar, then any
                   embedding would have to contain the pentagon in part (a) of the figure. Since a complete
                   graph contains an edge for every pair of distinct vertices, we add edge {a, c} as shown in
                   part (b). This edge is contained entirely within the interior of the pentagon in part (a). (We
                   could have drawn the edge in the exterior region determined by the pentagon. The reader
                   will be asked in the exercises to show that the same conclusion arises in this case.) Moving
                   to part (c), we add in the edges {a, d}, {c, e}, and {b, e}. Now we consider the vertices b and
                   d. We need the edge {b, d} in order to have K_5. Vertex d is inside the region formed by the
                   cycle edges {a, c}, {c, e}, and {e, a}, whereas b is outside the region. Thus in drawing the
                   edge {b, d}, we must intersect one of the existing edges at least once, as shown by the dotted
                   edges in part (d). Consequently, K_5 is nonplanar. (Since this proof appeals to a diagram, it
                   definitely lacks rigor. However, later in the section we shall prove that K_5 is nonplanar by
                   another method.)

                   Figure 11.48 (parts (a) through (d))

Before we can characterize all nonplanar graphs we need to examine another class of
                   graphs.

Definition 11.18   A graph G = (V, E) is called bipartite if V = V_1 ∪ V_2 with V_1 ∩ V_2 = ∅, and every edge
                   of G is of the form {a, b} with a ∈ V_1 and b ∈ V_2. If each vertex in V_1 is joined with every
                   vertex in V_2, we have a complete bipartite graph. In this case, if |V_1| = m, |V_2| = n, the
                   graph is denoted by K_{m,n}.
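
The definition can be made concrete with a minimal Python sketch. The function names and the vertex names a1, a2, ... and b1, b2, ... are hypothetical, chosen only for illustration; edges are stored as two-element frozensets.

```python
from itertools import product

def complete_bipartite(m, n):
    """Build K_{m,n}: return (V1, V2, E) with every vertex of V1 joined to every vertex of V2."""
    v1 = [f"a{i}" for i in range(1, m + 1)]   # hypothetical vertex names
    v2 = [f"b{j}" for j in range(1, n + 1)]
    edges = {frozenset(pair) for pair in product(v1, v2)}
    return v1, v2, edges

def is_bipartition(v1, v2, edges):
    """Check Definition 11.18: V1 and V2 are disjoint and each edge has one endpoint in each part."""
    s1, s2 = set(v1), set(v2)
    if s1 & s2:
        return False
    return all(len(e & s1) == 1 and len(e & s2) == 1 for e in edges)

v1, v2, edges = complete_bipartite(3, 3)
print(len(edges))                      # 9, that is, m * n edges
print(is_bipartition(v1, v2, edges))   # True
```

The check only tests a proposed partition; it does not search for one.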

EXAMPLE 11.17      Figure 11.49 indicates how we may partition the vertices of the hypercubes Q_1, Q_2, Q_3 to
                   demonstrate that these graphs are bipartite. In general, for each n ≥ 1, partition the vertices
                   of Q_n as V_1 ∪ V_2, where V_1 consists of all vertices whose binary labels have an even number
                   of 1's, while V_2 consists of those whose binary labels have an odd number of 1's. Could
                   there exist an edge {x, y} in Q_n where x, y ∈ V_1? Recall that edges in Q_n connect vertices
                   that differ in exactly one of the n positions in their binary labels. Suppose that the binary
                   labels of x, y differ only in position i, for some 1 ≤ i ≤ n. Then the total number of 1's
                   in the binary labels for x, y is 2 · [the number of 1's in x (or y) in all positions other than
                   position i] + 1, an odd total. But with x, y ∈ V_1, their binary labels each contain an even
                   number of 1's, so the total number of 1's in these binary labels is even! This contradiction
                   tells us that there is no edge {x, y} in Q_n where x, y ∈ V_1. A similar argument can be given
                   to rule out the possibility of an edge {u, w}, where u, w ∈ V_2. Consequently, Q_n is bipartite
                   for all n ≥ 1.

                   Figure 11.49
                      (Q_1)  V_1 = {0},                   V_2 = {1}
                      (Q_2)  V_1 = {00, 11},              V_2 = {01, 10}
                      (Q_3)  V_1 = {000, 011, 101, 110},  V_2 = {001, 010, 100, 111}
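
The parity argument of Example 11.17 can be checked by brute force for small n. The following is a sketch only; the function names are hypothetical, and the labels of Q_n are represented as strings of 0's and 1's.

```python
from itertools import product

def hypercube_edges(n):
    """Return the edges of Q_n as pairs of n-bit label strings."""
    labels = ["".join(bits) for bits in product("01", repeat=n)]
    edges = []
    for x in labels:
        for i in range(n):
            if x[i] == "0":  # flip a 0 to a 1 so that each edge is listed exactly once
                y = x[:i] + "1" + x[i + 1:]
                edges.append((x, y))
    return edges

def parity_partition_is_bipartition(n):
    """Check that every edge of Q_n joins a label with an even number of 1's
    to a label with an odd number of 1's."""
    return all(x.count("1") % 2 != y.count("1") % 2 for x, y in hypercube_edges(n))

for n in range(1, 7):
    assert len(hypercube_edges(n)) == n * 2 ** (n - 1)   # Q_n has n * 2^(n-1) edges
    assert parity_partition_is_bipartition(n)
```

A finite check of this kind illustrates the result for n ≤ 6; the proof in the example is what covers every n ≥ 1.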

Figure 11.50 shows two bipartite graphs. The graph in part (a) satisfies the definition
                                for V_1 = {a, b} and V_2 = {c, d, e}. If we add the edges {b, d} and {b, c}, the result is
                                the complete bipartite graph K_{2,3}, which is planar. Graph (b) of the figure is K_{3,3}. Let
                                V_1 = {h_1, h_2, h_3} and V_2 = {u_1, u_2, u_3}, and interpret V_1 as a set of houses and V_2 as a set
                                of utilities. Then K_{3,3} is called the utility graph. Can we hook up each of the houses with
                                each of the utilities and avoid having overlapping utility lines? In Fig. 11.50(b) it appears
                                that this is not possible and that K_{3,3} is nonplanar. (Once again we deduce the nonplanarity
                                of a graph from a diagram. However, we shall verify that K_{3,3} is nonplanar by another
                                method, later in Example 11.21 of this section.)

                                Figure 11.50 (parts (a) and (b))

We shall see that when we are dealing with nonplanar graphs, either K_5 or K_{3,3} will be
                                the source of the problem. Before stating the general result, however, we need to develop
                                one final new idea.

Definition 11.19          Let G = (V, E) be a loop-free undirected graph, where E ≠ ∅. An elementary subdivision
                                of G results when an edge e = {u, w} is removed from G and then the edges {u, v}, {v, w}
                                are added to G - e, where v ∉ V.
                                   The loop-free undirected graphs G_1 = (V_1, E_1) and G_2 = (V_2, E_2) are called
                                homeomorphic if they are isomorphic or if they can both be obtained from the same loop-free
                                undirected graph H by a sequence of elementary subdivisions.

EXAMPLE 11.18      a) Let G = (V, E) be a loop-free undirected graph with |E| ≥ 1. If G' is obtained from
                      G by an elementary subdivision, then the graph G' = (V', E') satisfies |V'| = |V| + 1
                      and |E'| = |E| + 1.
                   b) Consider the graphs G, G_1, G_2, and G_3 in Fig. 11.51. Here G_1 is obtained from G
                      by means of one elementary subdivision: Delete edge {a, b} from G and then add
                      the edges {a, w} and {w, b}. The graph G_2 is obtained from G by two elementary
                      subdivisions. Hence G_1 and G_2 are homeomorphic. Also, G_3 can be obtained from G
                      by four elementary subdivisions, so G_3 is homeomorphic to both G_1 and G_2.

                   Figure 11.51: (a) G, (b) G_1, (c) G_2, (d) G_3

                      However, we cannot obtain G_1 from G_2 (or G_2 from G_1) by a sequence of elementary
                      subdivisions. Furthermore, the graph G_3 can be obtained from either G_1 or G_2
                      by a sequence of elementary subdivisions: six (such sequences of three elementary
                      subdivisions) for G_1 and two for G_2. But neither G_1 nor G_2 can be obtained from G_3
                      by a sequence of elementary subdivisions.
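
An elementary subdivision is a small set operation, and the counts in part (a) of Example 11.18 can be confirmed directly. The sketch below is illustrative only: the function name is hypothetical, and the starting graph is a hypothetical 4-cycle on a, b, d, e, not the graph G of Fig. 11.51.

```python
def elementary_subdivision(vertices, edges, edge, new_vertex):
    """Remove edge {u, w}, then add {u, v} and {v, w} for a new vertex v not in V.

    `vertices` is a frozenset of vertex names, `edges` a frozenset of two-element
    frozensets (the graph is loop-free). A new (V', E') pair is returned.
    """
    edge = frozenset(edge)
    if edge not in edges:
        raise ValueError("edge is not in the graph")
    if new_vertex in vertices:
        raise ValueError("the new vertex must not already belong to V")
    u, w = tuple(edge)
    new_vertices = vertices | {new_vertex}
    new_edges = (edges - {edge}) | {frozenset({u, new_vertex}), frozenset({new_vertex, w})}
    return new_vertices, new_edges

# Hypothetical starting graph: the 4-cycle a - b - d - e - a.
V = frozenset("abde")
E = frozenset(frozenset(p) for p in [("a", "b"), ("b", "d"), ("d", "e"), ("e", "a")])

V1, E1 = elementary_subdivision(V, E, ("a", "b"), "w")
assert len(V1) == len(V) + 1 and len(E1) == len(E) + 1
assert frozenset({"a", "b"}) not in E1
assert frozenset({"a", "w"}) in E1 and frozenset({"w", "b"}) in E1
```

Each call adds one vertex of degree 2, which is why homeomorphic graphs may be thought of as isomorphic except, possibly, for vertices of degree 2.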

One may think of homeomorphic graphs as being isomorphic except, possibly, for ver-
                   tices of degree 2. In particular, if two graphs are homeomorphic, they are either both planar
                   or they are both nonplanar.
                          These preliminaries lead us to the following result.

THEOREM 11.5       Kuratowski's Theorem. A graph is nonplanar if and only if it contains a subgraph that is
                   homeomorphic to either K_5 or K_{3,3}.
                   Proof: (This theorem was first proved by the Polish mathematician Kasimir Kuratowski in
                   1930.) If a graph G has a subgraph homeomorphic to either K_5 or K_{3,3}, it is clear that G
                   is nonplanar. The converse of this theorem, however, is much more difficult to prove. (A
                   proof can be found in Chapter 8 of C. L. Liu [23] or Chapter 6 of D. B. West [32].)

We demonstrate the use of Kuratowski’s Theorem in the following example.

EXAMPLE 11.19      a) Figure 11.52(a) is a familiar graph called the Petersen graph. Part (b) of the figure
                      provides a subgraph of the Petersen graph that is homeomorphic to K_{3,3}. (Figure 11.53
                      shows how the subgraph is obtained from K_{3,3} by a sequence of four elementary
                      subdivisions.) Hence the Petersen graph is nonplanar.
                   b) In part (a) of Fig. 11.54 we find the 3-regular graph G, which is isomorphic to the
                      3-dimensional hypercube Q_3. The 4-regular complement of G, denoted \overline{G}, is shown
                      in Fig. 11.54(b), where the edges {a, g} and {d, f} suggest that \overline{G} may be
                      nonplanar. Figure 11.54(c) depicts a subgraph H of \overline{G} that is homeomorphic to
                      K_5, so by Kuratowski's Theorem it follows that \overline{G} is nonplanar.

                   Figure 11.52 (parts (a) and (b))

                   Figure 11.53 (parts (i) through (v))

                   Figure 11.54: (a) G (isomorphic to Q_3), (b) \overline{G}, (c) H

When a graph or multigraph is planar and connected, we find the following relation,
                         which was discovered by Euler. For this relation we need to be able to count the number
                         of regions determined by a planar connected graph or multigraph, the number (of these
                         regions) being defined only when we have a planar embedding of the graph. For instance,
                         the planar embedding of K_4 in part (a) of Fig. 11.55 demonstrates how this depiction of K_4
                         determines four regions in the plane: three of finite area (namely, R_1, R_2, and R_3) and
                         the infinite region R_4. When we look at Fig. 11.55(b) we might think that here K_4 determines
                         five regions, but this depiction does not present a planar embedding of K_4. So the result in
                         Fig. 11.55(a) is the only one we actually want to deal with here.

                         Figure 11.55 (parts (a) and (b); part (a) shows the regions R_1, R_2, R_3, and R_4)

THEOREM 11.6   Let G = (V, E) be a connected planar graph or multigraph with |V| = v and |E| = e. Let r
               be the number of regions in the plane determined by a planar embedding (or, depiction) of
               G; one of these regions has infinite area and is called the infinite region. Then v - e + r = 2.
               Proof: The proof is by induction on e. If e = 0 or 1, then G is isomorphic to one of the graphs in
               Fig. 11.56. The graph in part (a) has v = 1, e = 0, and r = 1; so, v - e + r = 1 - 0 + 1 = 2.
               For graph (b), v = 1, e = 1, and r = 2. Graph (c) has v = 2, e = 1, and r = 1. In both cases,
               v - e + r = 2.

               Figure 11.56 (parts (a), (b), and (c))

Now let k ∈ N and assume that the result is true for every connected planar graph or
               multigraph with e edges, where 0 ≤ e ≤ k. If G = (V, E) is a connected planar graph or
               multigraph with v vertices, r regions, and e = k + 1 edges, let a, b ∈ V with {a, b} ∈ E.
               Consider the subgraph H of G obtained by deleting the edge {a, b} from G. (If G is a
               multigraph and {a, b} is one of a set of edges between a and b, then we remove it only
               once.) Consequently, we may write H = G - {a, b} or G = H + {a, b}. We consider the
               following two cases, depending on whether H is connected or disconnected.

Case 1: The results in parts (a), (b), (c), and (d) of Fig. 11.57 show us how a graph G may be
               obtained from a connected graph H when the (new) loop {a, a} is drawn as in parts (a) and
               (b) or when the (new) edge {a, b} joins two distinct vertices in H as in parts (c) and (d). In all
               of these situations, H has v vertices, k edges, and r - 1 regions because one of the regions
               for H is split into two regions for G. The induction hypothesis applied to graph H tells us
               that v - k + (r - 1) = 2, and from this it follows that 2 = v - (k + 1) + r = v - e + r.
               So Euler's Theorem is true for G in this case.

Figure 11.57

Case 2: Now we consider the case where G - {a, b} = H is a disconnected graph [as
                          demonstrated in Fig. 11.57(e) and (f)]. Here H has v vertices, k edges, and r regions. Also,
                          H has two components H_1 and H_2, where H_i has v_i vertices, e_i edges, and r_i regions,
                          for i = 1, 2. [Part (e) of Fig. 11.57 indicates that one component could consist of just
                          an isolated vertex.] Furthermore, v_1 + v_2 = v, e_1 + e_2 = k (= e - 1), and r_1 + r_2 = r + 1
                          because each of H_1 and H_2 determines an infinite region. When we apply the induction
                          hypothesis to each of H_1 and H_2 we learn that

                              v_1 - e_1 + r_1 = 2    and    v_2 - e_2 + r_2 = 2.

                          Consequently, (v_1 + v_2) - (e_1 + e_2) + (r_1 + r_2) = v - (e - 1) + (r + 1) = 4, and from
                          this it follows that v - e + r = 2, thus establishing Euler's Theorem for G in this case.
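
The arithmetic of Euler's Theorem, including the bookkeeping of Case 2, can be replayed on the (v, e, r) triples stated in the text. This is a sketch under stated assumptions: the function names are hypothetical, the code only manipulates counts and does not construct embeddings, and the triple for K_4 uses the standard fact that K_4 has six edges.

```python
def euler_regions(v, e):
    """Number of regions r forced by v - e + r = 2 for a connected planar graph or multigraph."""
    return e - v + 2

def join_components(h1, h2):
    """Case 2 of the proof: join two connected planar pieces H_1, H_2 by one new edge.

    Each argument is a (v_i, e_i, r_i) triple. The new edge adds 1 to the edge count,
    and the two infinite regions merge, so the region count is r_1 + r_2 - 1.
    """
    (v1, e1, r1), (v2, e2, r2) = h1, h2
    return v1 + v2, e1 + e2 + 1, r1 + r2 - 1

# (v, e, r) triples stated in the text.
cases = {
    "single vertex, Fig. 11.56(a)": (1, 0, 1),
    "single loop, Fig. 11.56(b)": (1, 1, 2),
    "single edge, Fig. 11.56(c)": (2, 1, 1),
    "K_4, Fig. 11.55(a)": (4, 6, 4),
}
for name, (v, e, r) in cases.items():
    assert euler_regions(v, e) == r, name

# Joining an isolated vertex to K_4 by one edge keeps v - e + r = 2.
v, e, r = join_components((1, 0, 1), (4, 6, 4))
assert (v, e, r) == (5, 7, 4) and v - e + r == 2
```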

The following corollary for Theorem 11.6 provides two inequalities relating the number
                          of edges in a loop-free connected planar graph G with (1) the number of regions determined
                          by a planar embedding of G; and (2) the number of vertices in G. Before we examine this
                          corollary, however, let us look at the following helpful idea. For each region R in a planar
                          embedding of a (planar) graph or multigraph, the degree of R, denoted deg(R), is the number
                          of edges traversed in a (shortest) closed walk about (the edges in) the boundary of R. If
                          G = (V, E) is the graph of Fig. 11.58(a), then this planar embedding of G has four regions
                          where

                              deg(R_1) = 5,    deg(R_2) = 3,    deg(R_3) = 3,    deg(R_4) = 7.

                          [Here deg(R_4) = 7, as determined by the closed walk: a → b → g → h → g → f → d → a.]
                          Part (b) of the figure shows a second planar embedding of G, again with four regions,
                          and here

                              deg(R_5) = 4,    deg(R_6) = 3,    deg(R_7) = 5,    deg(R_8) = 6.

                          [The closed walk b → g → h → g → f → b gives us deg(R_7) = 5.]
                             We see that Σ_{i=1}^{4} deg(R_i) = 18 = Σ_{i=5}^{8} deg(R_i) = 2 · 9 = 2|E|. This is true
                          in general because each edge of the planar embedding is either part of the boundary of two
                          regions [like {b, c} in parts (a) and (b)] or occurs twice in the closed walk about the edges
                          in the boundary for one region [like {g, h} in parts (a) and (b)].

                          Figure 11.58 (parts (a) and (b))
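
The region degrees quoted for Fig. 11.58 can be tallied mechanically. In this small sketch the helper names are hypothetical; the two closed walks and the degree lists are the ones given in the text, and the walks are written as vertex sequences.

```python
def walk_length(walk):
    """Number of edges traversed by a closed walk given as a vertex sequence."""
    return len(walk) - 1

def traversals(walk, edge):
    """How many times the undirected edge {x, y} is traversed by the walk."""
    x, y = edge
    return sum(1 for p, q in zip(walk, walk[1:]) if {p, q} == {x, y})

walk_r4 = ["a", "b", "g", "h", "g", "f", "d", "a"]   # boundary walk for R_4
walk_r7 = ["b", "g", "h", "g", "f", "b"]             # boundary walk for R_7

degrees_a = [5, 3, 3, 7]   # deg(R_1), ..., deg(R_4)
degrees_b = [4, 3, 5, 6]   # deg(R_5), ..., deg(R_8)
num_edges = 9              # |E|

assert walk_length(walk_r4) == 7 and walk_length(walk_r7) == 5
assert traversals(walk_r4, ("g", "h")) == 2   # {g, h} is used twice by one region
assert sum(degrees_a) == sum(degrees_b) == 2 * num_edges
```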
                     Now let us consider the following.

COROLLARY 11.3    Let G = (V, E) be a loop-free connected planar graph with |V| = v, |E| = e > 2, and r
                  regions. Then 3r ≤ 2e and e ≤ 3v - 6.
                  Proof: Since G is loop-free and is not a multigraph, the boundary of each region (including
                  the infinite region) contains at least three edges; hence, each region has degree ≥ 3.
                  Consequently, 2e = 2|E| = the sum of the degrees of the r regions determined by G and
                  2e ≥ 3r. From Euler's Theorem, 2 = v - e + r ≤ v - e + (2/3)e = v - (1/3)e, so
                  6 ≤ 3v - e, or e ≤ 3v - 6.

We now consider what this corollary does and does not imply. If G = (V, E) is a loop-free
                  connected graph with |E| > 2, then if e > 3v - 6, it follows that G is not planar.
                  However, if e ≤ 3v - 6, we cannot conclude that G is planar.

EXAMPLE 11.20     The graph K_5 is loop-free and connected with ten edges and five vertices. Consequently,
                  3v - 6 = 15 - 6 = 9 < 10 = e. Therefore, by Corollary 11.3, we find that K_5 is nonplanar.
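
The one-sided nature of Corollary 11.3 can be seen in a short sketch. The function name is hypothetical; a return value of True certifies nonplanarity for a loop-free connected graph, while False settles nothing. The loop uses the standard count of n(n - 1)/2 edges for K_n.

```python
from math import comb

def exceeds_planar_edge_bound(v, e):
    """Return True when e > 3v - 6.

    For a loop-free connected graph with e > 2 edges, Corollary 11.3 then says
    the graph is not planar. A result of False is inconclusive.
    """
    if e <= 2:
        raise ValueError("Corollary 11.3 assumes e > 2")
    return e > 3 * v - 6

for n in range(4, 8):
    e = comb(n, 2)   # K_n has n(n - 1)/2 edges
    print(n, e, 3 * n - 6, exceeds_planar_edge_bound(n, e))

print(exceeds_planar_edge_bound(5, 10))   # True: K_5 has 10 > 9 edges
print(exceeds_planar_edge_bound(6, 9))    # False: inconclusive for a graph with v = 6, e = 9
```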

EXAMPLE 11.21     The graph K3,3 is loop-free and connected with nine edges and six vertices. Here 3v - 6 =
                  18 - 6 = 12 > 9 = e. It would be a mistake to conclude from this that K3,3 is planar. It
                  would be the mistake of arguing by the converse.
                     However, K3,3 is nonplanar. If K3,3 were planar, then since each region in the graph is
                  bounded by at least four edges (K3,3 is bipartite, so it contains no cycle of length 3),
                  we have 4r ≤ 2e. (We found a similar situation in the proof of Corollary 11.3.) From
                  Euler’s Theorem, v - e + r = 2, or r = e - v + 2 = 9 - 6 + 2 = 5,
                  so 20 = 4r ≤ 2e = 18. From this contradiction we have K3,3 being nonplanar.
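
The counting arguments of Examples 11.20 and 11.21 can be checked mechanically. The following Python sketch is an illustration only and is not part of the text: the function name `violates_edge_bound` and the parameter `min_region_degree` are names assumed here. If every region is bounded by at least k edges, then kr ≤ 2e, and Euler’s Theorem gives e ≤ k(v - 2)/(k - 2); the case k = 3 is Corollary 11.3, and k = 4 is the bound used above for K3,3.

```python
from fractions import Fraction

def violates_edge_bound(v, e, min_region_degree=3):
    """Return True when e exceeds k(v - 2)/(k - 2), so the graph cannot be planar.

    Assumes a loop-free connected graph with e > 2 in which every region of any
    planar embedding would be bounded by at least k = min_region_degree edges.
    A False result does NOT show that the graph is planar.
    """
    k = min_region_degree
    if k < 3:
        raise ValueError("min_region_degree must be at least 3")
    bound = Fraction(k * (v - 2), k - 2)
    return e > bound

# K5: v = 5, e = 10, and the bound is 3v - 6 = 9.
print(violates_edge_bound(5, 10))                       # True
# K3,3: v = 6, e = 9; with k = 3 the bound is 12, so the test is inconclusive.
print(violates_edge_bound(6, 9))                        # False
# With k = 4 the bound is 2v - 4 = 8, and 9 > 8.
print(violates_edge_bound(6, 9, min_region_degree=4))   # True
```

A result of False only means that this particular count does not rule out planarity, which is exactly the caution about arguing by the converse.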

EXAMPLE 11.22     We use Euler’s Theorem to characterize the Platonic solids. [For these solids all faces are
                  congruent and all (interior) solid angles are equal.] In Fig. 11.59 we have two of these
                  solids. Part (a) of the figure shows the regular tetrahedron, which has four faces, each an
                  equilateral triangle. Concentrating on the edges of the tetrahedron, we focus on its underlying
                  framework. As we view this framework from a point directly above the center of one of the
                  faces, we picture the planar representation in part (b). This planar graph determines four
                  regions (corresponding to the four faces); three regions meet at each of the four vertices.
                  Part (c) of the figure provides another Platonic solid, the cube. Its associated planar graph
                  is given in part (d). In this graph there are six regions with three regions meeting at each
                  vertex.
548   Chapter 11   An Introduction to Graph Theory

                                 Figure 11.59 [parts (a), (b), (c), (d)]

                        On the basis of our observations for the regular tetrahedron and the cube, we shall
                        determine the other Platonic solids by means of their associated planar graphs. In these
                        graphs G = (V, E) we have v = |V|; e = |E|; r = the number of planar regions determined
                        by G; m = the number of edges in the boundary of each region; and n = the number of
                        regions that meet at each vertex. Thus the constants m, n ≥ 3. Since each edge is used in the
                        boundary of two regions and there are r regions, each with m edges, it follows that 2e = mr.
                        Counting the endpoints of the edges, we get 2e. But all these endpoints can also be counted
                        by considering what happens at each vertex. Since n regions meet at each vertex, n edges
                        meet there, so there are n endpoints of edges to count at each of the v vertices. This totals
                        nv endpoints of edges, so 2e = nv. From Euler’s Theorem we have

                            0 < 2 = v - e + r = 2e/n - e + 2e/m = e(2m - mn + 2n)/(mn).

                        With e, m, n > 0, we find that

                            2m - mn + 2n > 0  ⟹  mn - 2m - 2n < 0
                                              ⟹  mn - 2m - 2n + 4 < 4
                                              ⟹  (m - 2)(n - 2) < 4.

                        Since m, n ≥ 3, we have (m - 2), (n - 2) ∈ Z+, and there are only five cases to consider:
                            1) (m - 2) = (n - 2) = 1;        m = n = 3       (The regular tetrahedron)
                            2) (m - 2) = 2, (n - 2) = 1;     m = 4, n = 3    (The cube)
                            3) (m - 2) = 1, (n - 2) = 2;     m = 3, n = 4    (The octahedron)
                            4) (m - 2) = 3, (n - 2) = 1;     m = 5, n = 3    (The dodecahedron)
                            5) (m - 2) = 1, (n - 2) = 3;     m = 3, n = 5    (The icosahedron)

The planar graphs for cases 3-5 are shown in Fig. 11.60.
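
The five cases can also be recovered by brute force. The Python sketch below is an illustration under the relations derived above (2e = mr, 2e = nv, and 2 = e(2m - mn + 2n)/(mn)); the function name `platonic_parameters` is an assumption introduced here, not notation from the text. It lists every pair m, n ≥ 3 with (m - 2)(n - 2) < 4 together with the resulting v, e, r.

```python
from fractions import Fraction

def platonic_parameters():
    """List (m, n, v, e, r) for all integers m, n >= 3 with (m - 2)(n - 2) < 4."""
    solutions = []
    # (m - 2)(n - 2) < 4 with m, n >= 3 forces m, n <= 5, so a small range suffices.
    for m in range(3, 7):
        for n in range(3, 7):
            if (m - 2) * (n - 2) >= 4:
                continue
            # Solve 2 = e(2m - mn + 2n)/(mn) for e.
            e = Fraction(2 * m * n, 2 * m - m * n + 2 * n)
            v = 2 * e / n   # from 2e = nv
            r = 2 * e / m   # from 2e = mr
            if e.denominator == v.denominator == r.denominator == 1:
                solutions.append((m, n, int(v), int(e), int(r)))
    return solutions

for m, n, v, e, r in platonic_parameters():
    print(f"m={m} n={n}: v={v} e={e} r={r}, v - e + r = {v - e + r}")
```

Each printed line satisfies v - e + r = 2; for instance m = 5, n = 3 gives v = 20, e = 30, r = 12, the counts for the dodecahedron.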

                        Figure 11.60 [Octahedron, Dodecahedron, Icosahedron]
                                                                                    11.4 Planar Graphs     549

The last idea we shall discuss for planar graphs is the notion of a dual graph. This
                   concept is also valid for planar graphs with loops and for planar multigraphs. To construct
                   a dual (relative to a particular embedding) for a planar graph or multigraph G with V =
                   {a, b, c,d, e, f}, place a point (vertex) inside each region, including the infinite region,
                   determined by the graph, as in Fig. 11.61(a). For each edge shared by two regions, draw
                   an edge connecting the vertices inside these regions. For an edge that is traversed twice in
                   the closed walk about the edges of one region, draw a loop at the vertex for this region.
                   In Fig. 11.61(b), G^d is a dual for the graph G = (V, E). From this example we make the
                   following observations:

                      1) An edge in G corresponds with an edge in G^d, and conversely.
                      2) A vertex of degree 2 in G yields a pair of edges in G^d that connect the same two
                         vertices. Hence G^d may be a multigraph. (Here vertex e provides the edges {a, e},
                         {e, f} in G that brought about the two edges connecting v and z in G^d.)
                      3) Given a loop in G, if the interior of the (finite area) region determined by the loop
                         contains no other vertex or edge of G, then the loop yields a pendant vertex in G^d.
                         (It is also true that a pendant vertex in G yields a loop in G^d.)
                      4) The degree of a vertex in G^d is the number of edges in the boundary of the closed
                         walk about the region in G that contains that vertex.
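
The construction and the observations above can be sketched in code once an embedding is described by the closed walk about each region. The Python below is a minimal illustration, not the text’s method: the input format (a dictionary from region names to lists of edge labels) and the names `dual_edges` and `dual_degrees` are assumptions made here, and the sample graph is a hypothetical triangle with one pendant edge rather than the graph of Fig. 11.61.

```python
from collections import defaultdict

def dual_edges(face_walks):
    """Build the edge list of a dual from the boundary walk of each region.

    face_walks maps a region name to the list of edge labels met on the closed
    walk around that region; an edge traversed twice appears twice in the list.
    Returns a dict: edge label -> (region, region); equal regions mean a loop.
    """
    sides = defaultdict(list)
    for region, walk in face_walks.items():
        for edge in walk:
            sides[edge].append(region)
    dual = {}
    for edge, regions in sides.items():
        if len(regions) != 2:
            raise ValueError(f"edge {edge!r} must appear exactly twice in total")
        dual[edge] = tuple(regions)
    return dual

def dual_degrees(dual):
    """Degree of each dual vertex; a loop contributes 2."""
    degree = defaultdict(int)
    for first, second in dual.values():
        degree[first] += 1
        degree[second] += 1
    return dict(degree)

# Hypothetical example: a triangle (edges 1, 2, 3) with a pendant edge 4
# drawn in the infinite region, so edge 4 is traversed twice on the outer walk.
walks = {"inner": [1, 2, 3], "outer": [1, 4, 4, 2, 3]}
dual = dual_edges(walks)
print(dual)
print(dual_degrees(dual))
```

In this example the three triangle edges give three parallel edges between the two dual vertices, the pendant edge gives a loop at the vertex for the infinite region, and each dual vertex has degree equal to the length of its region’s boundary walk (3 and 5).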

                              Figure 11.61 [(a) G = (V, E); (b) G^d]

                   (Why is G^d called a dual of G instead of the dual of G? The Section Exercises will show
                   that it is possible to have isomorphic graphs G1 and G2 with respective duals G1^d, G2^d that
                   are not isomorphic.)

                   In order to examine further the relationship between a graph G and a dual G^d of G, we
                   introduce the following idea. [Here we recall (from Definition 11.5) that κ(G) counts the
                   number of components of G.]

Definition 11.20   Let G = (V, E) be an undirected graph or multigraph. A subset E’ of E is called a cut-set
                   of G if by removing the edges (but not the vertices) in E’ from G, we have κ(G) < κ(G’),
                   where G’ = (V, E - E’); but when we remove (from E) any proper subset E’’ of E’, we
                   have κ(G) = κ(G’’), for G’’ = (V, E - E’’).
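
Definition 11.20 translates directly into a test on component counts. The following Python sketch is illustrative only: the names `component_count` and `is_cut_set`, the representation of edges as a list of pairs, and the sample graph (a hypothetical 4-cycle a, b, c, d with a pendant edge {d, f}) are assumptions made here and do not come from the figures in the text.

```python
def component_count(vertices, edges):
    """kappa(G): number of components, found with a simple union-find."""
    parent = {x: x for x in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, w in edges:
        parent[find(u)] = find(w)
    return len({find(x) for x in vertices})

def is_cut_set(vertices, edges, candidate):
    """Check Definition 11.20 for the edges at the positions listed in candidate.

    edges is a list of pairs (repeats allowed, so multigraphs work);
    candidate is a collection of indices into that list.
    """
    candidate = set(candidate)
    if not candidate:
        return False

    def kappa_without(removed):
        kept = [e for i, e in enumerate(edges) if i not in removed]
        return component_count(vertices, kept)

    base = component_count(vertices, edges)
    if kappa_without(candidate) <= base:
        return False
    # Removing any proper subset must leave the component count unchanged.
    # It suffices to test the subsets that omit exactly one edge, since
    # removing fewer edges can never create more components.
    return all(kappa_without(candidate - {i}) == base for i in candidate)

# Hypothetical graph: the cycle a-b-c-d-a plus the pendant edge {d, f}.
vertices = ["a", "b", "c", "d", "f"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("d", "f")]
print(is_cut_set(vertices, edges, {0, 2}))     # True
print(is_cut_set(vertices, edges, {0, 1, 2}))  # False: {0, 1} already disconnects
# A one-edge cut-set is a bridge.
print([i for i in range(len(edges)) if is_cut_set(vertices, edges, {i})])  # [4]
```

The minimality condition is what rules out {0, 1, 2}: it disconnects the graph, but so does its proper subset {0, 1}.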

EXAMPLE 11.23                For a given connected graph, a cut-set is a minimal disconnecting set of edges. In the graph
                             in Fig. 11.62(a), note that each of the sets {{a, b}, {a, c}}, {{a, b}, {c, d}}, {{e, h}, {f, h},
                             {g, h}}, and {{d, f}} is a cut-set. For the graph in part (b) of the figure, the edge set {{n, p},
                             {r, p}, {r, s}} is a cut-set. Note that the edges in this cut-set are not all incident to some
                             single vertex. Here the cut-set separates the vertices m, n, r from the vertices p, s, t. The
                             edge set {{s, t}} is also a cut-set for this graph; the removal of the edge {s, t} from this
                             connected graph results in a subgraph with two components, one of which is the isolated
                             vertex t.

Figure 11.62

Whenever a cut-set for a connected graph consists of only one edge, that edge is called
                             a bridge for the graph. For the graph in Fig. 11.62(a), the edge {d, f} is the only bridge;
                             the edge {s, t} is the only bridge in part (b) of the figure.

                             We return now to the graphs in Fig. 11.61, redrawing them as shown in Fig. 11.63 in
                             order to emphasize the correspondence between their edges.

Figure 11.63

                             Here the edges in G are labeled 1, 2, ..., 10. The numbering scheme for G^d is obtained
                             as follows: The edge labeled 4*, for example, connects the vertices w and z in G^d. We drew
                             this edge because edge 4 in G was a common edge of the regions containing these vertices.
                             Likewise, edge 7 is common to the region containing x and the infinite region containing
                             v. Hence we label the edge in G^d that connects x and v with 7*.
                                 In graph G the set of edges labeled 6, 7, 8 constitutes a cycle. What about the edges
                             labeled 6*, 7*, 8* in G^d? If they are removed from G^d, then vertex x becomes isolated
                             and G^d is disconnected. Since we cannot disconnect G^d by removing any proper subset

                  of {6*, 7*, 8*}, these edges form a cut-set in G^d. In similar fashion, edges 2, 4, 10 form a
                  cut-set in G, whereas in G^d the edges 2*, 4*, 10* yield a cycle.
                     We also have the two-edge cut-set {3, 10} in G, and we find that the edges 3*, 10* provide
                  a two-edge circuit in G^d. Another observation: The one-edge cut-set {1*} in G^d comes about
                  from edge 1, a loop in G.
                     In general, there is a one-to-one correspondence between the following sets of edges in
                  a planar graph G and a dual G^d of G.
                     1) Cycles (cut-sets) of n (≥ 3) edges in G correspond with cut-sets (cycles) of n edges
                        in G^d.
                     2) A loop in G corresponds with a one-edge cut-set in G^d.
                     3) A one-edge cut-set in G corresponds with a loop in G^d.
                     4) A two-edge cut-set in G corresponds with a two-edge circuit in G^d.
                     5) If G is a planar multigraph, then each two-edge circuit in G determines a two-edge
                        cut-set of G^d.

All these theoretical observations are interesting, but let us stop here and see how we
                  might apply the idea of a dual.

EXAMPLE 11.24     If we consider the five finite regions in Fig. 11.64(a) as countries on a map, and we construct
                  the subgraph (because we do not use the infinite region) of a dual as shown in part (b), then
                  we find the following relationship.
                      Suppose we are confronted with the “mapmaker’s problem” whereby we want to color
                  the five regions of the map in part (a) so that two countries that share a common border are
                  colored with different colors. This type of coloring can be translated into the dual notion of
                  coloring the vertices in part (b) so that adjacent vertices are colored with different colors.
                  (Such coloring problems will be examined further in Section 11.6.)

                                  Figure 11.64 [parts (a), (b)]

The final result for this section provides us with an application for an electrical network.
                  This material is based on Example 8.6 on pp. 227-230 of the text by C. L. Liu [23].

EXAMPLE 11.25     In Fig. 11.65 we see an electrical network with nine contacts (switches) that control the
                  excitation of a light. We want to construct a dual network where a second light will go on
                  (off) whenever the light in our given network is off (on).
                      The contacts (switches) are of two types: normally open (as shown in Fig. 11.65) and
                  normally closed. We use a and a’ as in Fig. 11.66 to represent the normally open and
                  normally closed contacts, respectively.

                 Figure 11.65
                 Figure 11.66

                       In Fig. 11.67(a) a one-terminal-pair-graph represents the network in Fig. 11.65. Here
                       the special pair of vertices is labeled 1 and 2. These vertices are called the terminals of the
                       graph. Also each edge is labeled according to its corresponding contact in Fig. 11.65.

                                  Figure 11.67 [parts (a), (b), (c), (d)]

                         A one-terminal-pair-graph G is called a planar-one-terminal-pair-graph if G is planar,
                         and the resulting graph is also planar when an edge connecting the terminals is added to G.
                         Figure 11.67(b) shows this situation. Constructing a dual of part (b), we obtain the graph in
                         part (c) of the figure. Removal of the dotted edge results in the terminals 1*, 2* for this dual,
                         which is a one-terminal-pair-graph. This graph provides the dual network in Fig. 11.67(d).
                                We make two observations in closing.
                                1) When the contacts at a, b, c are closed in the original network (Fig. 11.65), the light
                                   is on. In Fig. 11.67(b) the edges a, b, c, j form a cycle that includes the terminals.

                                   In part (c) of the figure, the edges a*, b*, c*, j* form a cut-set disconnecting the
                                   terminals 1*, 2*. Finally, with a’, b’, c’ open in part (d) of the figure, no current gets
                                   past the first level of contacts (switches) and the light is off.
                                2) In like manner, the edges c, d, e, g, j form a cut-set that separates the terminals in
                                   Fig. 11.67(b). (When the contacts at c, d, e, g are open in Fig. 11.65, the light is off.)
                                   Figure 11.67(c) shows how c*, d*, e*, g*, j* form a cycle that includes 1*, 2*. If c’,
                                   d’, e’, g’ are closed in part (d), current flows through the dual network and the light
                                   is on.

EXERCISES 11.4

1. Verify that the conclusion in Example 11.16 is unchanged if Fig. 11.48(b) has edge {a, c} drawn in the
exterior of the pentagon.

2. Show that when any edge is removed from K5, the resulting subgraph is planar. Is this true for the
graph K3,3?

3. a) How many vertices and how many edges are there in the complete bipartite graphs K4,7, K7,11, and
   Km,n, where m, n ∈ Z+?
   b) If the graph Km,12 has 72 edges, what is m?

4. Prove that any subgraph of a bipartite graph is bipartite.

5. For each graph in Fig. 11.68 determine whether or not the graph is bipartite.

6. Let n ∈ Z+ with n ≥ 4. How many subgraphs of Kn are isomorphic to the complete bipartite graph K1,3?

7. Let m, n ∈ Z+ with m ≥ n ≥ 2. (a) Determine how many distinct cycles of length 4 there are in Km,n.
(b) How many different paths of length 2 are there in Km,n? (c) How many different paths of length 3
are there in Km,n?

8. What is the length of a longest path in each of the following graphs?
   a) K1,4     b) K3,7     c) K7,12
   d) Km,n, where m, n ∈ Z+ with m < n.

9. How many paths of longest length are there in each of the following graphs? (Remember that a path
such as v1 → v2 → v3 is considered to be the same as the path v3 → v2 → v1.)
   a) K1,4     b) K3,7     c) K7,12
   d) Km,n, where m, n ∈ Z+ with m < n.

10. Can a bipartite graph contain a cycle of odd length? Explain.

11. Let G = (V, E) be a loop-free connected graph with |V| = v. If |E| > (v/2)^2, prove that G cannot be
bipartite.

12. a) Find all the nonisomorphic complete bipartite graphs G = (V, E), where |V| = 6.
    b) How many nonisomorphic complete bipartite graphs G = (V, E) satisfy |V| = n ≥ 2?

13. a) Let X = {1, 2, 3, 4, 5}. Construct the loop-free undirected graph G = (V, E) as follows:
       • (V): Let each two-element subset of X represent a vertex in G.
       • (E): If v1, v2 ∈ V correspond to subsets {a, b} and {c, d}, respectively, of X, then draw the
         edge {v1, v2} in G if {a, b} ∩ {c, d} = ∅.
    b) To what graph is G isomorphic?

14. Determine which of the graphs in Fig. 11.69 are planar. If a graph is planar, redraw it with no edges
overlapping. If it is nonplanar, find a subgraph homeomorphic to either K5 or K3,3.

15. Let m, n ∈ Z+ with m < n. Under what condition(s) on m, n will every edge in Km,n be in exactly one
of two isomorphic subgraphs of Km,n?

                        Figure 11.68

                          Figure 11.69 [parts (a), (b), (c), (d), (e), (f)]

16. Prove that the Petersen graph is isomorphic to the graph in Fig. 11.70.

                     Figure 11.70

17. Determine the number of vertices, the number of edges, and the number of regions for each of the
planar graphs in Fig. 11.71. Then show that your answers satisfy Euler’s Theorem for connected planar
graphs.

      Figure 11.71 [parts (a), (b)]

18. Let G = (V, E) be an undirected connected loop-free graph. Suppose further that G is planar and
determines 53 regions. If, for some planar embedding of G, each region has at least five edges in its
boundary, prove that |V| ≥ 82.

19. Let G = (V, E) be a loop-free connected 4-regular planar graph. If |E| = 16, how many regions are
there in a planar depiction of G?

20. Suppose that G = (V, E) is a loop-free planar graph with |V| = v, |E| = e, and κ(G) = the number of
components of G. (a) State and prove an extension of Euler’s Theorem for such a graph. (b) Prove that
Corollary 11.3 remains valid if G is loop-free and planar but not connected.

21. Prove that every loop-free connected planar graph has a vertex v with deg(v) < 6.

22. a) Let G = (V, E) be a loop-free connected graph with |V| ≥ 11. Prove that either G or its
    complement Ḡ must be nonplanar.
    b) The result in part (a) is actually true for |V| ≥ 9, but the proof for |V| = 9, 10, is much
    harder. Find a counterexample to part (a) for |V| = 8.

23. a) Let k ∈ Z+, k ≥ 3. If G = (V, E) is a connected planar graph with |V| = v, |E| = e, and each
    cycle of length at least k, prove that e ≤ (k/(k - 2))(v - 2).
    b) What is the minimal cycle length in K3,3?
    c) Use parts (a) and (b) to conclude that K3,3 is nonplanar.
    d) Use part (a) to prove that the Petersen graph is nonplanar.

24. a) Find a dual graph for each of the two planar graphs and the one planar multigraph in Fig. 11.72.

    b) Does the dual for the multigraph in part (c) have any pendant vertices? If not, does this
    contradict the third observation made prior to Definition 11.20?

    Figure 11.72 [parts (a), (b), (c)]

25. a) Find duals for the planar graphs that correspond with the five Platonic solids.
    b) Find the dual of the graph Wn, the wheel with n spokes (as defined in Exercise 14 of Section 11.1).

26. a) Show that the graphs in Fig. 11.73 are isomorphic.
    b) Draw a dual for each graph.
    c) Show that the duals obtained in part (b) are not isomorphic.
    d) Two graphs G and H are called 2-isomorphic if one can be obtained from the other by applying
    either or both of the following procedures a finite number of times.
       1) In Fig. 11.74 we split a vertex, namely r, of G and obtain the graph H, which is disconnected.
       2) In Fig. 11.75 we obtain graph (d) from graph (a) by
            i) first splitting the two distinct vertices j and q, disconnecting the graph,
           ii) then reflecting one subgraph about the horizontal axis, and
          iii) then identifying vertex j (q) in one subgraph with vertex q (j) in the other subgraph.
    Prove that the dual graphs obtained in part (c) are 2-isomorphic.

    Figure 11.73 [parts (a), (b)]

    Figure 11.74 [graphs (G), (H)]

    Figure 11.75 [parts (a), (b), (c), (d), with steps (i), (ii), (iii)]

    e) For the cut-set {{a, b}, {c, b}, {d, b}} in part (a) of Fig. 11.73, find the corresponding cycle
    in its dual. In the
556            Chapter 11 An Introduction to Graph Theory

dual of the graph in Fig. 11.73(b), find the cut-set that corresponds with the cycle {w, z}, {z, x}, {x, y}, {y, w} in the given graph.
27. Find the dual network for the electrical network shown in Fig. 11.76.
28. Let G = (V, E) be a loop-free connected planar graph. If G is isomorphic to its dual and |V| = n, what is |E|?
29. Let $G_1$, $G_2$ be two loop-free connected undirected graphs. If $G_1$, $G_2$ are homeomorphic, prove that (a) $G_1$, $G_2$ have the same number of vertices of odd degree; (b) $G_1$ has an Euler trail if and only if $G_2$ has an Euler trail; and (c) $G_1$ has an Euler circuit if and only if $G_2$ has an Euler circuit.

Figure 11.76

11.5
             Hamilton Paths and Cycles
                                 In 1859 the Irish mathematician Sir William Rowan Hamilton (1805-1865) developed a
                                 game that he sold to a Dublin toy manufacturer. The game consisted of a wooden regular
                                 dodecahedron with the 20 corner points (vertices) labeled with the names of prominent
                                 cities. The objective of the game was to find a cycle along the edges of the solid so that each
                                 city was on the cycle (exactly once). Figure 11.77 is the planar graph for this Platonic solid;
                                 such a cycle is designated by the darkened edges. This illustration leads us to the following
                                 definition.

Figure 11.77

Definition 11.21           If G = (V, E) is a graph or multigraph with |V| ≥ 3, we say that G has a Hamilton cycle
                                 if there is a cycle in G that contains every vertex in V. A Hamilton path is a path (and not
                                 a cycle) in G that contains each vertex.

Given a graph with a Hamilton cycle, we find that the deletion of any edge in the cycle
                                 results in a Hamilton path. It is possible, however, for a graph to have a Hamilton path
                                 without having a Hamilton cycle.
                                     It may seem that the existence of a Hamilton cycle (path) and the existence of an Euler
                                 circuit (trail) for a graph are similar problems. The Hamilton cycle (path) is designed to
                                 visit each vertex in a graph only once; the Euler circuit (trail) traverses the graph so that
                                 each edge is traveled exactly once. Unfortunately, there is no helpful connection between
                                 the two ideas, and unlike the situation for Euler circuits (trails), there do not exist necessary
and sufficient conditions on a graph G that guarantee the existence of a Hamilton cycle
                   (path). If a graph has a Hamilton cycle, then it will at least be connected. Many theorems
                   exist that establish either necessary or sufficient conditions for a connected graph to have a
                   Hamilton cycle or path. We shall investigate several of these results later. When confronted
                   with particular graphs, however, we shall often resort to trial and error, with a few helpful
                   observations.

EXAMPLE 11.26
Referring back to the hypercubes in Fig. 11.35, we find in $Q_2$ the cycle

00 → 10 → 11 → 01 → 00

and in $Q_3$ the cycle

000 → 100 → 110 → 010 → 011 → 111 → 101 → 001 → 000.

Hence $Q_2$ and $Q_3$ have Hamilton cycles (and paths). In fact, for all n ≥ 2, we find that $Q_n$ has a Hamilton cycle. (The reader is asked to establish this in the Section Exercises.) [Note, in addition, that the listings 00, 10, 11, 01 and 000, 100, 110, 010, 011, 111, 101, 001 are examples of Gray codes (which were introduced in Example 3.9).]
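The two listings in Example 11.26 follow a pattern that can be continued to larger hypercubes. The sketch below is one way to do this, not the proof requested in the exercises: it builds a reflected Gray code and checks that the listing is a Hamilton cycle in $Q_n$. The function names are our own, and the convention of appending the new bit on the right is chosen only so that the output agrees with the listings above.

```python
def gray_code(n):
    """Reflected Gray code on n bits, returned as a list of bit strings."""
    if n < 1:
        raise ValueError("n must be at least 1")
    codes = ["0", "1"]
    for _ in range(n - 1):
        # Append 0 to the list read forward, then 1 to the list read backward.
        codes = [c + "0" for c in codes] + [c + "1" for c in reversed(codes)]
    return codes


def differ_in_one_bit(u, v):
    """Two vertices of the hypercube are adjacent when they differ in one bit."""
    return sum(a != b for a, b in zip(u, v)) == 1


def is_hamilton_cycle_in_hypercube(codes, n):
    """Check that codes lists each vertex of Q_n once and that consecutive
    entries (including the last back to the first) are adjacent. Use n >= 2."""
    if len(codes) != 2 ** n or len(set(codes)) != 2 ** n:
        return False
    return all(differ_in_one_bit(codes[i], codes[(i + 1) % len(codes)])
               for i in range(len(codes)))


if __name__ == "__main__":
    print(gray_code(2))
    print(gray_code(3))
    print(is_hamilton_cycle_in_hypercube(gray_code(5), 5))
```

Checking small cases in this way illustrates the claim for particular n; it does not replace the inductive argument.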

EXAMPLE 11.27
If G is the graph in Fig. 11.78, the edges {a, b}, {b, c}, {c, f}, {f, e}, {e, d}, {d, g}, {g, h}, {h, i} yield a Hamilton path for G. But does G have a Hamilton cycle?

Figure 11.78

Since G has nine vertices, if there is a Hamilton cycle in G it must contain nine edges.
                   Let us start at vertex b and try to build a Hamilton cycle. Because of the symmetry in the
                   graph, it doesn’t matter whether we go from b to c or to a. We’ll go to c. At c we can go
                   either to f or to i. Using symmetry again, we go to f. Then we delete edge {c, i} from
                   further consideration because we cannot return to vertex c. In order to include vertex i in
                   our cycle, we must now go from f to i (to h to g). With edges {c, f} and {f, i} in the
                   cycle, we cannot have edge {e, f} in the cycle. [Otherwise, in the cycle we would have
                   deg(f) > 2.] But then once we get to e we are stuck. Hence there is no Hamilton cycle for
                   the graph.

Example 11.27 indicates a few helpful hints for trying to find a Hamilton cycle in a graph G = (V, E).
   1) If G has a Hamilton cycle, then for all v ∈ V, deg(v) ≥ 2.
   2) If a ∈ V and deg(a) = 2, then the two edges incident with vertex a must appear in every Hamilton cycle for G.
   3) If a ∈ V and deg(a) > 2, then as we try to build a Hamilton cycle, once we pass through vertex a, any unused edges incident with a are deleted from further consideration.
   4) In building a Hamilton cycle for G, we cannot obtain a cycle for a subgraph of G unless it contains all the vertices of G.
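The trial-and-error search guided by these hints can be sketched as a backtracking procedure. This is a minimal illustration under our own assumptions: the graph is simple and is given as a dictionary `adj` from each vertex to its set of neighbors, and the names `find_hamilton_cycle` and `grid_graph` are hypothetical. Hint 1 is used as an early exit, the `visited` set plays the role of hint 3, and the cycle is closed only when every vertex is on the path (hint 4). The grid graphs serve as test inputs only; they are not the graph of Fig. 11.78.

```python
def find_hamilton_cycle(adj):
    """Return a Hamilton cycle as a list of vertices (the first vertex is
    repeated at the end), or None if the search finds no such cycle."""
    vertices = list(adj)
    n = len(vertices)
    if n < 3 or any(len(adj[v]) < 2 for v in vertices):
        return None  # hint 1: every vertex needs degree at least 2
    start = vertices[0]
    path = [start]
    visited = {start}

    def extend():
        if len(path) == n:
            # hint 4: close the cycle only when all vertices are on the path
            return start in adj[path[-1]]
        for w in adj[path[-1]]:
            if w not in visited:  # hint 3: never pass through a vertex twice
                visited.add(w)
                path.append(w)
                if extend():
                    return True
                visited.remove(w)
                path.pop()
        return False

    return path + [start] if extend() else None


def grid_graph(rows, cols):
    """Hypothetical test input: the rows-by-cols grid graph."""
    adj = {(r, c): set() for r in range(rows) for c in range(cols)}
    for (r, c) in adj:
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj


if __name__ == "__main__":
    print(find_hamilton_cycle(grid_graph(2, 3)))
    print(find_hamilton_cycle(grid_graph(3, 3)))
```

The search is exhaustive, so its running time can grow very quickly with the number of vertices; it is suitable only for small graphs of the kind met in the examples.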

Our next example provides an interesting technique for showing that a special type of
                             graph has no Hamilton path.

EXAMPLE 11.28
In Fig. 11.79(a) we have a connected graph G, and we wish to know whether G contains a Hamilton path. Part (b) of the figure provides the same graph with a set of labels x, y. This labeling is accomplished as follows: First we label vertex a with the letter x. Those vertices adjacent to a (namely, b, c, and d) are then labeled with the letter y. Then we label the unlabeled vertices adjacent to b, c, or d with x. This results in the label x on the vertices e, g, and i. Finally, we label the unlabeled vertices adjacent to e, g, or i with the label y. At this point, all the vertices in G are labeled. Now, since |V| = 10, if G is to have a Hamilton path there must be an alternating sequence of five x's and five y's. Only four vertices are labeled with x, so this is impossible. Hence G has no Hamilton path (or cycle).

Figure 11.79

But why does this argument work here? In part (c) of Fig. 11.79 we have redrawn the
                             given graph, and we see that it is bipartite. From Exercise 10 in the previous section we
                             know that a bipartite graph cannot have a cycle of odd length. It is also true that if a graph
                             has no cycle of odd length, then it is bipartite. (The proof is requested of the reader in
                             Exercise 9 of this section.) Consequently, whenever a connected graph has no odd cycle
                             (and is bipartite), the method described above may be helpful in determining when the graph
                             does not have a Hamilton path. (Exercise 10 in this section examines this idea further.)
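The labeling method of Example 11.28 can be sketched as a breadth-first search. The code below is an illustration under our own assumptions: the graph is connected and is given as a dictionary `adj` from each vertex to its set of neighbors, and both function names are hypothetical. A Hamilton path in a bipartite graph must alternate between the two labels, so the label counts can differ by at most 1; the second function reports when the counts rule a Hamilton path out.

```python
from collections import deque


def two_label(adj):
    """Label a connected graph with 'x' and 'y' by breadth-first search.
    Returns the labeling, or None if some edge joins two vertices with the
    same label (so the graph is not bipartite)."""
    start = next(iter(adj))
    label = {start: "x"}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in label:
                label[w] = "y" if label[u] == "x" else "x"
                queue.append(w)
            elif label[w] == label[u]:
                return None
    return label


def label_counts_rule_out_hamilton_path(adj):
    """True when the numbers of x's and y's differ by more than 1."""
    label = two_label(adj)
    if label is None:
        raise ValueError("graph is not bipartite; the counting argument does not apply")
    xs = sum(1 for v in label.values() if v == "x")
    ys = len(label) - xs
    return abs(xs - ys) > 1


if __name__ == "__main__":
    # Hypothetical input: the complete bipartite graph K_{2,4}.
    k24 = {"a": {1, 2, 3, 4}, "b": {1, 2, 3, 4},
           1: {"a", "b"}, 2: {"a", "b"}, 3: {"a", "b"}, 4: {"a", "b"}}
    print(label_counts_rule_out_hamilton_path(k24))
```

When the function returns False nothing is decided: the graph may or may not have a Hamilton path, and another method is needed.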

Our next example provides an application that calls for Hamilton cycles in a complete
                             graph.

EXAMPLE 11.29
At Professor Alfred's science camp, 17 students have lunch together each day at a circular table. They are trying to get to know one another better, so they make an effort to sit next to two different colleagues each afternoon. For how many afternoons can they do this? How can they arrange themselves on these occasions?
To solve this problem we consider the graph $K_n$, where n ≥ 3 and n is odd. This graph has n vertices (one for each student) and $\binom{n}{2} = n(n-1)/2$ edges. A Hamilton cycle in $K_n$ corresponds to a seating arrangement. Each of these cycles has n edges, so we can have at most $(1/n)\binom{n}{2} = (n-1)/2$ Hamilton cycles with no two having an edge in common.
Consider the circle in Fig. 11.80 and the subgraph of $K_n$ consisting of the n vertices and the n edges $\{1, 2\}, \{2, 3\}, \ldots, \{n-1, n\}, \{n, 1\}$. Keep the vertices on the circumference fixed and rotate this Hamilton cycle clockwise through the angle $[1/(n-1)](2\pi)$. This gives us the Hamilton cycle (Fig. 11.81) made up of edges $\{1, 3\}, \{3, 5\}, \{5, 2\}, \{2, 7\}, \ldots, \{n, n-3\}, \{n-3, n-1\}, \{n-1, 1\}$. This Hamilton cycle has no edge in common with the first cycle. When n ≥ 7 and we continue to rotate the cycle in Fig. 11.80 in this way through angles $[k/(n-1)](2\pi)$, where $2 \le k \le (n-3)/2$, we obtain a total of $(n-1)/2$ Hamilton cycles, no two of which have an edge in common.

Figure 11.80                               Figure 11.81

Therefore the 17 students at the science camp can dine for [(17 − 1)/2] = 8 days before some student will have to sit next to another student for a second time. Using Fig. 11.80 with n = 17, we can obtain eight such possible arrangements.
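The rotation idea of Example 11.29 can be checked by machine. The sketch below uses a standard zigzag construction under our own labeling, which is an assumption and need not agree with the labels in Fig. 11.80: vertices 0, 1, ..., n − 2 are placed around a circle, vertex n − 1 is a hub, and a zigzag path through the circle vertices is rotated one step at a time. The function names are hypothetical.

```python
def edge_disjoint_hamilton_cycles(n):
    """For odd n >= 3, return (n - 1)/2 Hamilton cycles of K_n, each as a list
    of the n vertices 0..n-1 in cyclic order, with no two sharing an edge."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and at least 3")
    m = (n - 1) // 2
    rim = 2 * m   # vertices 0 .. rim-1 sit on the circle
    hub = rim     # the remaining vertex
    # Zigzag across the circle: 0, 1, -1, 2, -2, ..., m (taken mod rim).
    zigzag = [0]
    for j in range(1, m + 1):
        zigzag.append(j)
        if j < m:
            zigzag.append(rim - j)
    # Rotating the zigzag by one step at a time gives the m cycles.
    return [[hub] + [(v + shift) % rim for v in zigzag] for shift in range(m)]


def cycle_edges(cycle):
    """The set of edges of a cycle given as a cyclic list of vertices."""
    return {frozenset((cycle[i], cycle[(i + 1) % len(cycle)]))
            for i in range(len(cycle))}


if __name__ == "__main__":
    seatings = edge_disjoint_hamilton_cycles(17)
    used = set()
    for seating in seatings:
        assert not (used & cycle_edges(seating))  # no pair of neighbors repeats
        used |= cycle_edges(seating)
    print(len(seatings), len(used))
```

Each list can be read as a seating order around the table, with the last student sitting next to the first.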

We turn now to some further results on Hamilton paths and cycles. Our first result was
                  established in 1934 by L. Redei.

THEOREM 11.7      Let $K_n^*$ be a complete directed graph; that is, $K_n^*$ has $n$ vertices and for each distinct pair $x, y$ of vertices, exactly one of the edges $(x, y)$ or $(y, x)$ is in $K_n^*$. Such a graph (called a tournament) always contains a (directed) Hamilton path.
Proof: Let $m \ge 2$ with $p_m$ a path containing the $m - 1$ edges $(v_1, v_2), (v_2, v_3), \ldots, (v_{m-1}, v_m)$. If $m = n$, we're finished. If not, let $v$ be a vertex that doesn't appear in $p_m$.
If $(v, v_1)$ is an edge in $K_n^*$, we can extend $p_m$ by adjoining this edge. If not, then $(v_1, v)$ must be an edge. Now suppose that $(v, v_2)$ is in the graph. Then we have the larger path: $(v_1, v), (v, v_2), (v_2, v_3), \ldots, (v_{m-1}, v_m)$. If $(v, v_2)$ is not an edge in $K_n^*$, then $(v_2, v)$ must be. As we continue this process there are only two possibilities: (a) For some $1 \le k \le m - 1$ the edges $(v_k, v), (v, v_{k+1})$ are in $K_n^*$ and we replace $(v_k, v_{k+1})$ with this pair of edges; or (b) $(v_m, v)$ is in $K_n^*$ and we add this edge to $p_m$. Either case results in a path $p_{m+1}$ that includes $m + 1$ vertices and has $m$ edges. This process can be repeated until we have such a path $p_n$ on $n$ vertices.
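The proof of Theorem 11.7 is constructive, and the insertion step can be sketched in a few lines. In the sketch below the tournament is described by a function `beats(x, y)` that is true exactly when the edge (x, y) is present; this representation and the function name `tournament_hamilton_path` are our own assumptions. For simplicity the path is started from a single vertex rather than from one edge. Each new vertex v is placed in front of the first vertex on the path that it beats; if it beats none of them, it is added at the end.

```python
def tournament_hamilton_path(players, beats):
    """Return the players listed so that each one beats the next.
    beats(x, y) must be True for exactly one of (x, y), (y, x) when x != y."""
    path = []
    for v in players:
        for k, u in enumerate(path):
            if beats(v, u):
                # Every vertex before position k beat v, so in particular
                # path[k - 1] -> v -> path[k] is a directed path.
                path.insert(k, v)
                break
        else:
            # v beats nobody on the path, so the last vertex beats v.
            path.append(v)
    return path


if __name__ == "__main__":
    # Hypothetical results: a beats b, b beats c, c beats a.
    wins = {("a", "b"), ("b", "c"), ("c", "a")}
    print(tournament_hamilton_path(["a", "b", "c"], lambda x, y: (x, y) in wins))
```

The order in which the vertices are supplied may change the path that is found, since a tournament can have many directed Hamilton paths.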

EXAMPLE 11.30
In a round-robin tournament each player plays every other player exactly once. We want to somehow rank the players according to the results of the tournament. Since we could have players a, b, and c where a beats b and b beats c, but c beats a, it is not always possible to have a ranking where a player in a certain position has beaten all of the opponents in later positions. Representing the players by vertices, construct a directed graph G on these vertices by drawing edge (x, y) if x beats y. Then by Theorem 11.7, it is possible to list the players such that each has beaten the next player on the list.

THEOREM 11.8      Let $G = (V, E)$ be a loop-free graph with $|V| = n \ge 2$. If $\deg(x) + \deg(y) \ge n - 1$ for all $x, y \in V$, $x \ne y$, then $G$ has a Hamilton path.
Proof: First we prove that $G$ is connected. If not, let $C_1$, $C_2$ be two components of $G$ and let $x, y \in V$ with $x$ a vertex in $C_1$ and $y$ a vertex in $C_2$. Let $C_i$ have $n_i$ vertices, $i = 1, 2$. Then $\deg(x) \le n_1 - 1$, $\deg(y) \le n_2 - 1$, and $\deg(x) + \deg(y) \le (n_1 + n_2) - 2 \le n - 2$, contradicting the condition given in the theorem. Consequently, $G$ is connected.
Now we build a Hamilton path for $G$. For $m \ge 2$, let $p_m$ be the path $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{m-1}, v_m\}$ of length $m - 1$. (We relabel vertices if necessary.) Such a path exists, because for $m = 2$ all that is needed is one edge. If $v_1$ is adjacent to any vertex $v$ other than $v_2, v_3, \ldots, v_m$, we add the edge $\{v, v_1\}$ to $p_m$ to get $p_{m+1}$. The same type of procedure is carried out if $v_m$ is adjacent to a vertex other than $v_1, v_2, \ldots, v_{m-1}$. If we are able to enlarge $p_m$ to $p_n$ in this way, we get a Hamilton path. Otherwise the path $p_m$: $\{v_1, v_2\}, \ldots, \{v_{m-1}, v_m\}$ has $v_1$, $v_m$ adjacent only to vertices in $p_m$, and $m < n$. When this happens we claim that $G$ contains a cycle on these vertices. If $v_1$ and $v_m$ are adjacent, then the cycle is $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{m-1}, v_m\}, \{v_m, v_1\}$. If $v_1$ and $v_m$ are not adjacent, then $v_1$ is adjacent to a subset $S$ of the vertices in $\{v_2, v_3, \ldots, v_{m-1}\}$. If there is a vertex $v_t \in S$ such that $v_m$ is adjacent to $v_{t-1}$, then we can get the cycle by adding $\{v_1, v_t\}, \{v_{t-1}, v_m\}$ to $p_m$ and deleting $\{v_{t-1}, v_t\}$ as shown in Fig. 11.82. If not, let $|S| = k < m - 1$. Then $\deg(v_1) = k$ and $\deg(v_m) \le (m - 1) - k$, and we have the contradiction $\deg(v_1) + \deg(v_m) \le m - 1 < n - 1$. Hence there is a cycle connecting $v_1, v_2, \ldots, v_m$.

Figure 11.82                                                          Figure 11.83

Now consider a vertex $v \in V$ that is not found on this cycle. The graph $G$ is connected, so there is a path from $v$ to a first vertex $v_r$ in the cycle, as shown in Fig. 11.83(a). Removing the edge $\{v_{r-1}, v_r\}$ (or $\{v_1, v_t\}$ if $r = t$), we get the path (longer than the original $p_m$) shown in Fig. 11.83(b). Repeating this process (applied to $p_m$) for the path in Fig. 11.83(b), we continue to increase the length of the path until it includes every vertex of $G$.
COROLLARY 11.4   Let G = (V, E) be a loop-free graph with n (≥ 2) vertices. If deg(v) ≥ (n − 1)/2 for all v ∈ V, then G has a Hamilton path.
Proof: The proof is left as an exercise for the reader.

Our last theorem for this section provides a sufficient condition for the existence of a
                 Hamilton cycle in a loop-free graph. This was first proved by Oystein Ore in 1960.

THEOREM 11.9     Let $G = (V, E)$ be a loop-free undirected graph with $|V| = n \ge 3$. If $\deg(x) + \deg(y) \ge n$ for all nonadjacent $x, y \in V$, then $G$ contains a Hamilton cycle.
Proof: Assume that $G$ does not contain a Hamilton cycle. We add edges to $G$ until we arrive at a subgraph $H$ of $K_n$, where $H$ has no Hamilton cycle, but, for any edge $e$ (of $K_n$) not in $H$, $H + e$ does have a Hamilton cycle.
Since $H \ne K_n$, there are vertices $a, b \in V$, where $\{a, b\}$ is not an edge of $H$ but $H + \{a, b\}$ has a Hamilton cycle $C$. The graph $H$ has no such cycle, so the edge $\{a, b\}$ is a part of cycle $C$. Let us list the vertices of $H$ (and $G$) on cycle $C$ as follows:

$$a\,(= v_1) \to b\,(= v_2) \to v_3 \to v_4 \to \cdots \to v_{n-1} \to v_n.$$

For each $3 \le i \le n$, if the edge $\{b, v_i\}$ is in the graph $H$, then we claim that the edge $\{a, v_{i-1}\}$ cannot be an edge of $H$. For if both of these edges are in $H$, for some $3 \le i \le n$, then we get the Hamilton cycle

$$b \to v_i \to v_{i+1} \to \cdots \to v_{n-1} \to v_n \to a \to v_{i-1} \to v_{i-2} \to \cdots \to v_4 \to v_3\ (\to b)$$

for the graph $H$ (which has no Hamilton cycle). Therefore, for each $3 \le i \le n$, at most one of the edges $\{b, v_i\}$, $\{a, v_{i-1}\}$ is in $H$. Consequently,

$$\deg_H(a) + \deg_H(b) < n,$$

where $\deg_H(v)$ denotes the degree of vertex $v$ in graph $H$. For all $v \in V$, $\deg_H(v) \ge \deg_G(v) = \deg(v)$, so we have nonadjacent (in $G$) vertices $a, b$, where

$$\deg(a) + \deg(b) < n.$$

This contradicts the hypothesis that $\deg(x) + \deg(y) \ge n$ for all nonadjacent $x, y \in V$, so we reject our assumption and find that $G$ contains a Hamilton cycle.
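The hypotheses of Theorems 11.8 and 11.9 are easy to test for a particular graph. The sketch below assumes a loop-free simple graph stored as a dictionary `adj` from each vertex to its set of neighbors; the function names are our own. Both conditions are sufficient only, so a graph that fails a test may still have a Hamilton path or cycle.

```python
from itertools import combinations


def min_degree_sum(adj, nonadjacent_only):
    """Smallest deg(x) + deg(y) over pairs of distinct vertices, optionally
    restricted to nonadjacent pairs. Returns None if there is no such pair."""
    sums = [len(adj[x]) + len(adj[y])
            for x, y in combinations(adj, 2)
            if not (nonadjacent_only and y in adj[x])]
    return min(sums) if sums else None


def satisfies_theorem_11_8(adj):
    """deg(x) + deg(y) >= n - 1 for all distinct x, y (and n >= 2)."""
    n = len(adj)
    s = min_degree_sum(adj, nonadjacent_only=False)
    return n >= 2 and s is not None and s >= n - 1


def satisfies_theorem_11_9(adj):
    """deg(x) + deg(y) >= n for all nonadjacent x, y (and n >= 3)."""
    n = len(adj)
    s = min_degree_sum(adj, nonadjacent_only=True)
    # If no pair is nonadjacent, the graph is complete and the condition holds.
    return n >= 3 and (s is None or s >= n)


if __name__ == "__main__":
    path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
    cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
    print(satisfies_theorem_11_8(path3), satisfies_theorem_11_9(path3))
    print(satisfies_theorem_11_9(cycle5))
```

The cycle on five vertices fails the test for Theorem 11.9 even though it is itself a Hamilton cycle, which shows that the condition is not necessary.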

Now we shall obtain the following two results from Theorem 11.9. Each will give us a
                 sufficient condition for a loop-free undirected graph G = (V, E) to have a Hamilton cycle.
                 The first result is similar to Corollary 11.4 and is concerned with the degree of each vertex
                 v in V. The second result examines the size of the edge set E.

COROLLARY 11.5   If G = (V, E) is a loop-free undirected graph with |V| = n ≥ 3, and if deg(v) ≥ n/2 for all v ∈ V, then G has a Hamilton cycle.
Proof: We shall leave the proof of this result for the Section Exercises.
COROLLARY 11.6   If $G = (V, E)$ is a loop-free undirected graph with $|V| = n \ge 3$, and if $|E| \ge \binom{n-1}{2} + 2$, then $G$ has a Hamilton cycle.
Proof: Let $a, b \in V$, where $\{a, b\} \notin E$. [Since $a, b$ are nonadjacent, we want to show that $\deg(a) + \deg(b) \ge n$.] Remove the following from the graph $G$: (i) all edges of the form $\{a, x\}$, where $x \in V$; (ii) all edges of the form $\{y, b\}$, where $y \in V$; and (iii) the vertices $a$ and $b$. Let $H = (V', E')$ denote the resulting subgraph. Then $|E| = |E'| + \deg(a) + \deg(b)$ because $\{a, b\} \notin E$.
Since $|V'| = n - 2$, $H$ is a subgraph of the complete graph $K_{n-2}$, so $|E'| \le \binom{n-2}{2}$. Consequently, $\binom{n-1}{2} + 2 \le |E| = |E'| + \deg(a) + \deg(b) \le \binom{n-2}{2} + \deg(a) + \deg(b)$, and we find that

$$
\begin{aligned}
\deg(a) + \deg(b) &\ge \binom{n-1}{2} + 2 - \binom{n-2}{2} \\
&= \tfrac{1}{2}(n-1)(n-2) + 2 - \tfrac{1}{2}(n-2)(n-3) \\
&= \tfrac{1}{2}(n-2)[(n-1) - (n-3)] + 2 \\
&= \tfrac{1}{2}(n-2)(2) + 2 = (n-2) + 2 = n.
\end{aligned}
$$

Therefore it follows from Theorem 11.9 that the given graph $G$ has a Hamilton cycle.
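The arithmetic at the end of this proof can be confirmed numerically. The short check below is only a sanity test of the identity $\binom{n-1}{2} + 2 - \binom{n-2}{2} = n$ for a range of values of n; the function name is our own, and the algebra above remains the actual justification.

```python
from math import comb


def degree_sum_lower_bound(n):
    """The lower bound on deg(a) + deg(b) obtained in the proof."""
    return comb(n - 1, 2) + 2 - comb(n - 2, 2)


# The bound should equal n for every n >= 3 that we try.
assert all(degree_sum_lower_bound(n) == n for n in range(3, 200))
```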

A problem that is related to the search for Hamilton cycles in a graph is the traveling
                                salesman problem. (An article dealing with this problem was published by Thomas P. Kirk-
                                man in 1855.) Here a traveling salesperson leaves his or her home and must visit certain
                                locations before returning. The objective is to find an order in which to visit the locations
                                that is most efficient (perhaps in terms of total distance traveled or total cost). The problem
                                can be modeled with a labeled (edges have distances or costs associated with them) graph
                                where the most efficient Hamilton cycle is sought.
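For a handful of locations the model can be explored by simply trying every Hamilton cycle. The sketch below assumes a complete graph with symmetric costs stored as a dictionary of dictionaries; the function name and the four-city data are hypothetical and serve only as an illustration. Exhaustive search of this kind is practical only for very small instances, since the number of cycles grows factorially.

```python
from itertools import permutations


def cheapest_hamilton_cycle(cost):
    """cost[u][v] is the cost of edge {u, v} in a complete graph.
    Returns (total_cost, tour), where tour starts and ends at the first vertex."""
    vertices = list(cost)
    home, rest = vertices[0], vertices[1:]
    best = None
    for order in permutations(rest):
        tour = (home,) + order + (home,)
        total = sum(cost[tour[i]][tour[i + 1]] for i in range(len(tour) - 1))
        if best is None or total < best[0]:
            best = (total, tour)
    return best


if __name__ == "__main__":
    # Hypothetical distances between four locations A, B, C, D.
    cost = {
        "A": {"B": 1, "C": 5, "D": 2},
        "B": {"A": 1, "C": 2, "D": 5},
        "C": {"A": 5, "B": 2, "D": 1},
        "D": {"A": 2, "B": 5, "C": 1},
    }
    print(cheapest_hamilton_cycle(cost))
```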
The references by R. Bellman, K. L. Cooke, and J. A. Lockett [7]; M. Bellmore and G. L. Nemhauser [8]; E. A. Elsayed [15]; E. A. Elsayed and R. G. Stern [16]; and L. R. Foulds [17] should prove interesting to the reader who wants to learn more about this important optimization problem. Also, the text edited by E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, and D. B. Shmoys [22] presents 12 papers on various facets of this problem.
Even more on the traveling salesman problem and its applications can be found in the handbooks edited by M. O. Ball, T. L. Magnanti, C. L. Monma, and G. L. Nemhauser, in particular the articles by R. K. Ahuja, T. L. Magnanti, J. B. Orlin, and M. R. Reddy [2], and by M. Jünger, G. Reinelt, and G. Rinaldi [21].

EXERCISES 11.5

1. Give an example of a connected graph that has (a) Neither an Euler circuit nor a Hamilton cycle. (b) An Euler circuit but no Hamilton cycle. (c) A Hamilton cycle but no Euler circuit. (d) Both a Hamilton cycle and an Euler circuit.
2. Characterize the type of graph in which an Euler trail (circuit) is also a Hamilton path (cycle).
3. Find a Hamilton cycle, if one exists, for each of the graphs or multigraphs in Fig. 11.84. If the graph has no Hamilton cycle, determine whether it has a Hamilton path.
4. a) Show that the Petersen graph [Fig. 11.52(a)] has no Hamilton cycle but that it has a Hamilton path.
      b) Show that if any vertex (and the edges incident to it) is removed from the Petersen graph, then the resulting subgraph has a Hamilton cycle.

Figure 11.84

5. Consider the graphs in parts (d) and (e) of Fig. 11.84. Is it possible to remove one vertex from each of these graphs so that each of the resulting subgraphs has a Hamilton cycle?
6. If n ≥ 3, how many different Hamilton cycles are there in the wheel graph $W_n$? (The graph $W_n$ was defined in Exercise 14 of Section 11.1.)
7. a) For n ≥ 3, how many different Hamilton cycles are there in the complete graph $K_n$?
      b) How many edge-disjoint Hamilton cycles are there in [...]?
      c) Nineteen students in a nursery school play a game each day where they hold hands to form a circle. For how many days can they do this with no student holding hands with the same playmate twice?
8. a) For $n \in \mathbf{Z}^+$, n ≥ 2, show that the number of distinct Hamilton cycles in the graph $K_{n,n}$ is $(1/2)(n-1)!\,n!$.
      b) How many different Hamilton paths are there for $K_{n,n}$, n ≥ 1?
9. Let G = (V, E) be a loop-free undirected graph. Prove that if G contains no cycle of odd length, then G is bipartite.
10. a) Let G = (V, E) be a connected bipartite undirected graph with V partitioned as $V_1 \cup V_2$. Prove that if $|V_1| \ne |V_2|$, then G cannot have a Hamilton cycle.
      b) Prove that if the graph G in part (a) has a Hamilton path, then $|V_1| - |V_2| = \pm 1$.
      c) Give an example of a connected bipartite undirected graph G = (V, E), where V is partitioned as $V_1 \cup V_2$ and $|V_1| = |V_2| - 1$, but G has no Hamilton path.
11. a) Determine all nonisomorphic tournaments with three vertices.
      b) Find all of the nonisomorphic tournaments with four vertices. List the in degree and the out degree for each vertex, in each of these tournaments.
12. Prove that for n ≥ 2, the hypercube $Q_n$ has a Hamilton cycle.
13. Let T = (V, E) be a tournament with v ∈ V of maximum out degree. If w ∈ V and w ≠ v, prove that either (v, w) ∈ E or there is a vertex y in V where y ≠ v, w, and (v, y), (y, w) ∈ E. (Such a vertex v is called a king for the tournament.)
14. Find a counterexample to the converse of Theorem 11.8.
15. Give an example of a loop-free connected undirected multigraph G = (V, E) such that |V| = n and deg(x) + deg(y) ≥ n − 1 for all x, y ∈ V, but G has no Hamilton path.
16. Prove Corollaries 11.4 and 11.5.
17. Give an example to show that the converse of Corollary 11.5 need not be true.
18. Helen and Dominic invite 10 friends to dinner. In this group of 12 people everyone knows at least 6 others. Prove that the 12 can be seated around a circular table in such a way that each person is acquainted with the persons sitting on either side.
19. Let G = (V, E) be a loop-free undirected graph that is 6-regular. Prove that if |V| = 11, then G contains a Hamilton cycle.
20. Let G = (V, E) be a loop-free undirected n-regular graph with |V| ≥ 2n + 2. Prove that $\overline{G}$ (the complement of G) has a Hamilton cycle.
21. For n ≥ 3, let $C_n$ denote the undirected cycle on n vertices. The graph $\overline{C_n}$, the complement of $C_n$, is often called the cocycle on n vertices. Prove that for n ≥ 5 the cocycle $\overline{C_n}$ has a Hamilton cycle.
22. Let $n \in \mathbf{Z}^+$ with n ≥ 4, and let the vertex set V′ for the complete graph $K_{n-1}$ be $\{v_1, v_2, v_3, \ldots, v_{n-1}\}$. Now construct the loop-free undirected graph $G_n = (V, E)$ from $K_{n-1}$ as follows: V = V′ ∪ {v}, and E consists of all the edges in $K_{n-1}$ except for the edge $\{v_1, v_2\}$, which is replaced by the pair of edges $\{v_1, v\}$ and $\{v, v_2\}$.
      a) Determine deg(x) + deg(y) for all nonadjacent vertices x and y in V.
      b) Does $G_n$ have a Hamilton cycle?
      c) How large is the edge set E?
      d) Do the results in parts (b) and (c) contradict Corollary 11.6?
23. For $n \in \mathbf{Z}^+$ where n ≥ 4, let $V' = \{v_1, v_2, v_3, \ldots, v_{n-1}\}$ be the vertex set for the complete graph $K_{n-1}$. Construct the loop-free undirected graph $H_n = (V, E)$ from $K_{n-1}$ as follows: V = V′ ∪ {v}, and E consists of all the edges in $K_{n-1}$ together with the new edge $\{v, v_1\}$.
      a) Show that $H_n$ has a Hamilton path but no Hamilton cycle.
      b) How large is the edge set E?
24. Let $n = 2^k$ for $k \in \mathbf{Z}^+$. We use the n k-bit sequences (of 0's and 1's) to represent 1, 2, 3, ..., n, so that for two consecutive integers i, i + 1, the corresponding k-bit sequences differ in exactly one component. This representation is called a Gray code (comparable to what we saw in Example 3.9).
      a) For k = 3, use a graph model with V = {000, 001, 010, ..., 111} to find such a code for 1, 2, 3, ..., 8. How is this related to the concept of a Hamilton path?
      b) Answer part (a) for k = 4.
25. If G = (V, E) is an undirected graph, a subset I of V is called independent if no two vertices in I are adjacent. An independent set I is called maximal if no vertex v can be added to I with I ∪ {v} independent. The independence number of G, denoted β(G), is the size of a largest independent set in G.
      a) For each graph in Fig. 11.85 find two maximal independent sets with different sizes.
      b) Find β(G) for each graph in part (a).
      c) Determine β(G) for each of the following graphs: (i) $K_{1,3}$; (ii) $K_{2,3}$; (iii) $K_{3,2}$; (iv) $K_{4,4}$; (v) $K_{4,6}$; (vi) $K_{m,n}$, $m, n \in \mathbf{Z}^+$.
      d) Let I be an independent set in G = (V, E). What type of subgraph does I induce in G?

Figure 11.85

26. Let G = (V, E) be an undirected graph with subset I of V an independent set. For each a ∈ I and each Hamilton cycle C for G, there will be deg(a) − 2 edges in E that are incident with a and not in C. Therefore there are at least $\sum_{a \in I} [\deg(a) - 2] = \sum_{a \in I} \deg(a) - 2|I|$ edges in E that do not appear in C.
      a) Why are these $\sum_{a \in I} \deg(a) - 2|I|$ edges distinct?
      b) Let v = |V|, e = |E|. Prove that if
            $$e - \sum_{a \in I} \deg(a) + 2|I| < v,$$
      then G has no Hamilton cycle.
      c) Select a suitable independent set I and use part (b) to show that the graph in Fig. 11.86 (known as the Herschel graph) has no Hamilton cycle.

Figure 11.86

11.6  Graph Coloring and Chromatic Polynomials
At the J. & J. Chemical Company, Jeannette is in charge of the storage of chemical compounds
in the company warehouse. Since certain types of compounds (such as acids and bases) should
not be kept in the same vicinity, she decides to have her partner Jack partition the warehouse
into separate storage areas so that incompatible chemical reagents can be stored in separate
compartments. How can she determine the number of storage compartments that Jack will have
to build?
   If this company sells 25 chemical compounds, let $\{c_1, c_2, \ldots, c_{25}\} = V$, a set of vertices.
For all $1 \le i < j \le 25$, we draw the edge $\{c_i, c_j\}$ if $c_i$ and $c_j$ must be stored in separate
compartments. This gives us an undirected graph $G = (V, E)$.
                         We now introduce the following concept.

Definition 11.22   If $G = (V, E)$ is an undirected graph, a proper coloring of $G$ occurs when we color the
vertices of $G$ so that if $\{a, b\}$ is an edge in $G$, then $a$ and $b$ are colored with different colors.
(Hence adjacent vertices have different colors.) The minimum number of colors needed to
properly color $G$ is called the chromatic number of $G$ and is written $\chi(G)$.
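
Definition 11.22 can be checked mechanically on very small graphs. The following Python sketch is an illustration only: the names `is_proper` and `chromatic_number` and the four-vertex graph are assumptions made for this example, and the exhaustive search it performs is practical only when the graph has a handful of vertices.

```python
from itertools import product

def is_proper(edges, coloring):
    """Return True when no edge joins two vertices of the same color."""
    return all(coloring[a] != coloring[b] for a, b in edges)

def chromatic_number(vertices, edges):
    """Smallest k for which some assignment of k colors is proper (brute force)."""
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        for colors in product(range(k), repeat=len(vertices)):
            if is_proper(edges, dict(zip(vertices, colors))):
                return k
    return 0  # reached only when the vertex set is empty

# Hypothetical graph: a triangle a-b-c with one extra vertex d adjacent to c.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(chromatic_number(V, E))
```

The triangle forces three colors, and giving d the color of a or b shows that three colors are enough for this graph.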

Returning to assist Jeannette at the warehouse, we find that the number of storage
compartments Jack must build is equal to $\chi(G)$ for the graph we constructed on
$V = \{c_1, c_2, \ldots, c_{25}\}$. But how do we compute $\chi(G)$? Before we present any work on how to
determine the chromatic number of a graph, we turn to the following related idea.
                          In Example 11.24 we mentioned the connection between coloring the regions in a planar
                      map (with neighboring regions having different colors) and properly coloring the vertices
                      in an associated graph. Determining the smallest number of colors needed to color planar
                      maps in this way has been a problem of interest for over a century.
                          In about 1850, Francis Guthrie (1831-1899) became interested in the general problem
                      after showing how to color the counties on a map of England with only four colors. Shortly
thereafter, he showed the “Four-color Problem” to his younger brother Frederick (1833-1886),
who was then a student of Augustus DeMorgan (1806-1871). DeMorgan communicated
the problem (in 1852) to William Hamilton (1805-1865). The problem did not interest
                      Hamilton and lay dormant for about 25 years. Then, in 1878, the scientific community was
                      made aware of the problem through an announcement by Arthur Cayley (1821-1895) at a
meeting of the London Mathematical Society. In 1879 Cayley stated the problem in the first
                      volume of the Proceedings of the Royal Geographical Society. Shortly thereafter, the British
                      barrister (and keen amateur mathematician) Sir Alfred Kempe (1849-1922) devised a proof
                      that remained unquestioned for over a decade. In 1890, however, the British mathematician
                      Percy John Heawood (1861-1955) found a mistake in Kempe’s work.
                         The problem remained unsolved until 1976, when it was finally settled by Kenneth
                      Appel and Wolfgang Haken. Their proof employs a very intricate computer analysis of
                      1936 (reducible) configurations.

Although only four colors are needed to properly color the regions in a planar map, we
                      need more than four colors to properly color the vertices of some nonplanar graphs.
    We start with some small examples. Then we shall find a way to determine $\chi(G)$ from
smaller subgraphs of $G$ (in certain situations). [In general, computing $\chi(G)$ is a very
difficult problem.] We shall also obtain what is called the chromatic polynomial for $G$ and
see how it can be used in computing $\chi(G)$.

EXAMPLE 11.31       For the graph $G$ in Fig. 11.87, we start at vertex $a$ and next to each vertex write the number
of a color needed to properly color the vertices of $G$ that have been considered up to that
point. Going to vertex $b$, the 2 indicates the need for a second color because vertices $a$
and $b$ are adjacent. Proceeding alphabetically to $f$, we find that two colors are needed to
properly color $\{a, b, c, d, e, f\}$. For vertex $g$ a third color is needed; this third color can
also be used for vertex $h$ because $\{g, h\}$ is not an edge in $G$. Thus this sequential coloring
(labeling) method gives us a proper coloring for $G$, so $\chi(G) \le 3$. Since $K_3$ is a subgraph
of $G$ [for example, the subgraph induced by $a$, $b$, and $g$ is (isomorphic to) $K_3$], we have
$\chi(G) \ge 3$, so $\chi(G) = 3$.

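Figure 11.87 (the graph $G$ of Example 11.31 on the vertices $a$ through $h$, each vertex labeled with the number of its color)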
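
The sequential coloring (labeling) method of Example 11.31 can be sketched in Python as follows. Because the edges of Fig. 11.87 are not reproduced here, the graph below is a hypothetical stand-in, and the function name `sequential_coloring` is likewise an assumption. Each vertex, taken in the given order, receives the smallest color number not already used on one of its colored neighbors.

```python
def sequential_coloring(order, edges):
    """Color vertices in the given order with the smallest color 1, 2, ...
    that does not appear on an already colored neighbor."""
    neighbors = {v: set() for v in order}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    color = {}
    for v in order:
        used = {color[u] for u in neighbors[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

# Hypothetical graph (not Fig. 11.87): the 4-cycle a-b-c-d plus a vertex g adjacent to a and b.
order = ["a", "b", "c", "d", "g"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("g", "a"), ("g", "b")]
coloring = sequential_coloring(order, edges)
print(coloring, max(coloring.values()))
```

The number of colors used is only an upper bound for $\chi(G)$, and it can depend on the order in which the vertices are processed. As in the example, a matching lower bound needs a separate argument, such as exhibiting a complete subgraph (here the triangle on a, b, g).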

EXAMPLE 11.32      a) For all $n \ge 1$, $\chi(K_n) = n$.
                   b) The chromatic number of the Herschel graph (Fig. 11.86) is 2.
                   c) If $G$ is the Petersen graph [see Fig. 11.52(a)], then $\chi(G) = 3$.

EXAMPLE 11.33      Let $G$ be the graph shown in Fig. 11.88. For $U = \{b, f, h, i\}$, the induced subgraph $\langle U \rangle$
of $G$ is isomorphic to $K_4$, so $\chi(G) \ge \chi(K_4) = 4$. Therefore, if we can determine a way to
properly color the vertices of $G$ with four colors, then we shall know that $\chi(G) = 4$. One
way to accomplish this is to color the vertices $e, f, g$ blue; the vertices $b, j$ red; the vertices
$c, h$ white; and the vertices $a, d, i$ green.

Figure 11.88 (the graph $G$ of Example 11.33 on the vertices $a$ through $j$)

We turn now to a method for determining $\chi(G)$. Our coverage follows the development
in the survey article [25] by R. C. Read.
    Let $G$ be an undirected graph, and let $\lambda$ be the number of colors that we have available
for properly coloring the vertices of $G$. Our objective is to find a polynomial function
$P(G, \lambda)$, in the variable $\lambda$, called the chromatic polynomial of $G$, that will tell us in how
many different ways we can properly color the vertices of $G$, using at most $\lambda$ colors.
    Throughout this discussion, the vertices in an undirected graph $G = (V, E)$ are distinguished
by labels. Consequently, two proper colorings of such a graph will be considered
different in the following sense: A proper coloring (of the vertices of $G$) that uses at most $\lambda$
colors is a function $f$, with domain $V$ and codomain $\{1, 2, 3, \ldots, \lambda\}$, where $f(u) \ne f(v)$
for adjacent vertices $u, v \in V$. Proper colorings are then different in the same way that these
functions are different.

EXAMPLE 11.34      a) If $G = (V, E)$ with $|V| = n$ and $E = \emptyset$, then $G$ consists of $n$ isolated points, and by
                      the rule of product, $P(G, \lambda) = \lambda^n$.
                   b) If $G = K_n$, then at least $n$ colors must be available for us to color $G$ properly. Here,
                      by the rule of product, $P(G, \lambda) = \lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - n + 1)$, which we
                      denote by $\lambda^{(n)}$. For $\lambda < n$, $P(G, \lambda) = 0$ and there are no ways to properly color $K_n$.
                      $P(G, \lambda) > 0$ for the first time when $\lambda = n = \chi(G)$.
                   c) For each path in Fig. 11.89, we consider the number of choices (of the $\lambda$ colors) at
                      each successive vertex. Proceeding alphabetically, we find that $P(G_1, \lambda) = \lambda(\lambda - 1)^3$
                      and $P(G_2, \lambda) = \lambda(\lambda - 1)^4$. Since $P(G_1, 1) = 0 = P(G_2, 1)$, but $P(G_1, 2) = 2 =
                      P(G_2, 2)$, it follows that $\chi(G_1) = \chi(G_2) = 2$. If five colors are available we can
                      properly color $G_1$ in $5(4)^3 = 320$ ways; $G_2$ can be so colored in $5(4)^4 = 1280$ ways.

                      Figure 11.89 (the paths $G_1$ and $G_2$, each vertex labeled with its number of available colors)

                      In general, if $G$ is a path on $n$ vertices, then $P(G, \lambda) = \lambda(\lambda - 1)^{n-1}$.

                   d) If $G$ is made up of components $G_1, G_2, \ldots, G_k$, then again by the rule of product, it
                      follows that $P(G, \lambda) = P(G_1, \lambda) \cdot P(G_2, \lambda) \cdots P(G_k, \lambda)$.
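
The counts in Example 11.34 can be confirmed by listing every function from the vertex set to the set of colors and keeping the proper ones. This is a minimal Python sketch under stated assumptions: vertices are numbered 0 to n - 1, and the names `count_proper_colorings` and `falling_factorial` are made up for the illustration.

```python
from itertools import product

def count_proper_colorings(n, edges, lam):
    """Count functions f: {0..n-1} -> {0..lam-1} with f(u) != f(v) on every edge."""
    return sum(
        all(f[u] != f[v] for u, v in edges)
        for f in product(range(lam), repeat=n)
    )

def falling_factorial(lam, n):
    """lam(lam - 1)...(lam - n + 1), written lambda^(n) in the text."""
    result = 1
    for i in range(n):
        result *= lam - i
    return result

path4 = [(0, 1), (1, 2), (2, 3)]            # path on 4 vertices
path5 = path4 + [(3, 4)]                    # path on 5 vertices
k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]

print(count_proper_colorings(4, path4, 5))  # compare with 5(4)^3
print(count_proper_colorings(5, path5, 5))  # compare with 5(4)^4
print(count_proper_colorings(4, k4, 4), falling_factorial(4, 4))
```

The search examines $\lambda^n$ functions, so it serves only as a check on small cases, not as a way of finding $P(G, \lambda)$ in general.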

As a result of Example 11.34(d), we shall concentrate on connected graphs. In many
                  instances in discrete mathematics, methods have been employed to solve problems in large
                  cases by breaking these down into two or more smaller cases. Once again we use this method
                  of attack. To do so, we need the following ideas and notation.
    Let $G = (V, E)$ be an undirected graph. For $e = \{a, b\} \in E$, let $G_e$ denote the subgraph
of $G$ obtained by deleting $e$ from $G$, without removing vertices $a$ and $b$; that is, $G_e = G - e$
as defined in Section 11.2. From $G_e$ a second subgraph of $G$ is obtained by coalescing (or,
identifying) the vertices $a$ and $b$. This second subgraph is denoted by $G'_e$.

EXAMPLE 11.35   Figure 11.90 shows $G_e$ and $G'_e$ for graph $G$ with the edge $e$ as specified. Note how the
coalescing of $a$ and $b$ in $G'_e$ results in the coalescing of the two pairs of edges $\{d, b\}$, $\{d, a\}$
and $\{a, c\}$, $\{b, c\}$.

Figure 11.90 (the graphs $G$, $G_e$, and $G'_e$ on the vertices $a$, $b$, $c$, $d$, with $a$ and $b$ coalesced in $G'_e$)

Using these special subgraphs, we turn now to the main result.

THEOREM 11.10                Decomposition Theorem for Chromatic Polynomials. If $G = (V, E)$ is a connected graph
and $e \in E$, then

$$P(G_e, \lambda) = P(G, \lambda) + P(G'_e, \lambda).$$

Proof: Let $e = \{a, b\}$. The number of ways to properly color the vertices in $G_e$ with (at
most) $\lambda$ colors is $P(G_e, \lambda)$. Those colorings where $a$ and $b$ have different colors are proper
colorings of $G$. The colorings of $G_e$ that are not proper colorings of $G$ occur when $a$ and
$b$ have the same color. But each of these colorings corresponds with a proper coloring for
$G'_e$. This partition of the $P(G_e, \lambda)$ proper colorings of $G_e$ into two disjoint subsets results
in the formula $P(G_e, \lambda) = P(G, \lambda) + P(G'_e, \lambda)$.
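
Rearranged as $P(G, \lambda) = P(G_e, \lambda) - P(G'_e, \lambda)$, the theorem gives a recursive procedure: delete or coalesce edges until none remain, at which point Example 11.34(a) applies. The Python sketch below is one possible rendering, not a method prescribed by the text. The function names and the representation of a polynomial as a list of coefficients (constant term first) are assumptions, and the running time grows exponentially with the number of edges.

```python
def poly_sub(p, q):
    """Subtract coefficient lists (constant term first), padding the shorter one."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def chromatic_polynomial(vertices, edges):
    """Coefficients [c0, c1, ...] of P(G, lam), using P(G) = P(G_e) - P(G'_e)."""
    vertices = frozenset(vertices)
    edges = {frozenset(e) for e in edges}
    if not edges:
        return [0] * len(vertices) + [1]      # n isolated points: lam^n
    e = next(iter(edges))
    a, b = tuple(e)
    deleted = edges - {e}                     # edge set of G_e
    # Edge set of G'_e: rename b as a; duplicate edges merge because this is a set.
    coalesced = {frozenset(a if x == b else x for x in f) for f in deleted}
    return poly_sub(chromatic_polynomial(vertices, deleted),
                    chromatic_polynomial(vertices - {b}, coalesced))

# A cycle of length 4 on the (hypothetical) labels a, b, d, c.
c4 = chromatic_polynomial("abcd", [("a", "b"), ("b", "d"), ("d", "c"), ("c", "a")])
print(c4)  # coefficients of lam^0, lam^1, ..., lam^4
```

The recursion does not depend on which edge is chosen at each step, since every choice yields the same polynomial.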

When calculating chromatic polynomials, we shall place brackets about a graph to indi-
                             cate its chromatic polynomial.

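EXAMPLE 11.36      The following calculations yield $P(G, \lambda)$ for $G$ a cycle of length 4.

[Figure: for the cycle $G$ on the vertices $a$, $b$, $c$, $d$, $P(G, \lambda) = P(G_e, \lambda) - P(G'_e, \lambda)$, where $G_e$ is a path on four vertices and $b$, $d$ are coalesced in $G'_e$.]

From Example 11.34(c) it follows that $P(G_e, \lambda) = \lambda(\lambda - 1)^3$. With $G'_e = K_3$ we have
$P(G'_e, \lambda) = \lambda^{(3)}$. Therefore,
$$P(G, \lambda) = \lambda(\lambda - 1)^3 - \lambda(\lambda - 1)(\lambda - 2) = \lambda(\lambda - 1)[(\lambda - 1)^2 - (\lambda - 2)]$$
$$= \lambda(\lambda - 1)[\lambda^2 - 3\lambda + 3] = \lambda^4 - 4\lambda^3 + 6\lambda^2 - 3\lambda.$$
Since $P(G, 1) = 0$ while $P(G, 2) = 2 > 0$, we know that $\chi(G) = 2$.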

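EXAMPLE 11.37      Here we find a second application of Theorem 11.10.

[Figure: two successive applications of Theorem 11.10 to the graph $G$ on the vertices $v$, $w$, $x$, $y$, $z$.]

$$P(G, \lambda) = \lambda\,\lambda^{(4)} - 2\lambda^{(4)} = (\lambda - 2)\lambda^{(4)} = \lambda(\lambda - 1)(\lambda - 2)^2(\lambda - 3),$$
where the term $\lambda\,\lambda^{(4)}$ is for the disconnected graph with the components $K_1$, $K_4$.

For each $1 \le \lambda \le 3$, $P(G, \lambda) = 0$, but $P(G, \lambda) > 0$ for all $\lambda \ge 4$. Consequently, the given
graph has chromatic number 4.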
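
Reading the chromatic number off a chromatic polynomial amounts to finding the least positive integer at which the polynomial is positive. A small Python sketch of that step follows; the names `p_example_11_37`, `p_cycle_4`, and `first_positive`, and the search bound `limit`, are assumptions made for the illustration.

```python
def p_example_11_37(lam):
    """lam(lam - 1)(lam - 2)^2(lam - 3), the polynomial found in Example 11.37."""
    return lam * (lam - 1) * (lam - 2) ** 2 * (lam - 3)

def p_cycle_4(lam):
    """lam^4 - 4 lam^3 + 6 lam^2 - 3 lam, the polynomial found in Example 11.36."""
    return lam ** 4 - 4 * lam ** 3 + 6 * lam ** 2 - 3 * lam

def first_positive(poly, limit=50):
    """Smallest integer lam in 1..limit with poly(lam) > 0, or None if there is none."""
    for lam in range(1, limit + 1):
        if poly(lam) > 0:
            return lam
    return None

print(first_positive(p_example_11_37), first_positive(p_cycle_4))
```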

The chromatic polynomials given in Examples 11.36 and 11.37 suggest the following
results.

THEOREM 11.11      For each graph $G$, the constant term in $P(G, \lambda)$ is 0.
                   Proof: For each graph $G$, $\chi(G) > 0$ because $V \ne \emptyset$. If $P(G, \lambda)$ has a nonzero constant term $a$, then
                   $P(G, 0) = a \ne 0$. This implies that there are $a$ ways to color $G$ properly with 0 colors, a
                   contradiction.

THEOREM 11.12      Let $G = (V, E)$ with $|E| > 0$. Then the sum of the coefficients in $P(G, \lambda)$ is 0.
                   Proof: Since $|E| \ge 1$, we have $\chi(G) \ge 2$, so we cannot properly color $G$ with only one color.
                   Consequently, $P(G, 1) = 0$ = the sum of the coefficients in $P(G, \lambda)$.
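
Both theorems can be observed on the polynomials of Examples 11.36 and 11.37 once they are written out as coefficient lists. In the Python sketch below, the helper names `poly_mul` and `from_roots` and the convention of listing the constant term first are assumptions made for the illustration.

```python
def poly_mul(p, q):
    """Multiply two coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def from_roots(roots):
    """Coefficients of the monic polynomial with the given roots (with multiplicity)."""
    coeffs = [1]
    for r in roots:
        coeffs = poly_mul(coeffs, [-r, 1])    # multiply by (lam - r)
    return coeffs

c4 = [0, -3, 6, -4, 1]                        # lam^4 - 4 lam^3 + 6 lam^2 - 3 lam
ex37 = from_roots([0, 1, 2, 2, 3])            # lam(lam - 1)(lam - 2)^2(lam - 3)

for coeffs in (c4, ex37):
    print(coeffs, "constant term:", coeffs[0], "coefficient sum:", sum(coeffs))
```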

Since the chromatic polynomial of a complete graph is easy to determine, an alternative
method for finding $P(G, \lambda)$ can be obtained. Theorem 11.10 reduced the problem to smaller
graphs. Here we add edges to a given graph until we reach complete graphs.

THEOREM 11.13      Let $G = (V, E)$, with $a, b \in V$ but $\{a, b\} = e \notin E$. We write $G_e^+$ for the graph we obtain
                   from $G$ by adding the edge $e = \{a, b\}$. Coalescing the vertices $a$ and $b$ in $G$ gives us the
                   subgraph $G_e^{++}$ of $G$. Under these circumstances $P(G, \lambda) = P(G_e^+, \lambda) + P(G_e^{++}, \lambda)$.
                   Proof: This result follows as in Theorem 11.10 because $P(G_e^+, \lambda) = P(G, \lambda) - P(G_e^{++}, \lambda)$.

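EXAMPLE 11.38      Let us now apply Theorem 11.13.

[Figure: the graphs $G$, $G_e^+$, and $G_e^{++}$ (with $a$ and $b$ coalesced), illustrating $P(G, \lambda) = P(G_e^+, \lambda) + P(G_e^{++}, \lambda)$.]

Here $P(G, \lambda) = \lambda^{(4)} + \lambda^{(3)} = \lambda(\lambda - 1)(\lambda - 2)^2$, so $\chi(G) = 3$. In addition, if six colors
are available, the vertices in $G$ can be properly colored in $6(5)(4)^2 = 480$ ways.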
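
Theorem 11.13 can be applied repeatedly until only complete graphs remain, which expresses $P(G, \lambda)$ as a sum of terms $\lambda^{(n)}$. The Python sketch below is one way to organize that bookkeeping. The function names are assumptions, and the test graph is taken to be $K_4$ with the edge $\{a, b\}$ removed, an assumption that is consistent with the polynomial $\lambda^{(4)} + \lambda^{(3)}$ found in Example 11.38.

```python
from itertools import combinations

def complete_graph_expansion(vertices, edges):
    """Return {n: count}, meaning P(G, lam) is the sum of count * lam^(n)."""
    vertices = frozenset(vertices)
    edges = {frozenset(e) for e in edges}
    for a, b in combinations(sorted(vertices), 2):
        if frozenset((a, b)) not in edges:
            added = edges | {frozenset((a, b))}                    # G_e^+
            coalesced = {frozenset(a if x == b else x for x in f)  # G_e^++
                         for f in edges}
            result = complete_graph_expansion(vertices, added)
            for n, c in complete_graph_expansion(vertices - {b}, coalesced).items():
                result[n] = result.get(n, 0) + c
            return result
    return {len(vertices): 1}                                      # G is complete

def falling_factorial(lam, n):
    result = 1
    for i in range(n):
        result *= lam - i
    return result

def evaluate(expansion, lam):
    return sum(c * falling_factorial(lam, n) for n, c in expansion.items())

# Assumed graph: K_4 on a, b, c, d without the edge {a, b}.
E = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]
expansion = complete_graph_expansion("abcd", E)
print(expansion, evaluate(expansion, 6))
```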

Our next result again uses complete graphs, along with the following concepts.
   For all graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$,

   i) the union of $G_1$ and $G_2$, denoted $G_1 \cup G_2$, is the graph with vertex set $V_1 \cup V_2$ and
      edge set $E_1 \cup E_2$; and
   ii) when $V_1 \cap V_2 \ne \emptyset$, the intersection of $G_1$ and $G_2$, denoted $G_1 \cap G_2$, is the graph
      with vertex set $V_1 \cap V_2$ and edge set $E_1 \cap E_2$.

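THEOREM 11.14            Let $G$ be an undirected graph with subgraphs $G_1$, $G_2$. If $G = G_1 \cup G_2$ and $G_1 \cap G_2 = K_n$,
for some $n \in \mathbf{Z}^+$, then

$$P(G, \lambda) = \frac{P(G_1, \lambda) \cdot P(G_2, \lambda)}{\lambda^{(n)}}.$$

Proof: Since $G_1 \cap G_2 = K_n$, it follows that $K_n$ is a subgraph of both $G_1$ and $G_2$ and that
$\chi(G_1), \chi(G_2) \ge n$. Given $\lambda$ colors, there are $\lambda^{(n)}$ proper colorings of $K_n$. For each of these
$\lambda^{(n)}$ colorings there are $P(G_1, \lambda)/\lambda^{(n)}$ ways to properly color the remaining vertices in $G_1$.
Likewise, there are $P(G_2, \lambda)/\lambda^{(n)}$ ways to properly color the remaining vertices in $G_2$. By
the rule of product,
$$P(G, \lambda) = P(K_n, \lambda) \cdot \frac{P(G_1, \lambda)}{\lambda^{(n)}} \cdot \frac{P(G_2, \lambda)}{\lambda^{(n)}} = \frac{P(G_1, \lambda) \cdot P(G_2, \lambda)}{\lambda^{(n)}}.$$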

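EXAMPLE 11.39      Consider the graph in Example 11.37. Let $G_1$ be the subgraph induced by the vertices
$w, x, y, z$. Let $G_2$ be the complete graph $K_3$, with vertices $v$, $w$, and $x$. Then $G_1 \cap G_2$ is
the edge $\{w, x\}$, so $G_1 \cap G_2 = K_2$.
   Therefore
$$P(G, \lambda) = \frac{P(G_1, \lambda) \cdot P(G_2, \lambda)}{\lambda^{(2)}} = \frac{\lambda^{(4)} \lambda^{(3)}}{\lambda^{(2)}} = \frac{\lambda^2(\lambda - 1)^2(\lambda - 2)^2(\lambda - 3)}{\lambda(\lambda - 1)} = \lambda(\lambda - 1)(\lambda - 2)^2(\lambda - 3),$$
agreeing with the answer obtained in Example 11.37.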
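
The identity of Theorem 11.14 can also be checked numerically for this example by counting colorings directly. The Python sketch below assumes, as the polynomials in Examples 11.37 and 11.39 indicate, that $G_1$ is $K_4$ on $w, x, y, z$ and $G_2$ is $K_3$ on $v, w, x$; the function and variable names are assumptions made for the illustration. The check is written as $P(G, \lambda) \cdot \lambda^{(2)} = P(G_1, \lambda) \cdot P(G_2, \lambda)$ so that no division is needed.

```python
from itertools import combinations, product

def count_proper_colorings(vertices, edges, lam):
    """Count the proper colorings of a labeled graph with at most lam colors."""
    vertices = list(vertices)
    total = 0
    for colors in product(range(lam), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[a] != f[b] for a, b in edges):
            total += 1
    return total

g1_vertices, g2_vertices = "wxyz", "vwx"
g1_edges = set(combinations(g1_vertices, 2))      # K_4 (assumed form of G_1)
g2_edges = set(combinations(g2_vertices, 2))      # K_3
g_vertices = sorted(set(g1_vertices) | set(g2_vertices))
g_edges = g1_edges | g2_edges                     # G = G_1 union G_2
shared_edges = g1_edges & g2_edges                # the single edge {w, x}, so K_2

for lam in range(2, 7):
    lhs = (count_proper_colorings(g_vertices, g_edges, lam)
           * count_proper_colorings("wx", shared_edges, lam))
    rhs = (count_proper_colorings(g1_vertices, g1_edges, lam)
           * count_proper_colorings(g2_vertices, g2_edges, lam))
    print(lam, lhs == rhs)
```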

Much more can be said about chromatic polynomials — in particular, there are many
                                unanswered questions. For example, no one has found a set of conditions that indicate
                                whether a given polynomial in 4 is the chromatic polynomial for some graph. More about
                                this topic is introduced in the article by R. C. Read [25].

EXERCISES 11.6

1. A pet-shop owner receives a shipment of tropical fish. Among the different species in the shipment
are certain pairs where one species feeds on the other. These pairs must consequently be kept in
different aquaria. Model this problem as a graph-coloring problem, and tell how to determine the
smallest number of aquaria needed to preserve all the fish in the shipment.

2. As the chair for church committees, Mrs. Blasi is faced with scheduling the meeting times for 15
committees. Each committee meets for one hour each week. Two committees having a common member
must be scheduled at different times. Model this problem as a graph-coloring problem, and tell how to
determine the least number of meeting times Mrs. Blasi has to consider for scheduling the 15 committee
meetings.

3. a) At the J. & J. Chemical Company, Jeannette has received three shipments that contain a total of
    seven different chemicals. Furthermore, the nature of these chemicals is such that for all
    $1 \le i \le 5$, chemical $i$ cannot be stored in the same storage compartment as chemical $i + 1$ or
    chemical $i + 2$. Determine the smallest number of separate storage compartments that Jeannette
    will need to safely store these seven chemicals.
    b) Suppose that in addition to the conditions in part (a), the following four pairs of these same
    seven chemicals also require separate storage compartments: 1 and 4, 2 and 5, 2 and 6, and 3 and 6.
    What is the smallest number of storage compartments that Jeannette now needs to safely store the
    seven chemicals?

4. Give an example of an undirected graph $G = (V, E)$, where $\chi(G) = 3$ but no subgraph of $G$ is
isomorphic to $K_3$.

5. a) Determine $P(G, \lambda)$ for $G = K_{1,3}$.
    b) For $n \in \mathbf{Z}^+$, what is the chromatic polynomial for $K_{1,n}$? What is its chromatic number?

6. a) Consider the graph $K_{2,3}$ shown in Fig. 11.91, and let $\lambda \in \mathbf{Z}^+$ denote the number of colors
    available to properly color the vertices of $K_{2,3}$. (i) How many proper colorings of $K_{2,3}$ have
    vertices $a$, $b$ colored the same? (ii) How many proper colorings of $K_{2,3}$ have vertices $a$, $b$
    colored with different colors?
    b) What is the chromatic polynomial for $K_{2,3}$? What is $\chi(K_{2,3})$?
    c) For $n \in \mathbf{Z}^+$, what is the chromatic polynomial for $K_{2,n}$? What is $\chi(K_{2,n})$?

    Figure 11.91 (the graph $K_{2,3}$ on the vertices $a$, $b$ and $x$, $y$, $z$)

7. Find the chromatic number of the following graphs.
    a) The complete bipartite graphs $K_{m,n}$.
    b) A cycle on $n$ vertices, $n \ge 3$.
    c) The graphs in Figs. 11.59(d), 11.62(a), and 11.85.
    d) The $n$-cube $Q_n$, $n \ge 1$.

8. If $G$ is a loop-free undirected graph with at least one edge, prove that $G$ is bipartite if and only
if $\chi(G) = 2$.

9. a) Determine the chromatic polynomials for the graphs in Fig. 11.92.
    b) Find $\chi(G)$ for each graph.
    c) If five colors are available, in how many ways can the vertices of each graph be properly colored?

    Figure 11.92 (three graphs, labeled (a), (b), and (c))
10. a) Determine whether the graphs in Fig. 11.93 are isomorphic.
    b) Find $P(G, \lambda)$ for each graph.
    c) Comment on the results found in parts (a) and (b).

    Figure 11.93 (the graphs for Exercise 10)

11. For $n \ge 3$, let $G_n = (V, E)$ be the undirected graph obtained from the complete graph $K_n$ upon
deletion of one edge. Determine $P(G_n, \lambda)$ and $\chi(G_n)$.

12. Consider the complete graph $K_n$ for $n \ge 3$. Color $r$ of the vertices in $K_n$ red and the remaining
$n - r$ $(= g)$ vertices green. For any two vertices $v$, $w$ in $K_n$, color the edge $\{v, w\}$ (1) red if
$v$, $w$ are both red; (2) green if $v$, $w$ are both green; or (3) blue if $v$, $w$ have different colors.
Assume that $r \ge g$.
    a) Show that for $r = 6$ and $g = 3$ (and $n = 9$) the total number of red and green edges in $K_9$
    equals the number of blue edges in $K_9$.
    b) Show that the total number of red and green edges in $K_n$ equals the number of blue edges in
    $K_n$ if and only if $n = r + g$, where $g$, $r$ are consecutive triangular numbers. [The triangular
    numbers are defined recursively by $t_1 = 1$, $t_{n+1} = t_n + (n + 1)$, $n \ge 1$; so $t_n = n(n + 1)/2$.
    Hence $t_1 = 1$, $t_2 = 3$, $t_3 = 6$, ....]

13. Let $G = (V, E)$ be the undirected connected “ladder graph” shown in Fig. 11.94.
    a) Determine $|V|$ and $|E|$.
    b) Prove that $P(G, \lambda) = \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{n-1}$.

    Figure 11.94 (the ladder graph on the vertices $x_1, x_2, x_3, \ldots, x_{n-1}, x_n$ and $y_1, y_2, y_3, \ldots, y_{n-1}, y_n$)

14. Let $G$ be a loop-free undirected graph, where $\Delta = \max_{v \in V} \{\deg(v)\}$. (a) Prove that
$\chi(G) \le \Delta + 1$. (b) Find two types of graphs $G$, where $\chi(G) = \Delta + 1$.

15. For $n \ge 3$, let $C_n$ denote the cycle of length $n$.
    a) What is $P(C_3, \lambda)$?
    b) If $n \ge 4$, show that
            $$P(C_n, \lambda) = P(P_{n-1}, \lambda) - P(C_{n-1}, \lambda),$$
    where $P_{n-1}$ denotes the path of length $n - 1$.
    c) Verify that $P(P_{n-1}, \lambda) = \lambda(\lambda - 1)^{n-1}$, for all $n \ge 2$.
    d) Establish the relations
            $$P(C_n, \lambda) - (\lambda - 1)^n = (\lambda - 1)^{n-1} - P(C_{n-1}, \lambda), \quad n \ge 4,$$
            $$P(C_n, \lambda) - (\lambda - 1)^n = P(C_{n-2}, \lambda) - (\lambda - 1)^{n-2}, \quad n \ge 5.$$
    e) Prove that for all $n \ge 3$,
            $$P(C_n, \lambda) = (\lambda - 1)^n + (-1)^n(\lambda - 1).$$

16. For $n \ge 3$, recall that the wheel graph, $W_n$, is obtained from a cycle of length $n$ by placing a
new vertex within the cycle and adding edges (spokes) from this new vertex to each vertex of the cycle.
    a) What relationship is there between $\chi(C_n)$ and $\chi(W_n)$?
    b) Use part (e) of Exercise 15 to show that
            $$P(W_n, \lambda) = \lambda(\lambda - 2)^n + (-1)^n \lambda(\lambda - 2).$$
    c) i) If we have $k$ different colors available, in how many ways can we paint the walls and ceiling
       of a pentagonal room if adjacent walls, and any wall and the ceiling, are to be painted with
       different colors?
       ii) What is the smallest value of $k$ for which such a coloring is possible?

17. Let $G = (V, E)$ be a loop-free undirected graph with chromatic polynomial $P(G, \lambda)$ and $|V| = n$.
Use Theorem 11.13 to prove that $P(G, \lambda)$ has degree $n$ and leading coefficient 1 (that is, the
coefficient of $\lambda^n$ is 1).

18. Let $G = (V, E)$ be a loop-free undirected graph.
    a) For each such graph, where $|V| \le 3$, find $P(G, \lambda)$ and show that in it the terms contain
    consecutive powers of $\lambda$. Also show that the coefficients of these consecutive powers alternate
    in sign.
    b) Now consider $G = (V, E)$, where $|V| = n \ge 4$ and $|E| = k$. Prove by mathematical induction that
    the terms in $P(G, \lambda)$ contain consecutive powers of $\lambda$ and that the coefficients of these
    consecutive powers alternate in sign. [For the induction hypothesis, assume that the result is true
    for all loop-free undirected graphs $G = (V, E)$, where either (i) $|V| = n - 1$ or (ii) $|V| = n$,
    but $|E| = k - 1$.]
    c) Prove that if $|V| = n$, then the coefficient of $\lambda^{n-1}$ in $P(G, \lambda)$ is the negative of $|E|$.

19. Let $G = (V, E)$ be a loop-free undirected graph. We call $G$ color-critical if
$\chi(G) > \chi(G - v)$ for all $v \in V$.
    a) Explain why cycles with an odd number of vertices are color-critical while cycles with an even
    number of vertices are not color-critical.
    b) For $n \in \mathbf{Z}^+$, $n \ge 2$, which of the complete graphs $K_n$ are color-critical?
    c) Prove that a color-critical graph must be connected.
    d) Prove that if $G$ is color-critical with $\chi(G) = k$, then $\deg(v) \ge k - 1$ for all $v \in V$.

11.7  Summary and Historical Review
                              Unlike other areas in mathematics, graph theory traces its beginnings to a definite time
                              and place: the problem of the seven bridges of Königsberg, which was solved in 1736 by
                              Leonhard Euler (1707-1783). And in 1752 we find Euler’s Theorem for planar graphs. (This
                              result was originally presented in terms of polyhedra.) However, after these developments,
                              little was accomplished in this area for almost a century.
                                   Then, in 1847, Gustav Kirchhoff (1824-1887) examined a special type of graph called
                              a tree. (A tree is a loop-free undirected graph that is connected but contains no cycles.)
                              Kirchhoff used this concept in applications dealing with electrical networks in his extension
                              of Ohm’s laws for electrical flow. Ten years later Arthur Cayley (1821-1895) developed
                              this same type of graph in order to count the distinct isomers of the saturated hydrocarbons
                              $C_nH_{2n+2}$, $n \in \mathbf{Z}^+$.
                                   This period also saw two other major ideas come to light. The four-color conjecture was
                              first investigated by Francis Guthrie (1831-1899) in about 1850. In Section 11.6 we related
                              some of the history of this problem, which was solved via an intricate computer analysis in
                               1976 by Kenneth Appel and Wolfgang Haken.
                                   The second major idea was the Hamilton cycle. This cycle is named for Sir William
                              Rowan Hamilton (1805-1865), who used the idea in 1859 for an intriguing puzzle that used
                              the edges on a regular dodecahedron. A solution to this puzzle is not very difficult to find,
                              but mathematicians still search for necessary and sufficient conditions to characterize those
                              undirected graphs that possess a Hamilton path or cycle.
                                   Following these developments, we find little activity until after 1920. The problem of
                              characterizing planar graphs was solved by the Polish mathematician Kasimir Kuratowski
                              (1896-1980) in 1930. In 1936 we find the publication of the first book on graph theory, written
                              by the Hungarian mathematician Dénes König (1884-1944), a prominent researcher in the
                              field. Since then there has been a great deal of activity in the area, the computer providing
                              assistance in the last five decades. Among the many contemporary researchers (not men-
                              tioned in the chapter references) in this and related fields one finds the names Claude Berge,
                              V. Chvátal, Paul Erdős, László Lovász, W. T. Tutte, and Hassler Whitney.
                                  Comparable coverage of the material presented in this chapter is contained in Chapters
                              6, 8, and 9 of C. L. Liu [23]. More advanced work is found in the works by J. A. Bondy
                              and U. S. R. Murty [10], N. Hartsfield and G. Ringel [20], and D. B. West [32]. The book
                              by F. Buckley and F. Harary [11] revises the classic work of F. Harary [18] and brings the
                              reader up to date on the topics covered in the original 1969 work. The text by G. Chartrand
                              and L. Lesniak [12] provides a more algorithmic approach in its presentation. A proof of
                              Kuratowski’s Theorem appears in Chapter 8 of C. L. Liu [23] and Chapter 6 of D. B. West
                              [32]. The article by G. Chartrand and R. J. Wilson [13] develops many important concepts in
                              graph theory by focusing on one particular graph: the Petersen graph. This graph (which
                              we mentioned in Section 11.4) is named for the Danish mathematician Julius Peter Christian
                              Petersen (1839-1910), who discussed the graph in a paper in 1898.

[Photographs: William Rowan Hamilton (1805-1865), reproduced courtesy of The Granger Collection, New York; Paul Erdős (1913-1996), reproduced courtesy of Christopher Barker.]

                               Applications of graph theory in electrical networks can be found in S. Seshu and M. B.
                           Reed [30]. In the text by N. Deo [14], applications in coding theory, electrical networks,
                           operations research, computer programming, and chemistry occupy Chapters 12-15. The text
                           by F. S. Roberts [26] applies the methods of graph theory to the social sciences. Applications
                           of graph theory in chemistry are given in the article by D. H. Rouvray [29].
                               More on chromatic polynomials can be found in the survey article by R. C. Read [25].
                           The role of Pólya’s theory† in graphical enumeration is examined in Chapter 10 of N. Deo
                           [14]. A thorough coverage of this topic is found in the text by F. Harary and E. M. Palmer
                           [19].
                               Additional coverage on the historical development of graph theory is given in N. Biggs,
                           E. K. Lloyd, and R. J. Wilson [9].
                               Many applications in graph theory involve large graphs that require the computationally
                           intensive talents of a computer in conjunction with the ingenuity of mathematical methods.
                           Chapter 11 of N. Deo [14] presents computer algorithms dealing with several of the graph-
                           theoretic properties we have studied here. Along the same line, the text by A. V. Aho, J. E.
                           Hopcroft, and J. D. Ullman [1] provides even more for the reader interested in computer
                           science.
                               As mentioned at the end of Section 11.5, the traveling salesman problem is closely related
                           to the search for a Hamilton cycle in a graph. This is a graph-theoretic problem of interest
                           in both operations research and computer science. The article by M. Bellmore and G. L.
                           Nemhauser [8] provides a good introductory survey of results on this problem. The text
                           by R. Bellman, K. L. Cooke, and J. A. Lockett [7] includes an algorithmic treatment of
                           this problem along with other graph problems. A number of heuristics for obtaining an
                           approximate solution to the problem are given in Chapter 4 of the text by L. R. Foulds [17].
                           The text edited by E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, and D. B. Shmoys
                           [22] contains 12 papers dealing with various aspects of this problem, including historical
                           considerations as well as some results on computational complexity. Applications, where a
                           robot visits different locations in an automated warehouse in order to fill a given order, are
                           examined in the articles by E. A. Elsayed [15] and by E. A. Elsayed and R. G. Stern [16].

†We shall introduce the basic ideas behind this method of enumeration in Chapter 16.

         The solution of the four-color problem can be examined further by starting with the
     paper by K. Appel and W. Haken [3]. The problem, together with its history and solution, is
     examined in the text by D. Barnette [6] and in the Scientific American article by K. Appel
     and W. Haken [4]. The proof uses a computer analysis to handle a large number of cases; the
     article by T. Tymoczko [31] examines the role of such techniques in pure mathematics. In
     [5] K. Appel and W. Haken further examine their proof in the light of the computer analysis
     that was used. The articles by N. Robertson, D. P. Sanders, P. D. Seymour, and R. Thomas
     [27, 28] provide a simplified proof. In 1997 their computer code was made available on the
     Internet. This code could prove the four-color problem on a desktop workstation in roughly
     three hours.
         Finally, the article by A. Ralston [24] demonstrates some of the connections among
     coding theory, combinatorics, graph theory, and computer science.

REFERENCES
         1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
            Reading, Mass.: Addison-Wesley, 1983.
         2. Ahuja, Ravindra K., Magnanti, Thomas L., Orlin, James B., and Reddy, M. R. “Applications
            of Network Optimization.” In M. O. Ball, Thomas L. Magnanti, C. L. Monma, and G. L.
            Nemhauser, eds., Handbooks in Operations Research and Management Science, Vol. 7,
            Network Models. Amsterdam, Holland: Elsevier, 1995, pp. 1-83.
         3. Appel, Kenneth, and Haken, Wolfgang. “Every Planar Map Is Four Colorable.” Bulletin of the
            American Mathematical Society 82 (1976): pp. 711-712.
         4. Appel, Kenneth, and Haken, Wolfgang. “The Solution of the Four-Color-Map Problem.”
            Scientific American 237 (October 1977): pp. 108-121.
         5. Appel, Kenneth, and Haken, Wolfgang. “The Four Color Proof Suffices.” Mathematical
            Intelligencer 8, no. 1 (1986): pp. 10-20.
         6. Barnette, David. Map Coloring, Polyhedra, and the Four-Color Problem. Washington, D.C.:
            The Mathematical Association of America, 1983.
         7. Bellman, R., Cooke, K. L., and Lockett, J. A. Algorithms, Graphs, and Computers. New York:
            Academic Press, 1970.
         8. Bellmore, M., and Nemhauser, G. L. “The Traveling Salesman Problem: A Survey.” Operations
            Research 16 (1968): pp. 538-558.
         9. Biggs, N., Lloyd, E. K., and Wilson, R. J. Graph Theory (1736-1936). Oxford, England:
            Clarendon Press, 1976.
        10. Bondy, J. A., and Murty, U. S. R. Graph Theory with Applications. New York: Elsevier
            North-Holland, 1976.
        11. Buckley, Fred, and Harary, Frank. Distance in Graphs. Reading, Mass.: Addison-Wesley, 1990.
        12. Chartrand, Gary, and Lesniak, Linda. Graphs and Digraphs, 3rd ed. Boca Raton, Fla.: CRC
            Press, 1996.
        13. Chartrand, Gary, and Wilson, Robin J. “The Petersen Graph.” In Frank Harary and John S.
            Maybee, eds., Graphs and Applications. New York: Wiley, 1985.
                                   14. Deo, Narsingh. Graph Theory with Applications to Engineering and Computer Science.
                                       Englewood Cliffs, N. J.: Prentice-Hall, 1974.
                                   15. Elsayed, E. A. “Algorithms for Optimal Material Handling in Automatic Warehousing
                                       Systems.” Int. J. Prod. Res. 19 (1981): pp. 525-535.
                                   16. Elsayed, E. A., and Stern, R. G. “Computerized Algorithms for Order Processing in Automated
                                       Warehousing Systems.” Int. J. Prod. Res. 21 (1983): pp. 579-586.
                                   17. Foulds, L. R. Combinatorial Optimization for Undergraduates. New York: Springer-Verlag,
                                       1984.
                                   18. Harary, Frank. Graph Theory. Reading, Mass.: Addison-Wesley, 1969.
                                   19. Harary, Frank, and Palmer, Edgar M. Graphical Enumeration. New York: Academic Press,
                                       1973.
                                   20. Hartsfield, Nora, and Ringel, Gerhard. Pearls in Graph Theory: A Comprehensive Introduction.
                                       Boston, Mass.: Harcourt/Academic Press, 1994.
                                   21. Jünger, M., Reinelt, G., and Rinaldi, G. “The Traveling Salesman Problem.” In M. O. Ball,
                                       Thomas L. Magnanti, C. L. Monma, and G. L. Nemhauser, eds., Handbooks in Operations
                                       Research and Management Science, Vol. 7, Network Models. Amsterdam, Holland: Elsevier,
                                       1995, pp. 225-330.
                                   22. Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G., and Shmoys, D. B., eds. The Traveling
                                       Salesman Problem. New York: Wiley, 1986.
                                   23. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
                                   24. Ralston, Anthony. “De Bruijn Sequences —A Model Example of the Interaction of Discrete
                                       Mathematics and Computer Science.” Mathematics Magazine 55, no. 3 (May 1982): pp. 131-
                                       143.
                                   25. Read, R. C. “An Introduction to Chromatic Polynomials.” Journal of Combinatorial Theory 4
                                       (1968): pp. 52-71.
                                   26. Roberts, Fred S. Discrete Mathematical Models. Englewood Cliffs, N. J.: Prentice-Hall, 1976.
                                   27. Robertson, N., Sanders, D. P., Seymour, P. D., and Thomas, R. “Efficiently Four-coloring
                                       Planar Graphs.” Proceedings of the 28th ACM Symposium on the Theory of Computation.
                                       ACM Press (1996): pp. 571-575.
                                   28. Robertson, N., Sanders, D. P., Seymour, P. D., and Thomas, R. “The Four-color Theorem.”
                                       Journal of Combinatorial Theory Series B 70 (1997): pp. 166-183.
                                   29. Rouvray, Dennis H. “Predicting Chemistry from Topology.” Scientific American 255, no. 3
                                       (September 1986): pp. 40-47.
                                   30. Seshu, S., and Reed, M. B. Linear Graphs and Electrical Networks. Reading, Mass.: Addison-
                                       Wesley, 1961.
                                   31. Tymoczko, Thomas. “Computers, Proofs and Mathematicians: A Philosophical Investigation
                                       of the Four-Color Proof.” Mathematics Magazine 53, no. 3 (May 1980): pp. 131-138.
                                   32. West, Douglas B. Introduction to Graph Theory, 2nd ed. Upper Saddle River, N.J.: Prentice-
                                       Hall, 2001.

SUPPLEMENTARY EXERCISES

1. Let G be a loop-free undirected graph on n vertices. If G
has 56 edges and $\overline{G}$ has 80 edges, what is n?
2. Determine the number of cycles of length 4 in the hypercube $Q_n$.
3. a) If the edges of $K_6$ are painted either red or blue, prove
    that there is a red triangle or a blue triangle that is a subgraph.
    b) Prove that in any group of six people there must be three
    who are total strangers to one another or three who are mutual friends.
4. a) Let G = (V, E) be a loop-free undirected graph. Recall
    that G is called self-complementary if G and $\overline{G}$ are isomorphic.
    If G is self-complementary (i) determine |E| if |V| = n;
    (ii) prove that G is connected.
    b) Let $n \in \mathbf{Z}^+$, where n = 4k ($k \in \mathbf{Z}^+$) or n = 4k + 1
    ($k \in \mathbf{N}$). Prove that there exists a self-complementary graph
    G = (V, E), where |V| = n.

5. a) Show that the graphs $G_1$ and $G_2$, in Fig. 11.95, are isomorphic.
    b) How many different isomorphisms $f: G_1 \to G_2$ are possible here?

[Figure 11.95: the graph ($G_1$) on the vertices 1, 2, 3, 4, 5, 6 and the graph ($G_2$) on the vertices u, v, w, x, y, z.]

6. Are any of the planar graphs for the five Platonic solids bipartite?
7. a) How many paths of length 5 are there in the complete bipartite
    graph $K_{3,7}$? (Remember that a path such as
    $v_1 \to v_2 \to v_3 \to v_4 \to v_5 \to v_6$ is considered to be the
    same as the path $v_6 \to v_5 \to v_4 \to v_3 \to v_2 \to v_1$.)
    b) How many paths of length 4 are there in $K_{3,7}$?
    c) Let $m, n, p \in \mathbf{Z}^+$ with $2m < n$ and $1 \le p \le 2m$. How
    many paths of length p are there in the complete bipartite
    graph $K_{m,n}$?
8. Let X = {1, 2, 3, ..., n}, where $n \ge 2$. Construct the loop-free
undirected graph G = (V, E) as follows:
• (V): Each two-element subset of X determines a vertex of G.
• (E): If $v_1, v_2 \in V$ correspond to subsets {a, b} and {c, d},
   respectively, of X, draw the edge $\{v_1, v_2\}$ in G when
   $\{a, b\} \cap \{c, d\} = \emptyset$.
    a) Show that G is an isolated vertex when n = 2 and that
    G is disconnected for n = 3, 4.
    b) Show that for $n \ge 5$, G is connected. (In fact, for all
    $v_1, v_2 \in V$, either $\{v_1, v_2\} \in E$ or there is a path of length 2
    connecting $v_1$ and $v_2$.)
    c) Prove that G is nonplanar for $n \ge 5$.
    d) Prove that for $n \ge 8$, G has a Hamilton cycle.
9. If G = (V, E) is an undirected graph, a subset K of V is
called a covering of G if for every edge {a, b} of G either a or
b is in K. The set K is a minimal covering if K − {x} fails to
cover G for each $x \in K$. The number of vertices in a smallest
covering is called the covering number of G.
    a) Prove that if $I \subseteq V$, then I is an independent set in G if
    and only if V − I is a covering of G.
    b) Verify that |V| is the sum of the independence number
    of G (as defined in Exercise 25 for Section 11.5) and its
    covering number.
10. If G = (V, E) is an undirected graph, a subset D of V is
called a dominating set if for all $v \in V$, either $v \in D$ or v is
adjacent to a vertex in D. If D is a dominating set and no proper
subset of D has this property, then D is called minimal. The size
of any smallest dominating set in G is denoted by $\gamma(G)$ and is
called the domination number of G.
    a) If G has no isolated vertices, prove that if D is a minimal
    dominating set, then V − D is a dominating set.
    b) If $I \subseteq V$ is independent, prove that I is a dominating
    set if and only if I is maximal independent.
    c) Show that $\gamma(G) \le \beta(G)$, and that $|V| \le \beta(G)\chi(G)$.
    [Here $\beta(G)$ is the independence number of G, first given
    in Exercise 25 of Section 11.5.]
11. Let G = (V, E) be the undirected connected “ladder
graph” shown in Fig. 11.94. For $n \ge 0$, let $a_n$ denote the number
of ways one can select n of the edges in G so that no two edges
share a common vertex. Find and solve a recurrence relation
for $a_n$.
12. Consider the four comb graphs in parts (i), (ii), (iii), and
(iv) of Fig. 11.96. These graphs have 1 tooth, 2 teeth, 3 teeth,
and n teeth, respectively. For $n \ge 1$, let $a_n$ count the number of
independent subsets in $\{x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n\}$. Find
and solve a recurrence relation for $a_n$.

[Figure 11.96: comb graphs (i), (ii), (iii), and (iv) on the vertices $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$.]

13. Consider the four graphs in parts (i), (ii), (iii), and (iv) of
Fig. 11.97. If $a_n$ counts the number of independent subsets of
$\{x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n\}$, where $n \ge 1$, find and solve a
recurrence relation for $a_n$.
[Figure 11.97: graphs (i), (ii), (iii), and (iv) on the vertices $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$.]

14. For $n \ge 1$, let $a_n = \binom{n}{2}$, the number of edges in $K_n$, and let
$a_0 = 0$. Find the generating function $f(x) = \sum_{n=0}^{\infty} a_n x^n$.
15. For the graph G in Fig. 11.98, answer the following questions.
    a) What are $\gamma(G)$, $\beta(G)$, and $\chi(G)$?
    b) Does G have an Euler circuit or a Hamilton cycle?
    c) Is G bipartite? Is it planar?

[Figure 11.98: the graph G.]

16. a) Suppose that the complete bipartite graph $K_{m,n}$ contains
    16 edges and satisfies $m \le n$. Determine m, n so that
    $K_{m,n}$ possesses (i) an Euler circuit but not a Hamilton cycle;
    (ii) both a Hamilton cycle and an Euler circuit.
    b) Generalize the results of part (a).
17. If G = (V, E) is an undirected graph, any subgraph of G
that is a complete graph is called a clique in G. The number of
vertices in a largest clique in G is called the clique number for
G and is denoted by $\omega(G)$.
    a) How are $\chi(G)$ and $\omega(G)$ related?
    b) Is there any relationship between $\omega(G)$ and $\beta(\overline{G})$?
18. If G = (V, E) is an undirected loop-free graph, the line
graph of G, denoted L(G), is a graph with the set E as vertices,
where we join two vertices $e_1$, $e_2$ in L(G) if and only if $e_1$, $e_2$
are adjacent edges in G.
    a) Find L(G) for each of the graphs in Fig. 11.99.
    b) Assuming that |V| = n and |E| = e, show that L(G)
    has e vertices and
    $\frac{1}{2}\sum_{v \in V} \deg(v)[\deg(v) - 1] = \left[\frac{1}{2}\sum_{v \in V} [\deg(v)]^2\right] - e = \sum_{v \in V} \binom{\deg(v)}{2}$
    edges.

[Figure 11.99: graphs (a) and (b).]

    c) Prove that if G has an Euler circuit, then L(G) has both
    an Euler circuit and a Hamilton cycle.
    d) If $G = K_4$, examine L(G) to show that the converse of
    part (c) is false.
    e) Prove that if G has a Hamilton cycle, then so does L(G).
    f) Examine L(G) for the graph in Fig. 11.99(b) to show
    that the converse of part (e) is false.
    g) Verify that L(G) is nonplanar for $G = K_5$ and $G = K_{3,3}$.
    h) Give an example of a graph G, where G is planar but
    L(G) is not.
19. Explain why each of the following polynomials in $\lambda$ cannot
be a chromatic polynomial.
    a) $\lambda^4 - 5\lambda^3 + 7\lambda^2 - 6\lambda + 3$
    b) $3\lambda^3 - 4\lambda^2 + \lambda$
    c) $\lambda^4 - 3\lambda^3 + 5\lambda^2 - 4\lambda$
20. a) For all $x, y \in \mathbf{Z}^+$, prove that $x^3y - xy^3$ is even.
    b) Let V = {1, 2, 3, ..., 8, 9}. Construct the loop-free
    undirected graph G = (V, E) as follows: For $m, n \in V$,
    $m \ne n$, draw the edge {m, n} in G if 5 divides m + n or
    m − n.
    c) Given any three distinct positive integers, prove that
    there are two of these, say x and y, where 10 divides
    $x^3y - xy^3$.
21. a) For $n \ge 1$, let $P_{n-1}$ denote the path made up of n vertices
    and n − 1 edges. Let $a_n$ be the number of independent
    subsets of vertices in $P_{n-1}$. (The empty subset is considered
    one of these independent subsets.) Find and solve a
    recurrence relation for $a_n$.
    b) Determine the number of independent subsets (of vertices)
    in each of the graphs $G_1$, $G_2$, and $G_3$, of Fig. 11.100.
    c) For each of the graphs $H_1$, $H_2$, and $H_3$, of Fig. 11.101,
    find the number of independent subsets of vertices.
    d) Let G = (V, E) be a loop-free undirected graph with
    $V = \{v_1, v_2, \ldots, v_r\}$ and where there are m independent
    subsets of vertices. The graph G′ = (V′, E′) is constructed
    from G as follows: $V' = V \cup \{x_1, x_2, \ldots, x_s\}$, with no $x_i$
    in V, for all $1 \le i \le s$; and $E' = E \cup \{\{x_i, v_j\} \mid 1 \le i \le s,
    1 \le j \le r\}$. How many subsets of V′ are independent?

[Figure 11.100: the graphs ($G_1$), ($G_2$), and ($G_3$).]

[Figure 11.101: the graphs ($H_1$), ($H_2$), and ($H_3$).]

22. Suppose that G = (V, E) is a loop-free undirected graph.
If G is 5-regular and |V| = 10, prove that G is nonplanar.
                   12
                   Trees

Continuing our study of graph theory, we shall now focus on a special type of graph called
                       a tree. First used in 1847 by Gustav Kirchhoff (1824-1887) in his work on electrical
                       networks, trees were later redeveloped and named by Arthur Cayley (1821-1895). In 1857
                       Cayley used these special graphs in order to enumerate the different isomers of the saturated
                       hydrocarbons $C_nH_{2n+2}$, $n \in \mathbf{Z}^+$.
                          With the advent of digital computers, many new applications were found for trees. Special
                       types of trees are prominent in the study of data structures, sorting, and coding theory, and
                       in the solution of certain optimization problems.

12.1
Definitions, Properties, and Examples

Definition 12.1       Let G = (V, E) be a loop-free undirected graph. The graph G is called a tree† if G is
                       connected and contains no cycles.

In Fig. 12.1 the graph $G_1$ is a tree, but the graph $G_2$ is not a tree because it contains the
                       cycle {a, b}, {b, c}, {c, a}. The graph $G_3$ is not connected, so it cannot be a tree. However,
                       each component of $G_3$ is a tree, and in this case we call $G_3$ a forest.
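
Definition 12.1 can be tested mechanically. The following is a minimal sketch in Python, not part of the original text: the function name `classify` and the three edge lists are hypothetical, chosen only to mimic the situation described for Fig. 12.1 (the actual edges of the figure are not reproduced here). It scans the edges with a union-find structure; an edge whose endpoints are already joined closes a cycle.

```python
def classify(vertices, edges):
    """Return 'tree', 'forest', or 'neither' for a loop-free undirected graph."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return "neither"      # a and b are already joined, so this edge closes a cycle
        parent[ra] = rb           # merge the two components
    components = {find(v) for v in vertices}
    return "tree" if len(components) == 1 else "forest"


# Hypothetical edge sets on the vertices a, ..., f (assumptions, not the edges of Fig. 12.1).
V = ["a", "b", "c", "d", "e", "f"]
g1 = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("e", "f")]
g2 = g1 + [("c", "a")]                                  # adds the cycle {a, b}, {b, c}, {c, a}
g3 = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]   # two components, each a tree

print(classify(V, g1), classify(V, g2), classify(V, g3))
```

Under these assumed edge lists the first graph is reported as a tree, the second as neither, and the third as a forest whose components are trees.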

[Figure 12.1: the graphs ($G_1$), ($G_2$), and ($G_3$), each drawn on the vertices a, b, c, d, e, f.]

†As in the case of graphs, the terminology in the study of trees is not standard and the reader may find some
                       differences from one textbook to another.

When a graph is a tree we write T instead of G to emphasize this structure.
                               In Fig. 12.1 we see that $G_1$ is a subgraph of $G_2$ where $G_1$ contains all the vertices of $G_2$
                            and $G_1$ is a tree. In this situation $G_1$ is a spanning tree for $G_2$. Hence a spanning tree for
                            a connected graph is a spanning subgraph that is also a tree. We may think of a spanning
                            tree as providing minimal connectivity for the graph and as a minimal skeletal framework
                            holding the vertices together. The graph $G_3$ provides a spanning forest for the graph $G_2$.
                                We now examine some properties of trees.

THEOREM 12.1                If a, b are distinct vertices in a tree T = (V, E), then there is a unique path that connects
                            these vertices.
                            Proof: Since T is connected, there is at least one path in T that connects a and b. If there
                            were more, then from two such paths some of the edges would form a cycle. But T has no
                            cycles.

THEOREM 12.2                If G = (V, E) is an undirected graph, then G is connected if and only if G has a spanning
                            tree.
Proof: If G has a spanning tree T, then for every pair a, b of distinct vertices in V a subset of
the edges in T provides a (unique) path between a and b, and so G is connected. Conversely,
if G is connected and G is not a tree, remove all loops from G. If the resulting subgraph G1
is not a tree, then G1 must contain a cycle C1. Remove an edge e1 from C1 and let G2 =
G1 − e1. If G2 contains no cycles, then G2 is a spanning tree for G because G2 contains
all the vertices in G, is loop-free, and is connected. If G2 does contain a cycle (say, C2),
then remove an edge e2 from C2 and consider the subgraph G3 = G2 − e2 = G1 − {e1, e2}.
Once again, if G3 contains no cycles, then we have a spanning tree for G. Otherwise we
continue this procedure a finite number of additional times until we arrive at a spanning
subgraph of G that is loop-free and connected and contains no cycles (and, consequently,
is a spanning tree for G).
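
The edge-deletion procedure in the proof of Theorem 12.2 can be sketched in Python. This is only an illustration under stated assumptions: the function names `is_connected` and `spanning_tree_by_deletion` and the sample graph are hypothetical, and an edge is taken to lie on a cycle exactly when deleting it leaves the (connected) graph connected.

```python
from collections import deque

def is_connected(vertices, edges):
    """Breadth-first search: is every vertex reachable from the first one?"""
    vertices = list(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(vertices)

def spanning_tree_by_deletion(vertices, edges):
    """Drop loops, then delete cycle edges one at a time, as in the proof."""
    if not is_connected(vertices, edges):
        return None  # by Theorem 12.2 a disconnected graph has no spanning tree
    remaining = [e for e in edges if e[0] != e[1]]  # remove all loops
    for e in list(remaining):
        trial = list(remaining)
        trial.remove(e)
        # The edge e is on a cycle exactly when the graph stays connected without it.
        if is_connected(vertices, trial):
            remaining = trial
    return remaining

# Hypothetical sample graph: a triangle a, b, c with a tail and one loop at e.
V = ["a", "b", "c", "d", "e", "f"]
E = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"),
     ("d", "e"), ("d", "f"), ("e", "e")]
tree = spanning_tree_by_deletion(V, E)
print(tree)
```

Every edge that survives the loop was found to be a bridge when it was tested, and later deletions cannot change that, so the result is connected and contains no cycles.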

Figure 12.2 shows three nonisomorphic trees that exist for five vertices. Although they
                            are not isomorphic, they all have the same number of edges, namely, four. This leads us to
                            the following general result.

(T1)                         (T2)                 (T3)

Figure 12.2

THEOREM 12.3                In every tree T = (V, E), |V| = |E| + 1.
Proof: The proof is obtained by applying the alternative form of the Principle of Mathematical
Induction to |E|. If |E| = 0, then the tree consists of a single isolated vertex, as in
Fig. 12.3(a). Here |V| = 1 = |E| + 1. Parts (b) and (c) of the figure verify the result for the
cases where |E| = 1 or 2.

(a)          (b)           (c)
                  Figure 12.3                                                        Figure 12.4

Assume the theorem is true for every tree that contains at most k edges, where k ≥ 0.
Now consider a tree T = (V, E), as in Fig. 12.4, where |E| = k + 1. [The dotted edge(s)
indicates that some of the tree doesn't appear in the figure.] If, for instance, the edge with
endpoints y, z is removed from T, we obtain two subtrees, T1 = (V1, E1) and T2 = (V2, E2),
where |V| = |V1| + |V2| and |E1| + |E2| + 1 = |E|. (One of these subtrees could consist
of just a single vertex if, for example, the edge with endpoints w, x were removed.)
Since 0 ≤ |E1| ≤ k and 0 ≤ |E2| ≤ k, it follows, by the induction hypothesis, that
|Ei| + 1 = |Vi|, for i = 1, 2. Consequently,

$$|V| = |V_1| + |V_2| = (|E_1| + 1) + (|E_2| + 1) = (|E_1| + |E_2| + 1) + 1 = |E| + 1,$$

and the theorem follows by the alternative form of the Principle of Mathematical Induction.

As we examine the trees in Fig. 12.2 we also see that each tree has at least two pendant
vertices (that is, vertices of degree 1). This is also true in general.

THEOREM 12.4      For every tree T = (V, E), if |V| ≥ 2, then T has at least two pendant vertices.
Proof: Let |V| = n ≥ 2. From Theorem 12.3 we know that |E| = n − 1, so by Theorem 11.2
it follows that $2(n - 1) = 2|E| = \sum_{v \in V} \deg(v)$. Since T is connected, we have deg(v) ≥ 1
for all v ∈ V. If there are k pendant vertices in T, then each of the other n − k vertices has
degree at least 2 and

$$2(n - 1) = 2|E| = \sum_{v \in V} \deg(v) \ge k + 2(n - k).$$

From this we see that [2(n − 1) ≥ k + 2(n − k)] ⇒ [(2n − 2) ≥ (k + 2n − 2k)] ⇒
[−2 ≥ −k] ⇒ [k ≥ 2], and the result is consequently established.
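
As a quick numerical check of Theorems 12.3 and 12.4, the following Python sketch builds a tree and tests the two counts. The construction (each new vertex is joined to one earlier vertex, with a fixed seed) and the names `random_tree` and `degrees` are assumptions made for this illustration only.

```python
import random

def random_tree(n, seed=0):
    """Join each vertex v = 1, ..., n-1 to one earlier vertex; the result is a tree."""
    rng = random.Random(seed)
    return [(rng.randrange(v), v) for v in range(1, n)]

def degrees(n, edges):
    """Degree of each of the vertices 0, 1, ..., n-1."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

n = 12
edges = random_tree(n)
deg = degrees(n, edges)
pendant = [v for v in range(n) if deg[v] == 1]

print(len(edges) == n - 1)          # Theorem 12.3: |V| = |E| + 1
print(sum(deg) == 2 * len(edges))   # degree sum equals 2|E|
print(len(pendant) >= 2)            # Theorem 12.4
```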

EXAMPLE 12.1      In Fig. 12.5 we have two trees, each with 14 vertices (labeled with C's and H's) and 13
edges. Each vertex has degree 4 (C, carbon atom) or degree 1 (H, hydrogen atom). Part (b) of
the figure has a carbon atom (C) at the center of the tree. This carbon atom is adjacent to four
vertices, three of which have degree 4. There is no vertex (C atom) in part (a) that possesses
this property, so the two trees are not isomorphic. They serve as models for the two chemical
isomers that correspond with the saturated† hydrocarbon C4H10. Part (a) represents n-butane
(formerly called butane); part (b) represents 2-methyl propane (formerly called isobutane).

[Figure: (a) n-butane, a chain of four C vertices with H vertices attached so that each C has degree 4; (b) 2-methyl propane, a central C vertex adjacent to three C vertices and one H vertex]
Figure 12.5

A second result from chemistry is given in the following example.

EXAMPLE 12.2      If a saturated hydrocarbon [in particular, an acyclic (no cycles), single-bond hydrocarbon,
called an alkane] has n carbon atoms, show that it has 2n + 2 hydrogen atoms.
Considering the saturated hydrocarbon as a tree T = (V, E), let k equal the number of
pendant vertices, or hydrogen atoms, in the tree. Then with a total of n + k vertices, where
each of the n carbon atoms has degree 4, we find that

$$4n + k = \sum_{v \in V} \deg(v) = 2|E| = 2(|V| - 1) = 2(n + k - 1),$$

and

$$4n + k = 2(n + k - 1) \Rightarrow k = 2n + 2.$$
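
The count in Example 12.2 can be checked on one family of alkanes. The sketch below is an assumption-laden illustration: it builds only the straight-chain alkane on n carbon atoms (the helper name `straight_chain_alkane` and the vertex labels C1, H1, ... are hypothetical), attaching hydrogens until every carbon has degree 4.

```python
def straight_chain_alkane(n):
    """Return (edges, number of hydrogens) for a chain of n carbon atoms."""
    edges = [(f"C{i}", f"C{i + 1}") for i in range(1, n)]
    h = 0
    for i in range(1, n + 1):
        carbon_neighbors = (i > 1) + (i < n)   # 0, 1, or 2 adjacent carbons
        for _ in range(4 - carbon_neighbors):  # fill up to degree 4 with hydrogens
            h += 1
            edges.append((f"C{i}", f"H{h}"))
    return edges, h

for n in range(1, 7):
    edges, h = straight_chain_alkane(n)
    # hydrogens = 2n + 2, and |E| = |V| - 1 with |V| = n + h
    print(n, h, len(edges))
```

For n = 4 this gives 10 hydrogens and 13 edges, in agreement with the tree for n-butane described in Example 12.1.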

We close this section with a theorem that provides several different ways to characterize
                               trees.

THEOREM 12.5                   The following statements are equivalent for a loop-free undirected graph G = (V, E).

a) G is a tree.
b) G is connected, but the removal of any edge from G disconnects G into two subgraphs
   that are trees.
c) G contains no cycles, and |V| = |E| + 1.
d) G is connected, and |V| = |E| + 1.
e) G contains no cycles, and if a, b ∈ V with {a, b} ∉ E, then the graph obtained by
   adding edge {a, b} to G has precisely one cycle.

†The adjective saturated is used here to indicate that for the number of carbon atoms present in the molecule,
we have the maximum number of hydrogen atoms.

Proof: We shall prove that (a) ⇒ (b), (b) ⇒ (c), and (c) ⇒ (d), leaving to the reader the
proofs for (d) ⇒ (e) and (e) ⇒ (a).

[(a) ⇒ (b)]: If G is a tree, then G is connected. So let e = {a, b} be any edge of G.
Then if G − e is connected, there are at least two paths in G from a to b. But this
contradicts Theorem 12.1. Hence G − e is disconnected and so the vertices in G − e
may be partitioned into two subsets: (1) vertex a and those vertices that can be reached
from a by a path in G − e; and (2) vertex b and those vertices that can be reached from
b by a path in G − e. These two connected components are trees because a loop or cycle
in either component would also be in G.
[(b) ⇒ (c)]: If G contains a cycle, then let e = {a, b} be an edge of the cycle. But
then G − e is connected, contradicting the hypothesis in part (b). So G contains no
cycles, and since G is a loop-free connected undirected graph, we know that G is a tree.
Consequently, it follows from Theorem 12.3 that |V| = |E| + 1.
[(c) ⇒ (d)]: Let κ(G) = r and let G1, G2, ..., Gr be the components of G. For 1 ≤
i ≤ r, select a vertex v_i ∈ G_i and add the r − 1 edges {v_1, v_2}, {v_2, v_3}, ..., {v_{r−1}, v_r}
to G to form the graph G′ = (V, E′), which is a tree. Since G′ is a tree, we know that
|V| = |E′| + 1 because of Theorem 12.3. But from part (c), |V| = |E| + 1, so |E| = |E′|
and r − 1 = 0. With r = 1, it follows that G is connected.
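
Characterizations (c) and (d) of Theorem 12.5 each give a simple computational test for a tree. The following Python sketch uses a union-find structure, which is a choice made here and not something taken from the text; the function names and sample graphs are likewise hypothetical.

```python
def components_and_cycle(vertices, edges):
    """Return (number of components, whether some edge closes a cycle)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    count = len(parent)
    has_cycle = False
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            has_cycle = True   # both endpoints already joined: this edge closes a cycle
        else:
            parent[ra] = rb
            count -= 1
    return count, has_cycle

def is_tree_c(vertices, edges):
    """Statement (c): no cycles and |V| = |E| + 1."""
    _, has_cycle = components_and_cycle(vertices, edges)
    return not has_cycle and len(vertices) == len(edges) + 1

def is_tree_d(vertices, edges):
    """Statement (d): connected and |V| = |E| + 1."""
    count, _ = components_and_cycle(vertices, edges)
    return count == 1 and len(vertices) == len(edges) + 1

path = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
cycle = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
forest = ([1, 2, 3, 4], [(1, 2), (3, 4)])
for vs, es in (path, cycle, forest):
    print(is_tree_c(vs, es), is_tree_d(vs, es))
```

As the theorem predicts, the two tests agree on every input.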

EXERCISES 12.1

1. a) Draw the graphs of all nonisomorphic trees on six vertices.
   b) How many isomers does hexane (C6H14) have?
2. Let T1 = (V1, E1), T2 = (V2, E2) be two trees where |E1| = 17 and |V2| = 2|V1|. Determine |V1|, |V2|, and |E2|.
3. a) Let F1 = (V1, E1) be a forest of seven trees where |E1| = 40. What is |V1|?
   b) If F2 = (V2, E2) is a forest with |V2| = 62 and |E2| = 51, how many trees determine F2?
4. If G = (V, E) is a forest with |V| = v, |E| = e, and κ components (trees), what relationship exists among v, e, and κ?
5. What kind of trees have exactly two pendant vertices?
6. a) Verify that all trees are planar.
   b) Derive Theorem 12.3 from part (a) and Euler's Theorem for planar graphs.
7. Give an example of an undirected graph G = (V, E) where |V| = |E| + 1 but G is not a tree.
8. a) If a tree has four vertices of degree 2, one vertex of degree 3, two of degree 4, and one of degree 5, how many pendant vertices does it have?
   b) If a tree T = (V, E) has v2 vertices of degree 2, v3 vertices of degree 3, ..., and vm vertices of degree m, what are |V| and |E|?
9. If G = (V, E) is a loop-free undirected graph, prove that G is a tree if there is a unique path between any two vertices of G.
10. The connected undirected graph G = (V, E) has 30 edges. What is the maximum value that |V| can have?
11. Let T = (V, E) be a tree with |V| = n ≥ 2. How many distinct paths are there (as subgraphs) in T?
12. Let G = (V, E) be a loop-free connected undirected graph where V = {v1, v2, v3, ..., vn}, n ≥ 2, deg(v1) = 1, and deg(vi) ≥ 2 for 2 ≤ i ≤ n. Prove that G must have a cycle.
13. Find two nonisomorphic spanning trees for the complete bipartite graph K2,3. How many nonisomorphic spanning trees are there for K2,3?
14. For n ∈ Z+, how many nonisomorphic spanning trees are there for K2,n?
15. Determine the number of nonidentical (though some may be isomorphic) spanning trees that exist for each of the graphs shown in Fig. 12.6.

Figure 12.6

16. For each graph in Fig. 12.7, determine how many nonidentical (though some may be isomorphic) spanning trees exist.

Figure 12.7

17. Let T = (V, E) be a tree where |V| = n. Suppose that for each v ∈ V, deg(v) = 1 or deg(v) ≥ m, where m is a fixed positive integer and m ≥ 2.
   a) What is the smallest value possible for n?
   b) Prove that T has at least m pendant vertices.
18. Suppose that T = (V, E) is a tree with |V| = 1000. What is the sum of the degrees of all the vertices in T?
19. Let G = (V, E) be a loop-free connected undirected graph. Let H be a subgraph of G. The complement of H in G is the subgraph of G made up of those edges in G that are not in H (along with the vertices incident to these edges).
   a) If T is a spanning tree of G, prove that the complement of T in G does not contain a cut-set of G.
   b) If C is a cut-set of G, prove that the complement of C in G does not contain a spanning tree of G.
20. Complete the proof of Theorem 12.5.
21. A labeled tree is one wherein the vertices are labeled. If the tree has n vertices, then {1, 2, 3, ..., n} is used as the set of labels. We find that two trees that are isomorphic without labels may become nonisomorphic when labeled. In Fig. 12.8, the first two trees are isomorphic as labeled trees. The third tree is isomorphic to the other two if we ignore the labels; as a labeled tree, however, it is not isomorphic to either of the other two.

(i)          (ii)          (iii)
Figure 12.8

   The number of nonisomorphic trees with n labeled vertices can be counted by setting up a one-to-one correspondence between these trees and the n^(n−2) sequences (with repetitions allowed) x_1, x_2, ..., x_{n−2} whose entries are taken from {1, 2, 3, ..., n}. If T is one such labeled tree, we use the following algorithm to find its corresponding sequence, called the Prüfer code for the tree. (Here T has at least one edge.)
   Step 1: Set the counter i to 1.
   Step 2: Set T(1) = T.
   Step 3: Since a tree has at least two pendant vertices, select the pendant vertex in T(i) with the smallest label y_i. Now remove the edge {x_i, y_i} from T(i) and use x_i for the ith component of the sequence.
   Step 4: If i = n − 2, we have the sequence corresponding to the given labeled tree T(1). If i ≠ n − 2, increase i by 1, set T(i) equal to the resulting subtree obtained in step (3), and return to step (3).
   a) Find the six-digit sequence (Prüfer code) for trees (i) and (iii) in Fig. 12.8.
   b) If v is a vertex in T, show that the number of times the label on v appears in the Prüfer code x_1, x_2, ..., x_{n−2} is deg(v) − 1.
   c) Reconstruct the labeled tree on eight vertices that is associated with the Prüfer code 2, 6, 5, 5, 5, 5.
   d) Develop an algorithm for reconstructing a tree from a given Prüfer code x_1, x_2, ..., x_{n−2}.
22. Let n ∈ Z+, n ≥ 3. If v is a vertex in Kn, how many of the n^(n−2) spanning trees of Kn have v as a pendant vertex?
23. Characterize the trees whose Prüfer codes
   a) contain only one integer, or
   b) have distinct integers in all positions.
24. Show that the number of labeled trees with n vertices, k of which are pendant vertices, is $\binom{n}{k}(n-k)!\,S(n-2, n-k) = (n!/k!)\,S(n-2, n-k)$, where S(n − 2, n − k) is a Stirling number of the second kind. (This result was first established in 1959 by A. Rényi.)
25. Let G = (V, E) be the undirected graph in Fig. 12.9. Show that the edge set E can be partitioned as E1 ∪ E2 so that the subgraphs G1 = (V, E1), G2 = (V, E2) are isomorphic spanning trees of G.

Figure 12.9
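
The four-step encoding algorithm stated in Exercise 21 can be sketched in Python as follows. The function name `prufer_code` and the sample tree are hypothetical; the sample is not one of the trees in Fig. 12.8, and the sketch covers only the encoding direction.

```python
def prufer_code(n, edges):
    """Encode a labeled tree on the vertices 1..n following Steps 1-4 of Exercise 21."""
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    code = []
    for _ in range(n - 2):
        # Step 3: pendant vertex with the smallest label, and its unique neighbor.
        y = min(v for v in adj if len(adj[v]) == 1)
        x = next(iter(adj[y]))
        code.append(x)
        # Remove the edge {x, y}; the next pass works on the resulting subtree.
        adj[x].discard(y)
        del adj[y]
    return code

# Hypothetical labeled tree on six vertices.
sample = [(1, 3), (2, 3), (3, 4), (4, 5), (4, 6)]
print(prufer_code(6, sample))
```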

12.2  Rooted Trees
                               We turn now to directed trees. We find a variety of applications for a special type of directed
                               tree called a rooted tree.

Definition 12.2           If G is a directed graph, then G is called a directed tree if the undirected graph associated
                               with G is a tree. When        G is a directed tree, G is called a rooted tree if there is a unique
                               vertex r, called the root, in G with the in degree of r = id(r) = 0, and for all other vertices
                               v, the in degree of v = id(v) = 1.

The tree in part (a) of Fig. 12.10 is directed but not rooted; the tree in part (b) is rooted
                               with root r.

(a)                         (b)
Figure 12.10

We draw rooted trees as in Fig. 12.10(b) but with the directions understood as going
from the upper level to the lower level, so that the arrows aren't needed. In a rooted tree,
a vertex with out degree 0 is called a leaf (or terminal vertex). Vertices u, v, x, y, z are
leaves in Fig. 12.10(b). All other vertices are called branch nodes (or internal vertices).
Consider the vertex s in this rooted tree [Fig. 12.10(b)]. The path from the root, r, to s is
of length 2, so we say that s is at level 2 in the tree, or that s has level number 2. Similarly, x
is at level 3, whereas y has level number 4. We call s a child of n, and we call n the parent
of s. Vertices w, y, and z are considered descendants of s, n, and r, while s, n, and r are
called ancestors of w, y, and z. In general, if v1 and v2 are vertices in a rooted tree and v1
has the smaller level number, then v1 is an ancestor of v2 (or v2 is a descendant of v1) if
there is a (directed) path from v1 to v2. Two vertices with a common parent are referred to
as siblings. Such is the case for vertices g and s, whose common parent is vertex n. Finally,
if v1 is any vertex of the tree, the subtree at v1 is the subgraph induced by the root v1 and
all of its descendants (there may be none).
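
The vocabulary above (root, level number, leaf, descendant) translates directly into code. The Python sketch below uses a small hypothetical rooted tree stored as a parent-to-children dictionary; it is not a transcription of Fig. 12.10(b). The root test checks only the in-degree conditions of Definition 12.2 and assumes the underlying undirected graph is already a tree.

```python
from collections import deque

# Hypothetical rooted tree: each vertex maps to its children, left to right.
children = {
    "r": ["n", "p"], "n": ["g", "s"], "p": [], "g": [],
    "s": ["w", "x"], "w": ["y", "z"], "x": [], "y": [], "z": [],
}

def find_root(children):
    """Return the unique vertex of in degree 0 if all others have in degree 1."""
    indeg = {v: 0 for v in children}
    for kids in children.values():
        for c in kids:
            indeg[c] += 1
    roots = [v for v, d in indeg.items() if d == 0]
    others_ok = all(d == 1 for v, d in indeg.items() if v not in roots)
    return roots[0] if len(roots) == 1 and others_ok else None

def level_numbers(children, root):
    """Level number = length of the path from the root."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for c in children[u]:
            level[c] = level[u] + 1
            queue.append(c)
    return level

def descendants(children, v):
    """All vertices reachable from v by a directed path of length at least 1."""
    result = set()
    for c in children[v]:
        result.add(c)
        result |= descendants(children, c)
    return result

root = find_root(children)
levels = level_numbers(children, root)
leaves = sorted(v for v, kids in children.items() if not kids)
print(root, levels["s"], levels["y"], leaves, sorted(descendants(children, "s")))
```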

EXAMPLE 12.3      In Fig. 12.11(a) a rooted tree is used to represent the table of contents of a three-chapter
(C1, C2, C3) book. Vertices with level number 2 are for sections within a chapter; those
at level 3 represent subsections within a section. Part (b) of the figure displays the natural
order for the table of contents of this book.

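[Figure, part (a): rooted tree with root Book and children C1, C2, C3; C1 has children S1.1, S1.2; C3 has children S3.1, S3.2, S3.3; S3.2 has children S3.2.1, S3.2.2]

Part (b):
Book
   C1
      S1.1
      S1.2
   C2
   C3
      S3.1
      S3.2
         S3.2.1
         S3.2.2
      S3.3

(a)                                                     (b)
Figure 12.11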

The tree in Fig. 12.11(a) suggests an order for the vertices if we examine the subtrees
at C1, C2, and C3 from left to right. (This order will recur in this section, in a more
general context.) We now consider a second example that provides such an order.

EXAMPLE 12.4      In the tree T shown in Fig. 12.12, the edges (or branches, as they are often called) leaving
each internal vertex are ordered from left to right. Hence T is called an ordered rooted tree.

Figure 12.12

We label the vertices for this tree by the following algorithm.
Step 1: First assign the root the label (or address) 0.
Step 2: Next assign the positive integers 1, 2, 3, ... to the vertices at level 1, going
from left to right.
Step 3: Now let v be an internal vertex at level n ≥ 1, and let v1, v2, ..., vk denote
the children of v (going from left to right). If a is the label assigned to vertex v,
assign the labels a.1, a.2, ..., a.k to the children v1, v2, ..., vk, respectively.

Consequently, each vertex in T, other than the root, has a label of the form
$a_1.a_2.a_3.\cdots.a_n$ if and only if that vertex has level number n. This is known as the universal
address system.
This system provides a way to order all vertices in T. If u and v are two vertices
in T with addresses b and c, respectively, we define b < c if (a) $b = a_1.a_2.\cdots.a_m$ and
$c = a_1.a_2.\cdots.a_m.a_{m+1}.\cdots.a_n$, with m < n; or (b) $b = a_1.a_2.\cdots.a_m.x_1.\cdots.y$ and
$c = a_1.a_2.\cdots.a_m.x_2.\cdots.z$, where $x_1, x_2 \in \mathbf{Z}^+$ and $x_1 < x_2$.
For the tree under consideration, this ordering yields

0 < 1 < 1.1 < 1.2 < 1.2.1 < 1.2.2 < 1.2.3 < 1.2.3.1 < 1.2.3.2 < 1.3 < 1.4 < 2 < 2.1 < 2.2 < 2.2.1 < 3 < 3.1 < 3.2.

Since this resembles the alphabetical ordering in a dictionary, the order is called the lexicographic,
or dictionary, order.
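
The ordering of universal addresses can be sketched in Python by splitting each address at the dots and comparing the integer components. The function names and the sample addresses are assumptions chosen to resemble the labels used for Fig. 12.12; the root address 0 is simply treated as the smallest first component.

```python
def parse(address):
    """Turn an address such as '1.2.3' into the tuple (1, 2, 3)."""
    return tuple(int(part) for part in address.split("."))

def precedes(b, c):
    """True when address b comes before address c in lexicographic order."""
    b, c = parse(b), parse(c)
    # Case (a): b is a proper initial part of c.
    if len(b) < len(c) and c[:len(b)] == b:
        return True
    # Case (b): compare at the first position where the addresses differ.
    for x1, x2 in zip(b, c):
        if x1 != x2:
            return x1 < x2
    return False

scrambled = ["2.2.1", "1.2.3.1", "3", "0", "1.2", "1.10", "1.3", "2", "1", "1.2.3"]
ordered = sorted(scrambled, key=parse)  # tuple comparison gives the same order
print(ordered)
```

Comparing integer tuples instead of raw strings matters once a component has more than one digit: as strings, "1.10" would sort before "1.2".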

We now consider an application of a rooted tree in the study of computer science.

EXAMPLE 12.5      a) A rooted tree is a binary rooted tree if for each vertex v, od(v) = 0, 1, or 2; that is,
if v has at most two children. If od(v) = 0 or 2 for all v ∈ V, then the rooted tree is
called a complete binary tree. Such a tree can represent a binary operation, as in parts
(a) and (b) of Fig. 12.13. To avoid confusion when dealing with a noncommutative
operation ∘, we label the root as ∘ and require the result to be a ∘ b, where a is the left
child, and b the right child, of the root.

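[Figure: (a) root + with left child a and right child b, representing a + b; (b) root − with left child a and right child b, representing a − b]

Figure 12.13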

b) In Fig. 12.14 we extend the ideas presented in Fig. 12.13 in order to construct the
binary rooted tree for the algebraic expression

((7 − a)/5) * ((a + b) ↑ 3),

where * denotes multiplication and ↑ denotes exponentiation. Here we construct this
tree, as shown in part (e) of the figure, from the bottom up. First, a subtree for the
expression 7 − a is constructed in part (a) of Fig. 12.14. This is then incorporated (as
the left subtree for /) in the binary rooted tree for the expression (7 − a)/5 in Fig.
12.14(b). Then, in a similar way, the binary rooted trees in parts (c) and (d) of the figure
are constructed for the expressions a + b and (a + b) ↑ 3, respectively. Finally, the
two subtrees in parts (b) and (d) are used as the left and right subtrees, respectively, for
* and give us the binary rooted tree [in Fig. 12.14(e)] for ((7 − a)/5) * ((a + b) ↑ 3).

[Figure: parts (a) through (e), the binary rooted trees for 7 − a, (7 − a)/5, a + b, (a + b) ↑ 3, and ((7 − a)/5) * ((a + b) ↑ 3)]
Figure 12.14

The same ideas are used in Fig. 12.15, where we find the binary rooted trees for the
algebraic expressions

(a − (3/b)) + 5 [in part (a)]     and     a − (3/(b + 5)) [in part (b)].
c) In evaluating t + (uv)/(w + x − y^z) in certain procedural languages, we write the
expression in the form t + (u * v)/(w + x − y ↑ z). When the computer evaluates
this expression, it performs the binary operations (within each parenthesized part)
according to a hierarchy of operations whereby exponentiation precedes multiplication
and division, which in turn precede addition and subtraction. In Fig. 12.16 we number
the operations in the order in which they are performed by the computer. For the
computer to evaluate this expression, it must somehow scan the expression in order
to perform the operations in the order specified.

Figure 12.15                                        Figure 12.16
Instead of scanning back and forth continuously, however, the machine converts
the expression into a notation that is independent of parentheses. This is known as
Polish notation, in honor of the Polish (actually Ukrainian) logician Jan Łukasiewicz
(1878-1956). Here the infix notation a ∘ b for a binary operation ∘ becomes ∘ab, the
prefix (or Polish) notation. The advantage is that the expression in Fig. 12.16 can be
rewritten without parentheses as

+ t / * u v + w − x ↑ y z,

where the evaluation proceeds from right to left. When a binary operation is encountered,
it is performed on the two operands to its right. The result is then treated as one
of the operands for the next binary operation encountered as we continue to the left.
For instance, given the assignments t = 4, u = 2, v = 3, w = 1, x = 9, y = 2, z = 3,
the following steps take place in the evaluation of the expression

+ t / * u v + w − x ↑ y z.

1) + 4 / * 2 3 + 1 − 9 ↑ 2 3        [2 ↑ 3 = 8]
2) + 4 / * 2 3 + 1 − 9 8            [9 − 8 = 1]
3) + 4 / * 2 3 + 1 1                [1 + 1 = 2]
4) + 4 / * 2 3 2                    [2 * 3 = 6]
5) + 4 / 6 2                        [6/2 = 3]
6) + 4 3                            [4 + 3 = 7]
So the value of the given expression for the preceding assignments is 7.
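
The right-to-left evaluation just traced can be sketched with a stack in Python. This is a minimal illustration, not a description of how any particular compiler works; the name `eval_prefix` is hypothetical, and the ASCII tokens `-` and `^` stand in for the subtraction and exponentiation symbols of the text.

```python
import operator

OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.truediv, "^": operator.pow,
}

def eval_prefix(tokens, values):
    """Scan a prefix expression from right to left, keeping operands on a stack."""
    stack = []
    for tok in reversed(tokens):
        if tok in OPS:
            left = stack.pop()    # nearest operand to the right of the operator
            right = stack.pop()   # the operand after that
            stack.append(OPS[tok](left, right))
        else:
            stack.append(values[tok])
    return stack.pop()

tokens = "+ t / * u v + w - x ^ y z".split()
values = {"t": 4, "u": 2, "v": 3, "w": 1, "x": 9, "y": 2, "z": 3}
result = eval_prefix(tokens, values)
print(result)  # 7.0 (a float, because "/" is true division)
```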

The use of Polish notation is important for the compilation of computer programs and
can be obtained by representing a given expression by a rooted tree, as shown in Fig. 12.17.
Here each variable (or constant) is used to label a leaf of the tree. Each internal vertex is
labeled by a binary operation whose left and right operands are the left and right subtrees
it determines. Starting at the root, as we traverse the tree from top to bottom and left to
right, as shown in Fig. 12.17, we find the Polish notation by writing down the labels of the
vertices in the order in which they are visited.

Figure 12.17
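As a small sketch of how the Polish notation is read off an expression tree, the Python fragment below lists the vertex labels in the order just described. The nested-tuple encoding and the name `to_prefix` are assumptions of the sketch, `^` again stands in for the exponentiation arrow, and the tree is entered by hand so that it matches the prefix string given earlier.

```python
def to_prefix(tree):
    """Return the labels of an expression tree, root first, then left, then right.

    A leaf is a string. An internal vertex is a tuple
    (operation, left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return [tree]
    operation, left, right = tree
    return [operation] + to_prefix(left) + to_prefix(right)


# Hand-entered tree for t + (u * v) / (w + x - y ^ z).
expression_tree = (
    "+",
    "t",
    ("/", ("*", "u", "v"), ("+", "w", ("-", "x", ("^", "y", "z")))),
)

print(" ".join(to_prefix(expression_tree)))  # + t / * u v + w - x ^ y z
```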

The last two examples illustrate the importance of order. Several methods exist for
                                 systematically ordering the vertices in a tree. Two of the most prevalent in the study of data
                                 structures are the preorder and postorder. These are defined recursively in the following
                                 definition.

Definition 12.3            Let T = (V, E) be a rooted tree with root r. If T has no other vertices, then the root by itself
                                 constitutes the preorder and postorder traversals of T. If |V| > 1, let $T_1, T_2, T_3, \ldots, T_k$
                                 denote the subtrees of T as we go from left to right (as in Fig. 12.18).

Figure 12.18 (the root r with its subtrees $T_1, T_2, T_3, \ldots, T_k$, from left to right)

                                    a) The preorder traversal of T first visits r and then traverses the vertices of $T_1$ in
                                       preorder, then the vertices of $T_2$ in preorder, and so on until the vertices of $T_k$ are
                                       traversed in preorder.
                                    b) The postorder traversal of T traverses in postorder the vertices of the subtrees $T_1$,
                                       $T_2, \ldots, T_k$ and then visits the root.

We demonstrate these ideas in the following example.

EXAMPLE 12.6     Consider the rooted tree shown in Fig. 12.19.

Figure 12.19

a) Preorder: After visiting vertex 1 we visit the subtree $T_1$ rooted at vertex 2. After
                    visiting vertex 2 we proceed to the subtree rooted at vertex 5, and after visiting vertex
                    5 we go to the subtree rooted at vertex 11. This subtree has no other vertices, so we
                    visit vertex 11 and then return to vertex 5 from which we visit, in succession, vertices
                    12, 13, and 14. Following this we backtrack (14 to 5 to 2 to 1) to the root and then
                    visit the vertices in the subtree $T_2$ in the preorder 3, 6, 7. Finally, after returning to the
                    root for the last time, we traverse the subtree $T_3$ in the preorder 4, 8, 9, 10, 15, 16, 17.
                    Hence the preorder listing of the vertices in this tree is 1, 2, 5, 11, 12, 13, 14, 3, 6, 7,
                    4, 8, 9, 10, 15, 16, 17.
                         In this ordering we start at the root and build a path as far as we can. At each level
                    we go to the leftmost vertex (not previously visited) at the next level, until we reach a
                    leaf $\ell$. Then we backtrack to the parent p of this leaf $\ell$ and visit $\ell$’s sibling s (and the
                    subtree that s determines) directly on its right. If no such sibling s exists, we backtrack
                    to the grandparent g of the leaf $\ell$ and visit, if it exists, a vertex u that is the sibling of
                    p directly to its right in the tree. Continuing in this manner, we eventually visit (the
                    first time each one is encountered) all of the vertices in the tree.
                         The vertices in Figs. 12.11(a), 12.12, and 12.17 are visited in preorder. The preorder
                    traversal for the tree in Fig. 12.11(a) provides the ordering in Fig. 12.11(b). The
                    lexicographic order in Example 12.4 arises from the preorder traversal of the tree in
                    Fig. 12.12.
                 b) Postorder: For the postorder traversal of a tree, we start at the root r and build the
                    longest path, going to the leftmost child of each internal vertex whenever we can. When
                    we arrive at a leaf $\ell$ we visit this vertex and then backtrack to its parent p. However,
                    we do not visit p until after all of its descendants are visited. The next vertex we visit is
                    found by applying the same procedure at p that was originally applied at r in obtaining
                    $\ell$, except that now we first go from p to the sibling of $\ell$ directly to the right (of $\ell$).
                    And at no time is any vertex visited more than once or before any of its descendants.
                         For the tree given in Fig. 12.19, the postorder traversal starts with a postorder
                    traversal of the subtree $T_1$ rooted at vertex 2. This yields the listing 11, 12, 13, 14, 5,
                    2. We proceed to the subtree $T_2$, and the postorder listing continues with 6, 7, 3. Then
                    for $T_3$ we find 8, 9, 15, 16, 17, 10, 4 as the postorder listing. Finally, vertex 1 is visited.
                    Consequently, for this tree, the postorder traversal visits the vertices in the order 11,
                    12, 13, 14, 5, 2, 6, 7, 3, 8, 9, 15, 16, 17, 10, 4, 1.
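The two recursive definitions translate almost word for word into code. The sketch below is a minimal Python illustration, not part of the text. The child lists in `CHILDREN` are an assumption: they are reconstructed from the narrative of this example, since the drawing of Fig. 12.19 is not reproduced here.

```python
# Children of each internal vertex, listed from left to right. This table is
# reconstructed from the traversal narrative of Example 12.6 (an assumption).
CHILDREN = {
    1: [2, 3, 4],
    2: [5],
    3: [6, 7],
    4: [8, 9, 10],
    5: [11, 12, 13, 14],
    10: [15, 16, 17],
}


def preorder(vertex, children):
    """Visit the vertex first, then traverse its subtrees from left to right."""
    listing = [vertex]
    for child in children.get(vertex, []):
        listing.extend(preorder(child, children))
    return listing


def postorder(vertex, children):
    """Traverse the subtrees from left to right, then visit the vertex."""
    listing = []
    for child in children.get(vertex, []):
        listing.extend(postorder(child, children))
    listing.append(vertex)
    return listing


print(preorder(1, CHILDREN))
print(postorder(1, CHILDREN))
```

With the reconstructed child lists, the two printed listings agree with the preorder and postorder listings obtained in parts (a) and (b).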

In the case of binary rooted trees, a third type of tree traversal called the inorder traversal
                                may be used. Here we do not consider subtrees as first and second, but rather in terms of
                                left and right. The formal definition is recursive, as were the definitions of preorder and
                                postorder traversals.

Definition 12.4           Let T = (V, E) be a binary rooted tree with vertex r the root.
                                   1) If |V| = 1, then the vertex r constitutes the inorder traversal of T.
                                   2) When |V| > 1, let $T_L$ and $T_R$ denote the left and right subtrees of T. The inorder
                                      traversal of T first traverses the vertices of $T_L$ in inorder, then it visits the root r, and
                                      then it traverses, in inorder, the vertices of $T_R$.

We realize that here a left or right subtree may be empty. Also, if v is a vertex in such a
                                tree and od(v) = 1, then if w is the child of v, we must distinguish between w’s being the
                                left child and its being the right child.

EXAMPLE 12.7     As a result of the previous comments, the two binary rooted trees shown in Fig. 12.20
                                are not considered the same, when viewed as ordered trees. As rooted binary trees they
                                are the same. (Each tree has the same set of vertices and the same set of directed edges.)
                                However, when we consider the additional concept of left and right children, we see that
                                in part (a) of the figure vertex v has right child a, whereas in part (b) vertex a is the left
                                child of v. Consequently, when the difference between left and right children is taken into
                                consideration, these trees are no longer viewed as the same tree.

(a)                          (b)
                                                   Figure 12.20

In visiting the vertices for the tree in part (a) of Fig. 12.20, we first visit in inorder the
                                left subtree of the root r. This subtree consists of the root v and its right child a. (Here the
                                left child is null, or nonexistent.) Since v has no left subtree, we visit in inorder vertex v
                                and then its right subtree, namely, a. Having traversed the left subtree of r, we now visit
                                vertex r and then traverse, in inorder, the vertices in the right subtree of r. This results in our
                                visiting first vertex b (because b has no left subtree) and then vertex c. Hence the inorder
                                listing for the tree shown in Fig. 12.20(a) is v, a, r, b, c.
                                   When we consider the tree in part (b) of the figure, once again we start by visiting, in
                                inorder, the vertices in the left subtree of the root r. Here, however, this left subtree consists
                                of vertex v (the root of the subtree) and its left child a. (In this case, the right child of v is
                                null, or nonexistent.) Therefore this inorder traversal first visits vertex a (the left subtree
                                of v), and then vertex v. Since v has no right subtree, we are now finished visiting the left
                                subtree of r, in inorder. So next the root r is visited, and then the vertices of the right subtree
of r are traversed, in inorder. This results in the inorder listing a, v, r, b, c for the tree shown
                in Fig. 12.20(b).
                    We should note, however, that for the preorder traversal in this particular example,* the
                same result is obtained for both trees:

Preorder listing:      r, v, a, b, c.

Likewise, this particular example is such that the postorder traversal for either tree gives us
                the following:

Postorder listing:        a, v, c, b, r.

It is only for the inorder traversal, with its distinctions between left and right children and
                between left and right subtrees, that a difference occurs. For the trees in parts (a) and (b) of
                Fig. 12.20 we found the respective inorder listings to be

(a) v, a, r, b, c              and          (b) a, v, r, b, c.
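The effect of distinguishing left from right children can be checked with a short Python sketch. The `Node` class and the three traversal functions are illustrative names chosen here, not notation from the text. The two trees are entered from the description of Fig. 12.20 given in this example, with c taken to be the right child of b because b is said to have no left subtree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A vertex of a binary rooted tree with separate left and right children."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node):
    if node is None:
        return []
    return [node.label] + preorder(node.left) + preorder(node.right)


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.label] + inorder(node.right)


def postorder(node):
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.label]


# Part (a): a is the right child of v.  Part (b): a is the left child of v.
tree_a = Node("r", Node("v", right=Node("a")), Node("b", right=Node("c")))
tree_b = Node("r", Node("v", left=Node("a")), Node("b", right=Node("c")))

for name, tree in (("(a)", tree_a), ("(b)", tree_b)):
    print(name, preorder(tree), inorder(tree), postorder(tree))
```

Only the inorder listings differ between the two trees, because moving the single child a from right to left changes its position relative to v in inorder alone.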

EXAMPLE 12.8     If we apply the inorder traversal to the binary rooted tree shown in Fig. 12.21, we find that
                the inorder listing for the vertices is p, j, q, f, c, k, g, a, d, r, b, h, s, m, e, i, t, n, u.

Figure 12.21

Our next example shows how the preorder traversal can be used in a counting problem
                dealing with binary trees.

EXAMPLE 12.9†    For $n \ge 0$, consider the complete binary trees on 2n + 1 vertices. The cases for $0 \le n \le 3$
                are shown in Fig. 12.22. Here we distinguish left from right. So, for example, the two
                complete binary trees for n = 2 are considered distinct. [If we do not distinguish left from
                right, these trees are (isomorphic and) no longer counted as two different trees.]

* A note of caution! If we interchange the order of the two existing children (of a certain parent) in a binary
                rooted tree, then a change results in the preorder, postorder, and inorder traversals. If one child is “null,” however,
                then only the inorder traversal changes.
† This example uses material developed in the optional Sections 1.5 and 10.5. It may be omitted with no loss
                of continuity.

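Figure 12.22 (tree drawings not reproduced; under each tree the figure lists its preorder traversal and its list of L’s and R’s)

     n     Preorder listing           List of L’s and R’s
     0     r
     1     r, a, b                    L, R
     2     r, a, b, c, d              L, R, L, R
     2     r, a, c, d, b              L, L, R, R
     3     r, a, c, d, b, e, f        L, L, R, R, L, R
     3     r, a, c, e, f, d, b        L, L, L, R, R, R
     3     r, a, c, d, e, f, b        L, L, R, L, R, R
     3     r, a, b, c, d, e, f        L, R, L, R, L, R
     3     r, a, b, c, e, f, d        L, R, L, L, R, R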

Below each tree in the figure we list the vertices for a preorder traversal. In addition, for
                         $1 \le n \le 3$, we find a list of n L’s and n R’s under each preorder traversal. These lists are
                         determined as follows. The first tree for n = 2, for instance, has the list L, R, L, R because,
                         after visiting the root r, we go to the left (L) subtree rooted at a and visit vertex a. Then we
                         backtrack to r and go to the right (R) subtree rooted at b. After visiting vertex b we go to
                         the left (L) subtree of b rooted at c and visit vertex c. Then, lastly, we backtrack to b and
                         go to its right (R) subtree to visit vertex d. This generates the list L, R, L, R and the other
                         seven lists of L’s and R’s are obtained in the same way.
                             Since we are traversing these trees in preorder, each list starts with an L. There is an
                         equal number of L’s and R’s in each list because the trees are complete binary trees. Finally,
                         the number of R’s never exceeds the number of L’s as a given list is read from left to right,
                         again because we have a preorder traversal. Should we replace each L by a 1 and each R
                         by a -1, for the five trees for n = 3, we find ourselves back in part (a) of Example 1.43,
                         where we have one of our early examples of the Catalan numbers. Hence, for $n \ge 0$, we
                         see that the number of complete binary trees on 2n + 1 vertices is $\frac{1}{n+1}\binom{2n}{n}$, the nth Catalan
                         number. [Note that if we prune the five trees for n = 3 by removing the four leaves for each
                         tree, we obtain the five rooted ordered binary trees in Fig. 10.18.]
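The count can be checked by brute force for small n. The Python sketch below is an illustration under assumptions of its own: a leaf is encoded as `None`, an internal vertex as a pair `(left, right)`, and the function names are hypothetical. It enumerates the complete binary trees with n internal vertices (hence 2n + 1 vertices), builds the list of L’s and R’s for each in the manner described above, and compares the number of trees with the Catalan formula.

```python
from math import comb


def complete_binary_trees(n):
    """Yield every complete binary tree with n internal vertices.

    A leaf is None; an internal vertex is a pair (left, right).
    """
    if n == 0:
        yield None
        return
    for k in range(n):
        for left in complete_binary_trees(k):
            for right in complete_binary_trees(n - 1 - k):
                yield (left, right)


def lr_list(tree):
    """Record L or R each time the preorder traversal goes down to a child."""
    if tree is None:
        return []
    left, right = tree
    return ["L"] + lr_list(left) + ["R"] + lr_list(right)


def catalan(n):
    return comb(2 * n, n) // (n + 1)


for n in range(6):
    trees = list(complete_binary_trees(n))
    assert len(trees) == catalan(n)
    print(n, len(trees))

print(sorted("".join(lr_list(t)) for t in complete_binary_trees(2)))
```

For n = 2 the two lists produced are L, L, R, R and L, R, L, R, the same two lists found under the trees for n = 2 in Fig. 12.22.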

The notion of preorder now arises in the following procedure for finding a spanning tree
                         for a connected graph.
                             Let G = (V, E) be a loop-free connected undirected graph with $r \in V$. Starting from r,
                         we construct a path in G that is as long as possible. If this path includes every vertex in V,
                         then the path is a spanning tree T for G and we are finished. If not, let x and y be the last
                         two vertices visited along this path, with y the last vertex. We then return, or backtrack, to
                         the vertex x and construct a second path in G that is as long as possible, starts at x, and
                         doesn’t include any vertex already visited. If no such path exists, backtrack to the parent
                         p of x and see how far it is possible to branch off from p, building a path (that is as long
                         as possible and has no previously visited vertices) to a new vertex $y_1$ (which will be a
                         new leaf for T). Should all edges from the vertex p lead to vertices already encountered,
                         backtrack one level higher and continue the process. Since the graph is finite and connected,
                         this technique, which is called backtracking, or depth-first search, eventually determines a
                         spanning tree T for G, where r is regarded as the root of T. Using T, we then order the
                         vertices of G in a preorder listing.
                    The depth-first search serves as a framework around which many algorithms can be
                designed to test for certain graph properties. One such algorithm will be examined in detail
                in Section 12.5.
                    One way to help implement the depth-first search in a computer program is to assign a
                fixed order to the vertices of the given graph G = (V, E). Then if there are two or more
                vertices adjacent to a vertex v and none of these vertices has already been visited, we shall
                know exactly which vertex to visit first. This order now helps us to develop the foregoing
                description of the depth-first search as an algorithm.
                    Let G = (V, E) be a loop-free connected undirected graph where |V| = n and the vertices
                are ordered as $v_1, v_2, v_3, \ldots, v_n$. To find the rooted ordered depth-first spanning tree
                for the prescribed order, we apply the following algorithm, wherein the variable v is used
                to store the vertex presently being examined.

Depth-First Search Algorithm
                     Step 1: Assign $v_1$ to the variable v and initialize T as the tree consisting of just
                     this one vertex. (The vertex $v_1$ will be the root of the spanning tree that develops.)
                     Visit $v_1$.
                     Step 2: Select the smallest subscript i, for $2 \le i \le n$, such that $\{v, v_i\} \in E$ and $v_i$
                     has not already been visited.

                     If no such subscript is found, then go to step (3). Otherwise, perform the following:
                     (1) Attach the edge $\{v, v_i\}$ to the tree T and visit $v_i$; (2) Assign $v_i$ to v; and
                     (3) Return to step (2).

                     Step 3: If $v = v_1$, the tree T is the (rooted ordered) spanning tree for the order
                     specified.
                     Step 4: For $v \ne v_1$, backtrack from v to its parent u in T. Then assign u to v and
                     return to step (2).

EXAMPLE 12.10    We now apply this algorithm to the graph G = (V, E) shown in Fig. 12.23(a). Here the
                order for the vertices is alphabetic: a, b, c, d, e, f, g, h, i, j.
                   First we assign the vertex a to the variable v and initialize T as just the vertex a (the
                root). We visit vertex a. Then, going to step (2), we find that the vertex b is the first vertex
                w such that $\{a, w\} \in E$ and w has not been visited earlier. So we attach edge {a, b} to T
                and visit b, assign b to v, and then return to step (2).
                   At v = b we find that the first vertex (not visited earlier) that provides an edge for the
                spanning tree is d. Consequently, the edge {b, d} is attached to T and d is visited, then d is
                assigned to v, and we again return to step (2).

(a)           G = (V, E)
                                       Figure 12.23

This time, however, there is no new vertex that we can obtain from d, because vertices
                         a and b have already been visited. So we go to step (3). But here the value of v is d, not a,
                         and we go to step (4). Now we backtrack from d, assigning the vertex b to v, and then we
                         return to step (2). At this time we add the edge {b, e} to T and visit e.
                             Continuing the process, we attach the edge {e, f} (and visit f) and then the edge {e, h}
                         (and visit h). But now the vertex h has been assigned to v, and we must backtrack from
                         h to e to b to a. When v is assigned the vertex a this (second) time, the new edge {a, c}
                         is obtained and vertex c is visited. Then we proceed to attach the edges {c, g}, {g, i}, and
                         {g, j} (visiting the vertices g, i, and j, respectively). At this point all of the vertices in G
                         have been visited, and we backtrack from j to g to c to a. With v = a once again we return
                         to step (2) and from there to step (3), where the process terminates.
                             The resulting tree $T = (V, E_1)$ is shown in part (b) of Fig. 12.23. Part (c) of the figure
                         shows the tree $T'$ that results for the vertex ordering: j, i, h, g, f, e, d, c, b, a.
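The four steps of the depth-first search algorithm can be followed in a short Python sketch. The function name and the data layout are assumptions made here. The edge list `EDGES` is also an assumption: it is reconstructed from the narrative of Examples 12.10 and 12.11, and the graph drawn in Fig. 12.23(a) may contain edges that cannot be recovered from the text.

```python
# Edge set reconstructed from the narrative (an assumption of this sketch).
EDGES = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("b", "e"),
    ("c", "g"), ("e", "f"), ("e", "h"), ("g", "i"), ("g", "j"),
]


def depth_first_spanning_tree(order, edges):
    """Return (visit order, tree edges) for the prescribed vertex order."""
    adjacent = {vertex: set() for vertex in order}
    for x, y in edges:
        adjacent[x].add(y)
        adjacent[y].add(x)

    root = order[0]
    v = root                          # Step 1: start at the first vertex.
    visited = [root]
    seen = {root}
    parent = {}
    tree_edges = []
    while True:
        # Step 2: first vertex in the given order that is adjacent to v
        # and has not been visited yet.
        w = next((u for u in order if u in adjacent[v] and u not in seen), None)
        if w is not None:
            tree_edges.append((v, w))
            visited.append(w)
            seen.add(w)
            parent[w] = v
            v = w
        elif v == root:               # Step 3: back at the root, finished.
            return visited, tree_edges
        else:                         # Step 4: backtrack to the parent.
            v = parent[v]


visited, tree_edges = depth_first_spanning_tree(list("abcdefghij"), EDGES)
print(visited)
print(tree_edges)
```

For the alphabetic order, the edges are attached in the order {a, b}, {b, d}, {b, e}, {e, f}, {e, h}, {a, c}, {c, g}, {g, i}, {g, j}, as in the example. Calling the function with the order `list("jihgfedcba")` gives, for this reconstructed graph, a tree of height 7.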

A second method for searching the vertices of a loop-free connected undirected graph is
                         the breadth-first search. Here we designate one vertex as the root and fan out to all vertices
                         adjacent to the root. From each child of the root we then fan out to those vertices (not
                         previously visited) that are adjacent to one of these children. As we continue this process,
                         we never list a vertex twice, so no cycle is constructed, and with G finite the process
                         eventually terminates.
                             We actually used this technique earlier in Example 11.28 of Section 11.5.
                             A certain data structure proves useful in developing an algorithm for this second searching
                         method. A queue is an ordered list wherein items are inserted at one end (called the rear) of
                         the list and deleted at the other end (called the front). The first item inserted in the queue is
                         the first item that can be taken out of it. Consequently, a queue is referred to as a “first-in,
                         first-out,” or FIFO, structure.
                             As in the depth-first search, we again assign an order to the vertices of our graph.
                             We start with a loop-free connected undirected graph G = (V, E), where |V| = n and
                         the vertices are ordered as $v_1, v_2, v_3, \ldots, v_n$. The following algorithm generates the (rooted
                         ordered) breadth-first spanning tree T of G for the given order.

Breadth-First Search Algorithm
                     Step 1: Insert vertex $v_1$ at the rear of the (initially empty) queue Q and initialize T
                     as the tree made up of this one vertex $v_1$ (the root of the final version of T). Visit $v_1$.

                     Step 2: While the queue Q is not empty, delete the vertex v from the front of Q.
                     Now examine the vertices $v_i$ (for $2 \le i \le n$) that are adjacent to v, in the specified
                     order. If $v_i$ has not been visited, perform the following: (1) Insert $v_i$ at the rear of
                     Q; (2) Attach the edge $\{v, v_i\}$ to T; and (3) Visit vertex $v_i$. [If we examine all of
                     the vertices previously in the queue Q and obtain no new edges, then the tree T
                     (generated to this point) is the (rooted ordered) spanning tree for the given order.]

EXAMPLE 12.11    We shall employ the graph of Fig. 12.23(a) with the prescribed order a, b, c, d, e, f, g, h,
                i, j to illustrate the use of the algorithm for the breadth-first search.
                    Start with vertex a. Insert a at the rear of (the presently empty) queue Q, initialize T as
                this one vertex (the root of the resulting tree), and visit vertex a.
                    In step (2) we now delete a from (the front of) Q and examine the vertices adjacent to
                a, namely, the vertices b, c, d. (These vertices have not been previously visited.) This
                results in our (i) inserting vertex b at the rear of Q, attaching the edge {a, b} to T, and
                visiting vertex b; (ii) inserting vertex c at the rear of Q (after b), attaching the edge {a, c}
                to T, and visiting vertex c; and (iii) inserting vertex d at the rear of Q (after c), attaching
                the edge {a, d} to T, and visiting vertex d.
                    Since the queue Q is not empty, we execute step (2) again. Upon deleting vertex b from
                the front of Q, we now find that the only vertex adjacent to b (that has not been previously
                visited) is e. So we insert vertex e at the rear of Q (after d), attach the edge {b, e} to T,
                and visit vertex e. Continuing with vertex c we obtain the new (unvisited) vertex g. So we
                insert vertex g at the rear of Q (after e), attach the edge {c, g} to T, and visit vertex g.
                And now we delete vertex d from the front of Q. But at this point there are no unvisited
                vertices adjacent to d, so we then delete vertex e from the front of Q. This vertex leads
                to the following: inserting vertex f at the rear of Q (after g), attaching the edge {e, f} to
                T, and visiting vertex f. This is followed by: inserting vertex h at the rear of Q (after f),
                attaching edge {e, h} to T, and visiting vertex h. Continuing with vertex g, we insert vertex
                i at the rear of Q (after h), attach edge {g, i} to T, and visit vertex i, and then we insert
                vertex j at the rear of Q (after i), attach edge {g, j} to T, and visit vertex j.
                    Once again we return to the beginning of step (2). But now when we delete (from the
                front of Q) and examine each of the vertices f, h, i, and j (in this order), we find no
                unvisited vertices for any of these four vertices. Consequently, the queue Q now remains
                empty and the tree T in Fig. 12.24(a) is the breadth-first spanning tree for G, for the order
                prescribed. (The tree $T_1$, shown in part (b) of the figure, arises for the order j, i, h, g, f, e,
                d, c, b, a.)

Figure 12.24
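A queue-based sketch of the breadth-first search follows, using Python's `collections.deque` as the FIFO structure. The function name is illustrative. The edge list is an assumption: it is reconstructed from the narrative of Examples 12.10 and 12.11, so the graph of Fig. 12.23(a) may contain edges not recoverable from the text.

```python
from collections import deque

# Edge set reconstructed from the narrative (an assumption of this sketch).
EDGES = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("b", "e"),
    ("c", "g"), ("e", "f"), ("e", "h"), ("g", "i"), ("g", "j"),
]


def breadth_first_spanning_tree(order, edges):
    """Return (visit order, tree edges) for the prescribed vertex order."""
    adjacent = {vertex: set() for vertex in order}
    for x, y in edges:
        adjacent[x].add(y)
        adjacent[y].add(x)

    root = order[0]
    queue = deque([root])             # Step 1: the root enters the queue.
    visited = [root]
    seen = {root}
    tree_edges = []
    while queue:                      # Step 2: repeat until the queue is empty.
        v = queue.popleft()           # Delete v from the front of the queue.
        for w in order:               # Examine neighbors in the given order.
            if w in adjacent[v] and w not in seen:
                queue.append(w)       # Insert w at the rear of the queue.
                tree_edges.append((v, w))
                visited.append(w)
                seen.add(w)
    return visited, tree_edges


visited, tree_edges = breadth_first_spanning_tree(list("abcdefghij"), EDGES)
print(visited)
print(tree_edges)
```

The vertices leave the queue in the order a, b, c, d, e, g, f, h, i, j, and the edges attached match those listed in the example: {a, b}, {a, c}, {a, d}, {b, e}, {c, g}, {e, f}, {e, h}, {g, i}, {g, j}.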

Let us apply these ideas on graph searching to one more example.

EXAMPLE 12.12    Let G = (V, E) be an undirected graph (with loops) where the vertices are ordered as
                 $v_1, v_2, \ldots, v_7$. If Fig. 12.25(a) is the adjacency matrix A(G) for G, how can we use this
                 representation of G to determine whether G is connected, without drawing the graph?

$$
A(G) =
\begin{array}{c|ccccccc}
    & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & v_7 \\ \hline
v_1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 \\
v_2 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
v_3 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
v_4 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
v_5 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
v_6 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
v_7 & 1 & 0 & 0 & 1 & 0 & 0 & 0
\end{array}
$$

Figure 12.25   (a) adjacency matrix A(G); (b) breadth-first search; (c) depth-first search

Using $v_1$ as the root, in part (b) of the figure we search the graph by means of its adjacency
                                 matrix, using a breadth-first search. [Here we ignore the loops by ignoring any 1’s on the
                                 main diagonal (extending from the upper left to the lower right).] First we visit the vertices
                                 adjacent to $v_1$, listing them in ascending order according to the subscripts on the v’s in A(G).
                                 The search continues, and as all vertices in G are reached, G is shown to be connected.
                                    The same conclusion follows from the depth-first search in part (c). The tree here also
                                 has $v_1$ as its root. As the tree branches out to search the graph, it does so by listing the first
                                 vertex found adjacent to $v_1$ according to the row in A(G) for $v_1$. Likewise, from $v_2$ the
                                 first new vertex in this search is found from A(G) to be $v_3$. The vertex $v_3$ is a leaf in this
                                 tree because no new vertex can be visited from $v_3$. As we backtrack to $v_2$, row 2 of A(G)
                                 indicates that $v_4$ can now be visited from $v_2$. As this process continues, the connectedness
                                 of G follows from part (c) of the figure.
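The same test can be carried out mechanically. The Python sketch below is a minimal illustration with function names chosen here. It runs a breadth-first search directly on the rows of the adjacency matrix of Fig. 12.25(a), skipping the main diagonal so that loops are ignored, and reports G as connected when every vertex is reached. Indices 0 through 6 stand for $v_1$ through $v_7$.

```python
from collections import deque

# Adjacency matrix A(G) of Fig. 12.25(a); row and column k stand for v_(k+1).
A = [
    [0, 1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
]


def bfs_order(matrix, root=0):
    """Return the vertices reached from root, in breadth-first order."""
    n = len(matrix)
    seen = {root}
    order = [root]
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j in range(n):
            # Skip the main diagonal so that loops are ignored.
            if j != i and matrix[i][j] == 1 and j not in seen:
                seen.add(j)
                order.append(j)
                queue.append(j)
    return order


def is_connected(matrix):
    """A graph is connected when the search from one vertex reaches all of them."""
    return len(bfs_order(matrix)) == len(matrix)


print(["v%d" % (k + 1) for k in bfs_order(A)])
print(is_connected(A))  # True
```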

It is time now to return to our main discussion on rooted trees. The following definition
                                 generalizes the ideas that were introduced for Example 12.5.

Definition 12.5           Let T = (V, E) be a rooted tree, and let $m \in \mathbf{Z}^+$.
                                    We call T an m-ary tree if $\mathrm{od}(v) \le m$ for all $v \in V$. When m = 2, the tree is called a
                                 binary tree.
                                    If od(v) = 0 or m, for all $v \in V$, then T is called a complete m-ary tree. The special case
                                 of m = 2 results in a complete binary tree.

In a complete m-ary tree, each internal vertex has exactly m children. (Each leaf of this
                     tree still has no children.)
                         Some properties of these trees are considered in the following theorem.

THEOREM 12.6         Let T = (V, E) be a complete m-ary tree with |V| = n. If T has $\ell$ leaves and i internal
                     vertices, then (a) $n = mi + 1$; (b) $\ell = (m - 1)i + 1$; and
                     (c) $i = (\ell - 1)/(m - 1) = (n - 1)/m$.
                     Proof: This proof is left for the Section Exercises.

EXAMPLE 12.13        The Wimbledon tennis championship is a single-elimination tournament wherein a player
                     (or doubles team) is eliminated after a single loss. If 27 women compete in the singles
                     championship, how many matches must be played to determine the number-one female
                     player?
                         Consider the tree shown in Fig. 12.26. With 27 women competing, there are 27 leaves in
                     this complete binary tree, so from Theorem 12.6(c) the number of internal vertices (which
                     is the number of matches) is $i = (\ell - 1)/(m - 1) = (27 - 1)/(2 - 1) = 26$.

Figure 12.26 (levels labeled: the champion, the semifinals, the quarterfinals)

EXAMPLE 12.14        A classroom contains 25 microcomputers that must be connected to a wall socket that has
                     four outlets. Connections are made by using extension cords that have four outlets each.
                     What is the least number of cords needed to get these computers set up for class use?
                         The wall socket is considered the root of a complete m-ary tree for m = 4. The
                     microcomputers are the leaves of this tree, so ℓ = 25. Each internal vertex, except the root,
                     corresponds with an extension cord. So by part (c) of Theorem 12.6, there are (ℓ − 1)/(m − 1) =
                     (25 − 1)/(4 − 1) = 8 internal vertices. Hence we need 8 − 1 (where the 1 is subtracted for
                     the root) = 7 extension cords.
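
The counting in Examples 12.13 and 12.14 can be checked with a short Python sketch of Theorem 12.6. The function name complete_mary_counts is hypothetical (it is not from the text); it takes m and the number of leaves, and returns the number of internal vertices together with the total number of vertices.

```python
def complete_mary_counts(m, leaves):
    """Return (internal, total) for a complete m-ary tree with the given number of leaves.

    Uses Theorem 12.6: i = (l - 1)/(m - 1) and n = m*i + 1.
    Raises ValueError when no complete m-ary tree has that many leaves.
    """
    if m < 2 or leaves < 1:
        raise ValueError("need m >= 2 and at least one leaf")
    internal, remainder = divmod(leaves - 1, m - 1)
    if remainder != 0:
        raise ValueError("no complete m-ary tree has this many leaves")
    return internal, m * internal + 1


# Example 12.13: 27 players, m = 2; every internal vertex is a match.
matches, _ = complete_mary_counts(2, 27)
print(matches)        # 26

# Example 12.14: 25 computers, m = 4; the root is the wall socket, not a cord.
internal, _ = complete_mary_counts(4, 25)
print(internal - 1)   # 7
```

The divisibility test reflects part (b) of the theorem: ℓ − 1 must be a multiple of m − 1 for a complete m-ary tree with ℓ leaves to exist.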

Definition 12.6   If T = (V, E) is a rooted tree and h is the largest level number achieved by a leaf of T,
                     then T is said to have height h. A rooted tree T of height h is said to be balanced if the
                     level number of every leaf in T is h − 1 or h.

The rooted tree shown in Fig. 12.19 is a balanced tree of height 3. Tree T′ in Fig. 12.23(c)
                               has height 7 but is not balanced. (Why?)
                                  The tree for the tournament in Example 12.13 must be balanced so that the tournament
                               will be as fair as possible. If it is not balanced, some competitor will receive more than one
                               bye (an opportunity to advance without playing a match).
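
Definition 12.6 translates directly into a short check. The following Python sketch assumes a rooted tree is stored as a dictionary mapping each internal vertex to the list of its children (a representation chosen here for illustration), and that the root has level number 0, which agrees with the base case h = 1 used in the proof of Theorem 12.7. The two small trees are hypothetical and do not correspond to any figure in the text.

```python
def leaf_levels(children, root):
    """Return the level number of every leaf, with the root at level 0."""
    levels = []
    stack = [(root, 0)]
    while stack:
        vertex, level = stack.pop()
        kids = children.get(vertex, [])
        if not kids:                      # a vertex with no children is a leaf
            levels.append(level)
        for kid in kids:
            stack.append((kid, level + 1))
    return levels


def height(children, root):
    """Largest level number achieved by a leaf."""
    return max(leaf_levels(children, root))


def is_balanced(children, root):
    """True when every leaf is at level h - 1 or h, where h is the height."""
    levels = leaf_levels(children, root)
    h = max(levels)
    return all(level in (h - 1, h) for level in levels)


balanced = {"r": ["a", "b"], "a": ["c", "d"]}
lopsided = {"r": ["a", "b"], "a": ["c", "d"], "c": ["e", "f"]}
print(height(balanced, "r"), is_balanced(balanced, "r"))   # 2 True
print(height(lopsided, "r"), is_balanced(lopsided, "r"))   # 3 False
```

In the second tree the leaf b sits at level 1 while the height is 3, so the tree fails the balance condition.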

Before stating our next theorem, let us recall that for all x ∈ R, ⌊x⌋ denotes the greatest
                               integer in x, or floor of x, whereas ⌈x⌉ designates the ceiling of x.

THEOREM 12.7                   Let T = (V, E) be a complete m-ary tree of height h with ℓ leaves. Then ℓ ≤ m^h and
                               h ≥ ⌈log_m ℓ⌉.
                               Proof: The proof that ℓ ≤ m^h will be established by induction on h. When h = 1, T is a tree
                               with a root and m children. In this case ℓ = m = m^h, and the result is true. Assume the result
                               true for all trees of height < h, and consider a tree T with height h and ℓ leaves. (The level
                               numbers that are possible for these leaves are 1, 2, ..., h, with at least m of the leaves at
                               level h.) The ℓ leaves of T are also the ℓ leaves (total) for the m subtrees T_i, 1 ≤ i ≤ m, of
                               T rooted at each of the children of the root. For 1 ≤ i ≤ m, let ℓ_i be the number of leaves in
                               subtree T_i. (In the case where leaf and root coincide, ℓ_i = 1. But since m ≥ 1 and h − 1 ≥ 0,
                               we have m^{h−1} ≥ 1 = ℓ_i.) By the induction hypothesis, ℓ_i ≤ m^{h(T_i)} ≤ m^{h−1}, where h(T_i)
                               denotes the height of the subtree T_i, and so ℓ = ℓ_1 + ℓ_2 + ··· + ℓ_m ≤ m(m^{h−1}) = m^h.
                                  With ℓ ≤ m^h, we find that log_m ℓ ≤ log_m(m^h) = h, and since h ∈ Z⁺, it follows that
                               h ≥ ⌈log_m ℓ⌉.

COROLLARY 12.1                 Let T be a balanced complete m-ary tree with ℓ leaves. Then the height of T is ⌈log_m ℓ⌉.
                               Proof: This proof is left as an exercise.
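
The bound h ≥ ⌈log_m ℓ⌉ is easy to evaluate without floating-point logarithms. The following Python sketch (the name ceil_log is hypothetical) finds the smallest h with m^h ≥ ℓ by repeated multiplication, which is exactly ⌈log_m ℓ⌉ for ℓ ≥ 1.

```python
def ceil_log(m, leaves):
    """Smallest h >= 0 with m**h >= leaves, that is, ceil(log_m(leaves))."""
    if m < 2 or leaves < 1:
        raise ValueError("need m >= 2 and leaves >= 1")
    h, power = 0, 1
    while power < leaves:
        power *= m
        h += 1
    return h


print(ceil_log(2, 27))   # 5: height of a balanced tournament tree for 27 players
print(ceil_log(4, 25))   # 3
print(ceil_log(3, 9))    # 2
```

Working with integers avoids rounding trouble at exact powers such as log_3 9, where a floating-point quotient of logarithms could land slightly above or below 2.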

We close this section with an application that uses a complete ternary (m = 3) tree.

EXAMPLE 12.15                  Decision Trees. There are eight coins (identical in appearance) and a pan balance. If exactly
                               one of these coins is counterfeit and heavier than the other seven, find the counterfeit coin.
                                  Let the coins be labeled 1, 2, 3, ..., 8. In using the pan balance to compare sets of coins
                               there are three outcomes to consider: (a) the two sides balance to indicate that the coins in
                               the two pans are not counterfeit; (b) the left pan of the balance goes down, indicating that
                               the counterfeit coin is in the left pan; or (c) the right pan goes down, indicating that it holds
                               the counterfeit coin.
                                   In Fig. 12.27(a), we search for the counterfeit coin by first balancing coins 1, 2, 3, 4
                               against 5, 6, 7, 8. If the balance tips to the right, we follow the right branch from the root to
                               then analyze coins 5, 6 against 7, 8. If the balance tips to the left, we test coins 1, 2 against
                               3, 4. At each successive level, we have half as many coins to test, so at level 3 (after three
                               weighings) the heavier counterfeit coin has been identified.
                                  The tree in part (b) of the figure finds the heavier coin in two weighings. The first weighing
                               balances coins 1, 2, 3 against 6, 7, 8. Three possible outcomes can occur: (i) the balance tips
                               to the right, indicating that the heavier coin is 6, 7, or 8, and we follow the right branch from
                               the root; (ii) the balance tips to the left and we follow the left branch to find which of 1, 2,
                               3 is the heavier; or (iii) the pans balance and we follow the center branch to find which of
                               4, 5 is heavier. At each internal vertex the label indicates which coins are being compared.

[Figure 12.27: (a) Binary decision tree (Height = 3), whose root compares coins 1, 2, 3, 4 with 5, 6, 7, 8; (b) Ternary decision tree (Height = 2), whose root compares coins 1, 2, 3 with 6, 7, 8.]

Unlike part (a), a conclusion may be deduced in part (b) when a coin is not included in a
                                 weighing. Finally, when comparing coins 4 and 5, because equality cannot take place we
                                 label the center leaf with ∅.
                                    In this particular problem, we claim that the height of the complete ternary tree used must
                                 be at least 2. With eight coins involved, the tree will have at least eight leaves. Consequently,
                                 with ℓ ≥ 8, it follows from Theorem 12.7 that h ≥ ⌈log_3 ℓ⌉ ≥ ⌈log_3 8⌉ = 2, so at least two
                                 weighings are needed. If n coins are involved, the complete ternary tree will have ℓ leaves
                                 where ℓ ≥ n, and its height h satisfies h ≥ ⌈log_3 n⌉.
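
The ternary strategy can be simulated. The following Python sketch is one possible weighing scheme, not the exact tree of Fig. 12.27(b): it weighs the first third of the remaining coins against the second third and sets the rest aside, whereas the figure weighs coins 1, 2, 3 against 6, 7, 8. The function name find_heavy and the weights used are assumptions made for illustration.

```python
def find_heavy(coins, weight):
    """Locate the single heavier coin; return (coin, number_of_weighings).

    coins is a list of labels and weight maps each label to its weight.
    Each weighing compares two groups of equal size and sets a third aside.
    """
    candidates = list(coins)
    weighings = 0
    while len(candidates) > 1:
        size = (len(candidates) + 2) // 3            # ceiling of len/3
        left = candidates[:size]
        right = candidates[size:2 * size]
        aside = candidates[2 * size:]
        weighings += 1
        left_total = sum(weight[c] for c in left)
        right_total = sum(weight[c] for c in right)
        if left_total > right_total:                 # left pan goes down
            candidates = left
        elif right_total > left_total:               # right pan goes down
            candidates = right
        else:                                        # pans balance
            candidates = aside
    return candidates[0], weighings


coins = list(range(1, 9))
weights = {c: 10 for c in coins}
weights[5] = 11                                      # coin 5 is the counterfeit
print(find_heavy(coins, weights))                    # (5, 2)
```

With eight coins this scheme always finishes in two weighings, matching the lower bound ⌈log_3 8⌉ = 2 derived above.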

1. Answer the following questions for the tree shown in Fig. 12.28.

[Figure 12.28]

    a) Which vertices are the leaves?
    b) Which vertex is the root?
    c) Which vertex is the parent of g?
    d) Which vertices are the descendants of c?
    e) Which vertices are the siblings of s?
    f) What is the level number of vertex f?
    g) Which vertices have level number 4?

2. Let T = (V, E) be a binary tree. In Fig. 12.29 we find the subtree of T rooted at vertex p. (The
dashed line coming into vertex p indicates that there is more to the tree T than what appears in
the figure.) If the level number for vertex u is 37, (a) what are the level numbers for vertices
p, s, t, v, w, x, y, and z? (b) how many ancestors does vertex u have? (c) how many ancestors
does vertex y have?

[Figure 12.29]

3. a) Write the expression (w + x − y)/(a * z^3) in Polish notation, using a rooted tree.

    b) What is the value of the expression (in Polish notation)
    / ↑ a − b c + d ∗ e f, if a = c = d = e = 2, b = f = 4?

4. Let T = (V, E) be a rooted tree ordered by a universal address system. (a) If vertex v in T has
address 2.1.3.6, what is the smallest number of siblings that v must have? (b) For the vertex v in
part (a), find the address of its parent. (c) How many ancestors does the vertex v in part (a)
have? (d) With the presence of v in T, what other addresses must there be in the system?

5. For the tree shown in Fig. 12.30, list the vertices according to a preorder traversal, an
inorder traversal, and a postorder traversal.

[Figure 12.30]

6. List the vertices in the tree shown in Fig. 12.31 when they are visited in a preorder traversal
and in a postorder traversal.

[Figure 12.31]

7. a) Find the depth-first spanning tree for the graph shown in Fig. 11.72(a) if the order of the
    vertices is given as (i) a, b, c, d, e, f, g, h; (ii) h, g, f, e, d, c, b, a;
    (iii) a, b, c, d, h, g, f, e.
    b) Repeat part (a) for the graph shown in Fig. 11.85(i).

8. Find the breadth-first spanning trees for the graphs and prescribed orders given in Exercise 7.

9. Let G = (V, E) be an undirected graph with adjacency matrix A(G) as shown here.

              v1  v2  v3  v4  v5  v6  v7  v8
        v1 [   0   1   0   0   0   0   1   0 ]
        v2 [   1   1   0   1   1   0   1   0 ]
        v3 [   0   0   0   1   0   1   0   1 ]
        v4 [   0   1   1   0   0   0   0   0 ]
        v5 [   0   1   0   0   0   0   1   0 ]
        v6 [   0   0   1   0   0   1   0   0 ]
        v7 [   1   1   0   0   1   0   0   0 ]
        v8 [   0   0   1   0   0   0   0   0 ]

Use a breadth-first search based on A(G) to determine whether G is connected.

10. a) Let T = (V, E) be a binary tree. If |V| = n, what is the maximum height that T can attain?
    b) If T = (V, E) is a complete binary tree and |V| = n, what is the maximum height that T can
    reach in this case?

11. Prove Theorem 12.6 and Corollary 12.1.

12. With m, n, i, ℓ as in Theorem 12.6, prove that
    a) n = (mℓ − 1)/(m − 1).    b) ℓ = [(m − 1)n + 1]/m.

13. a) A complete ternary (or 3-ary) tree T = (V, E) has 34 internal vertices. How many edges does
    T have? How many leaves?
    b) How many internal vertices does a complete 5-ary tree with 817 leaves have?

14. The complete binary tree T = (V, E) has V = {a, b, c, ..., i, j, k}. The postorder listing of V
yields d, e, b, h, i, f, j, k, g, c, a. From this information draw T if (a) the height of T is 3;
(b) the height of the left subtree of T is 3.

15. For m ≥ 3, a complete m-ary tree can be transformed into a complete binary tree by applying the
idea shown in Fig. 12.32.
    a) Use this technique to transform the complete ternary decision tree shown in Fig. 12.27(b).
    b) If T is a complete quaternary tree of height 3, what is the maximum height that T can have
    after it is transformed into a complete binary tree? What is the minimum height?
    c) Answer part (b) if T is a complete m-ary tree of height h.

[Figure 12.32]

16. a) At a men's singles tennis tournament, each of 25 players brings a can of tennis balls. When
    a match is played, one can of balls is opened and used, then kept by the loser. The winner
    takes the unopened can on to his next match. How many cans of tennis balls will be opened
    during this tournament? How many matches are played in the tournament?
    b) In how many matches did the tournament champion play?

17. What is the maximum number of internal vertices that a complete quaternary tree of height 8 can
have? What is the number for a complete m-ary tree of height h?

18. On the first Sunday of 2003 Rizzo and Frenchie start a chain letter, each of them sending five
letters (to ten different friends between them). Each person receiving the letter is to send five
copies to five new people on the Sunday following the letter's arrival. After the first seven
Sundays have passed, what is the total number of chain letters that have been mailed? How many
were mailed on the last three Sundays?

19. Use a complete ternary decision tree to repeat Example 12.15 for a set of 12 coins, exactly one
of which is heavier (and counterfeit).

20. Let T = (V, E) be a balanced complete m-ary tree of height h ≥ 2. If T has ℓ leaves and b_{h−1}
internal vertices at level h − 1, explain why ℓ = m^{h−1} + (m − 1)b_{h−1}.

21. Consider the complete binary trees on 31 vertices. (Here we distinguish left from right as in
Example 12.9.) How many of these trees have 11 vertices in the left subtree of the root? How many
have 21 vertices in the right subtree of the root?

22. For n ≥ 0, let a_n count the number of complete binary trees on 2n + 1 vertices. (Here we
distinguish left from right as in Example 12.9.) How is a_{n+1} related to a_0, a_1, a_2, ...,
a_{n−1}, a_n?

23. Consider the following algorithm where the input is a rooted tree with root r.
    Step 1:   Push r onto the (empty) stack
    Step 2:   While the stack is not empty
                  Pop the vertex at the top of the stack and record its label
                  Push the children of this vertex, going from right to left, onto the stack
(The stack data structure was explained in Example 10.43.)
What is the output when this algorithm is applied to (a) the tree in Fig. 12.19? (b) any rooted
tree?

24. Consider the following algorithm where the input is a rooted tree with root r.
    Step 1:   Push r onto the (empty) stack
    Step 2:   While the stack is not empty
                  If the entry at the top of the stack is not marked
                      Then mark it and push its children, right to left, onto the stack
                  Else
                      Pop the vertex at the top of the stack and record its label
What is the output when the algorithm is applied to (a) the tree in Fig. 12.19? (b) any rooted
tree?

12.3
                   Trees and Sorting
                                In Example 10.5, the bubble sort was introduced. There we found that the number of
                                comparisons needed to sort a list of n items is n(n − 1)/2. Consequently, this algorithm
                                determines a function h: Z⁺ → R defined by h(n) = n(n − 1)/2. This is the (worst-case)
                                time-complexity function for the algorithm, and we often express this by writing h ∈ O(n^2).
                                Consequently, the bubble sort is said to require O(n^2) comparisons. We interpret this to
                                mean that for large n, the number of comparisons is bounded above by cn^2, where c is a
                                constant that is generally not specified because it depends on such factors as the compiler
                                and the computer that are used.
                                    In this section we shall study a second method for sorting a given list of n items into
                                ascending order. The method is called the merge sort, and we shall find that the order of
                                its worst-case time-complexity function is O(n log_2 n). This will be accomplished in the
                                following manner:
                                    1) First we shall measure the number of comparisons needed when n is a power of 2.
                                       Our method will employ a pair of balanced complete binary trees.

2) Then we shall cover the case for general n by using the optional material on divide-
                                           and-conquer algorithms in Section 10.6.

For the case where n is an arbitrary positive integer, we start by considering the following
                                     procedure.
                                         Given a list of n items to sort into ascending order, the merge sort recursively splits the
                                     given list and all subsequent sublists in half (or as close as possible to half) until each sublist
                                     contains a single element. Then the procedure merges these sublists in ascending order until
                                     the original items have been so sorted. The splitting and merging processes can best be
                                     described by a pair of balanced complete binary trees, as in the next example.

EXAMPLE 12.16                        Merge Sort. Using the merge sort, Fig. 12.33 sorts the list 6, 2, 7, 3, 4, 9, 5, 1, 8. The tree
                                     at the top of the figure shows how the process first splits the given list into sublists of size
                                     1. The merging process is then outlined by the tree at the bottom of the figure.

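Splitting tree (top), level by level:
    6, 2, 7, 3, 4, 9, 5, 1, 8
    6, 2, 7, 3, 4  |  9, 5, 1, 8
    6, 2, 7  |  3, 4  |  9, 5  |  1, 8
    6, 2  |  7  |  3  |  4  |  9  |  5  |  1  |  8
    6  |  2

Merging tree (bottom), level by level:
    6  |  2
    2, 6  |  7  |  3  |  4  |  9  |  5  |  1  |  8
    2, 6, 7  |  3, 4  |  5, 9  |  1, 8
    2, 3, 4, 6, 7  |  1, 5, 8, 9
    1, 2, 3, 4, 5, 6, 7, 8, 9

Figure 12.33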

To compare the merge sort to the bubble sort, we want to determine its (worst-case)
                                     time-complexity function. The following lemma will be needed for this task.

LEMMA 12.1                           Let L_1 and L_2 be two sorted lists of ascending numbers, where L_i contains n_i elements, for
                                     i = 1, 2. Then L_1 and L_2 can be merged into one ascending list L using at most n_1 + n_2 − 1
                                     comparisons.
                                     Proof: To merge L_1, L_2 into list L, we perform the following algorithm.

     Step 1: Set L equal to the empty list ∅.
     Step 2: Compare the first elements in L_1, L_2. Remove the smaller of the two from
     the list it is in and place it at the end of L.
     Step 3: For the present lists L_1, L_2 [one change is made in one of these lists each
     time step (2) is executed], there are two considerations.
         a) If either of L_1, L_2 is empty, then the other list is concatenated to the end
             of L. This completes the merging process.
         b) If not, return to step (2).

Each comparison of a number from L_1 with one from L_2 results in the placement of an
element at the end of list L, so there cannot be more than n_1 + n_2 comparisons. When one
of the lists L_1 or L_2 becomes empty no further comparisons are needed, so the maximum
number of comparisons needed is n_1 + n_2 − 1.
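
The merge procedure in this proof can be sketched in Python as follows. This is an illustrative version, not the text's own code: it advances two indices instead of physically removing elements from the front of each list, and it returns the number of comparisons so that the bound n_1 + n_2 − 1 can be observed. Ties are resolved by taking the element from the first list, a choice the lemma leaves open.

```python
def merge(list1, list2):
    """Merge two ascending lists; return (merged_list, comparisons_used)."""
    merged = []
    i = j = comparisons = 0
    # Step 2, repeated: compare the current first elements of both lists.
    while i < len(list1) and j < len(list2):
        comparisons += 1
        if list1[i] <= list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    # Step 3(a): one list is exhausted, so concatenate what is left of the other.
    merged.extend(list1[i:])
    merged.extend(list2[j:])
    return merged, comparisons


print(merge([1, 3, 5], [2, 4, 6]))              # ([1, 2, 3, 4, 5, 6], 5)
print(merge([2, 3, 4, 6, 7], [1, 5, 8, 9]))     # ([1, 2, 3, 4, 5, 6, 7, 8, 9], 7)
```

The first call attains the maximum 3 + 3 − 1 = 5 because the two lists interleave perfectly; the second is the final merge of Fig. 12.33 and uses 7 comparisons, one fewer than the bound 5 + 4 − 1 = 8.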

To determine the (worst-case) time-complexity function of the merge sort, consider a
list of n elements. For the moment, we do not treat the general problem, assuming here
that n = 2^h.* In the splitting process, the list of 2^h elements is first split into two sublists of
size 2^{h−1}. (These are the level 1 vertices in the tree representing the splitting process.) As
the process continues, each successive list of size 2^{h−k}, h > k, is at level k and splits into
two sublists of size (1/2)(2^{h−k}) = 2^{h−k−1}. At level h the sublists each contain 2^{h−h} = 1
element.
   Reversing the process, we first merge the n = 2^h leaves into 2^{h−1} ordered sublists of
size 2. These sublists are at level h − 1 and require (1/2)(2^h) = 2^{h−1} comparisons (one per
pair). As this merging process continues, at each of the 2^k vertices at level k, 1 ≤ k < h,
there is a sublist of size 2^{h−k}, obtained from merging the two sublists of size 2^{h−k−1} at
its children (on level k + 1). From Lemma 12.1, this merging requires at most 2^{h−k−1} +
2^{h−k−1} − 1 = 2^{h−k} − 1 comparisons. When the children of the root are reached, there are
two sublists of size 2^{h−1} (at level 1). To merge these sublists into the final list requires at
most 2^{h−1} + 2^{h−1} − 1 = 2^h − 1 comparisons.
   Consequently, for 1 ≤ k ≤ h, at level k there are 2^{k−1} pairs of vertices. At each of these
vertices is a sublist of size 2^{h−k}, so it takes at most 2^{h−k+1} − 1 comparisons to merge each
pair of sublists. With 2^{k−1} pairs of vertices at level k, the total number of comparisons at
level k is at most 2^{k−1}(2^{h−k+1} − 1). When we sum over all levels k, where 1 ≤ k ≤ h, we
find that the total number of comparisons is at most

\[ \sum_{k=1}^{h} 2^{k-1}\left(2^{h-k+1} - 1\right) = \sum_{k=0}^{h-1} 2^{k}\left(2^{h-k} - 1\right) = \sum_{k=0}^{h-1} 2^{h} - \sum_{k=0}^{h-1} 2^{k} = h \cdot 2^{h} - \left(2^{h} - 1\right). \]

With n = 2^h, we have h = log_2 n and

\[ h \cdot 2^{h} - \left(2^{h} - 1\right) = n \log_2 n - (n - 1) = n \log_2 n - n + 1, \]

where n log_2 n is the dominating term for large n. Thus the (worst-case) time-complexity
function for this sorting procedure is g(n) = n log_2 n − n + 1 and g ∈ O(n log_2 n), for
n = 2^h, h ∈ Z⁺. Hence the number of comparisons needed to merge sort a list of n items
is bounded above by dn log_2 n for some constant d, and for all n ≥ n_0, where n_0 is some
particular (large) positive integer.

*The result obtained here for n = 2^h, h ∈ N, is actually true for all n ∈ Z⁺. However, the derivation for general
n requires the optional material in Section 10.6. That is why this counting argument is included here, for the
benefit of those readers who did not cover Section 10.6.
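
As a numerical sanity check on the level-by-level count (not a substitute for the derivation), the following Python sketch compares the sum over levels with the closed form n log_2 n − n + 1 for n = 2^h. The function names are hypothetical.

```python
def level_sum(h):
    """Sum over levels k = 1, ..., h of 2^(k-1) * (2^(h-k+1) - 1)."""
    return sum(2 ** (k - 1) * (2 ** (h - k + 1) - 1) for k in range(1, h + 1))


def closed_form(h):
    """n*log2(n) - n + 1 for n = 2^h, written with integers only."""
    n = 2 ** h
    return n * h - n + 1


for h in range(1, 11):
    assert level_sum(h) == closed_form(h)

print(level_sum(2), closed_form(2))   # 5 5
```

For h = 2 (a list of four items) both expressions give 5: two comparisons at level 2 and at most three for the final merge.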

To show that the order of the merge sort is O(n log_2 n) for all n ∈ Z⁺, our second
                         approach will use the result of Exercise 9 from Section 10.6. We state that now:
                            Let a, b, c ∈ Z⁺, with b ≥ 2. If g: Z⁺ → R⁺ ∪ {0} is a monotone increasing function,
                         where

                                                 g(1) ≤ c,
                                                 g(n) ≤ a g(n/b) + cn,           for n = b^h, h ∈ Z⁺,

then for the case where a = b, we have g ∈ O(n log n), for all n ∈ Z⁺. (The base for the
                         log function may be any real number greater than 1. Here we shall use the base 2.)
                             Before we can apply this result to the merge sort, we wish to formulate this sorting
                         process (illustrated in Fig. 12.33) as a precise algorithm. To do so, we call the procedure
                         outlined in Lemma 12.1 the “merge” algorithm. Then we shall write “merge (L_1, L_2)” in
                         order to represent the application of that procedure to the lists L_1, L_2, which are in ascending
                         order.
                             The algorithm for merge sort is a recursive procedure because it may invoke itself. Here
                         the input is an array (called List) of n items, such as real numbers.

The MergeSort Algorithm
                              Step 1: If n = 1, then List is already sorted and the process terminates. If n > 1, then
                              go to step (2).
                              Step 2: (Divide the array and sort the subarrays.) Perform the following:
                                       1) Assign m the value [n/2].
                                       2) Assign to List 1 the subarray
                                                                List[1], List[2], ..., List[m].
                                       3) Assign to List 2 the subarray
                                                          List[m + 1], List[m + 2], ..., List[n].
                                       4) Apply MergeSort to List 1 (of size m) and to List 2 (of size n − m).
                              Step 3: Merge (List 1, List 2).
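
The three steps above can be sketched in Python as follows. Two assumptions are made here and should be read as such: the split point m is taken to be the ceiling of n/2, chosen because it reproduces the 5-and-4 split shown in Fig. 12.33, and the function returns a new sorted list together with the number of comparisons rather than sorting the array in place.

```python
def merge(list1, list2):
    """Merge two ascending lists (Lemma 12.1); return (merged, comparisons)."""
    merged, i, j, comparisons = [], 0, 0, 0
    while i < len(list1) and j < len(list2):
        comparisons += 1
        if list1[i] <= list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    return merged + list1[i:] + list2[j:], comparisons


def merge_sort(items):
    """Return (sorted_list, total_comparisons) for the given list."""
    n = len(items)
    if n <= 1:                       # Step 1: a list of size 1 is already sorted
        return list(items), 0
    m = (n + 1) // 2                 # Step 2: assumed split point, ceiling of n/2
    first, c1 = merge_sort(items[:m])
    second, c2 = merge_sort(items[m:])
    merged, c3 = merge(first, second)    # Step 3
    return merged, c1 + c2 + c3


result, comparisons = merge_sort([6, 2, 7, 3, 4, 9, 5, 1, 8])
print(result)                        # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The empty list is accepted as an extra base case for convenience; the algorithm in the text assumes n ≥ 1.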

The function g: Z⁺ → R⁺ ∪ {0} will measure the (worst-case) time-complexity for this
                         algorithm by counting the maximum number of comparisons needed to merge sort an array
                         of n items. For n = 2^h, h ∈ Z⁺, we have

g(n) = 2g(n/2) + [(n/2) + (n/2) − 1].

The term 2g(n/2) results from step (2) of the MergeSort algorithm, and the summand
                         [(n/2) + (n/2) − 1] follows from step (3) of the algorithm and Lemma 12.1.

With g(1) = 0, the preceding equation provides the inequalities

                                                        g(1) = 0 ≤ 1,
                                                        g(n) = 2g(n/2) + (n − 1) ≤ 2g(n/2) + n,                for n = 2^h, h ∈ Z⁺.

We also observe that g(1) = 0, g(2) = 1, g(3) = 3, and g(4) = 5, so g(1) ≤ g(2) ≤
                                      g(3) ≤ g(4). Consequently, it appears that g may be a monotone increasing function. The
                                      proof that it is monotone increasing is similar to that given for the time-complexity function
                                      of binary search. This follows Example 10.49 in Section 10.6, so we leave the details
                                      showing that g is monotone increasing to the Section Exercises.
                                         Now with a = b = 2 and c = 1, the result stated earlier implies that g ∈ O(n log_2 n) for
                                      all n ∈ Z⁺.
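
The values of g quoted above can be reproduced with a short Python sketch. It assumes that, for general n, the two subarrays have sizes ⌈n/2⌉ and ⌊n/2⌋, so that g(n) = g(⌈n/2⌉) + g(⌊n/2⌋) + (n − 1); the text states the recurrence only for n = 2^h, so this extension is an assumption made for illustration. The loop gives numerical evidence that g is monotone increasing, which is not a proof.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def g(n):
    """Worst-case comparison count, assuming a split into ceil(n/2) and floor(n/2)."""
    if n == 1:
        return 0
    return g((n + 1) // 2) + g(n // 2) + (n - 1)


values = [g(n) for n in range(1, 1025)]
print(values[:4])                                  # [0, 1, 3, 5]

# Numerical evidence (not a proof) that g is monotone increasing.
assert all(a <= b for a, b in zip(values, values[1:]))

# At powers of 2 the recurrence agrees with n*log2(n) - n + 1.
assert all(g(2 ** h) == 2 ** h * h - 2 ** h + 1 for h in range(1, 11))
```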

Although n log_2 n ≤ n^2 for all n ∈ Z⁺, it does not follow that because the bubble sort is
                                      O(n^2) and the merge sort is O(n log_2 n), the merge sort is more efficient than the bubble sort
                                      for all n ∈ Z⁺. The bubble sort requires less programming effort and generally takes less
                                      time than the merge sort for small values of n (depending on factors such as the programming
                                      language, the compiler, and the computer). However, as n increases, the ratio of the worst-
                                      case running times, as measured by (cn^2)/(dn log_2 n) = (c/d)(n/log_2 n), gets arbitrarily
                                      large. Consequently, as the input list increases in size, the O(n^2) algorithm (bubble sort)
                                      takes significantly more time than the O(n log_2 n) algorithm (merge sort).
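
To see how quickly the gap opens, the following Python sketch tabulates the two comparison counts derived in this section, n(n − 1)/2 for the bubble sort and n log_2 n − n + 1 for the merge sort, at n = 2^h. It counts comparisons only; the constants c and d that relate comparisons to running time are machine dependent and are not modeled here.

```python
def bubble_comparisons(n):
    """Comparisons used by the bubble sort on n items: n(n - 1)/2."""
    return n * (n - 1) // 2


def merge_comparisons(h):
    """Worst-case comparisons for the merge sort on n = 2^h items."""
    n = 2 ** h
    return n * h - n + 1


for h in (1, 2, 4, 8, 16):
    n = 2 ** h
    b, m = bubble_comparisons(n), merge_comparisons(h)
    print(f"n = {n:>6}  bubble = {b:>12}  merge = {m:>8}  ratio = {b / m:.1f}")
```

For n = 2 the two counts coincide at one comparison each, and the ratio grows with every doubling of n thereafter.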

For more on sorting algorithms and their time-complexity functions, the reader should
                                      examine [1], [3], [4], [7], and [8] in the chapter references.

1. a) Give an example of two lists L_1, L_2, each of which is in ascending order and contains five
    elements, and where nine comparisons are needed to merge L_1, L_2 by the algorithm given in
    Lemma 12.1.
    b) Let m, n ∈ Z⁺ with m < n. Give an example of two lists L_1, L_2, each of which is in
    ascending order, where L_1 has m elements, L_2 has n elements, and m + n − 1 comparisons are
    needed to merge L_1, L_2 by the algorithm given in Lemma 12.1.

2. Apply the merge sort to each of the following lists. Draw the splitting and merging trees for
each application of the procedure.
    a) −1, 0, 2, −2, 3, 6, −3, 5, 1, 4
    b) −1, 7, 4, 11, 5, −8, 15, −3, −2, 6, 10, 3

3. Related to the merge sort is a somewhat more efficient procedure called the quick sort. Here we
start with a list L: a_1, a_2, ..., a_n, and use a_1 as a pivot to develop two sublists L_1 and L_2
as follows. For i > 1, if a_i < a_1, place a_i at the end of the first list being developed (this
is L_1 at the end of the process); otherwise, place a_i at the end of the second list L_2.
    After all a_i, i > 1, have been processed, place a_1 at the end of the first list. Now apply
quick sort recursively to each of the lists L_1 and L_2 to obtain sublists L_11, L_12, L_21, and
L_22. Continue the process until each of the resulting sublists contains one element. The sublists
are then ordered, and their concatenation gives the ordering sought for the original list L.
    Apply quick sort to each list in Exercise 2.

4. Prove that the function g used in the second method to analyze the (worst-case) time-complexity
of the merge sort is monotone increasing.

12.4
       Weighted Trees and Prefix Codes
                                      Among the topics to which discrete mathematics is applied, coding theory is one wherein
                                      different finite structures play a major role. These structures enable us to represent and
                                      transmit information that is coded in terms of the symbols in a given alphabet. For instance,
                                      the way we most often code, or represent, characters internally in a computer is by means
                                      of strings of fixed length, using the symbols 0 and 1.

The codes developed in this section, however, will use strings of different lengths. Why a
                         person should want to develop such a coding scheme and how the scheme can be constructed
                         will be our major concerns in this section.

Suppose we wish to develop a way to represent the letters of the alphabet using strings
of 0’s and 1’s. Since there are 26 letters, we should be able to encode these symbols in terms
of sequences of five bits, given that $2^4 < 26 < 2^5$. However, in the English (or any other)
language, not all letters occur with the same frequency. Consequently, it would be more
efficient to use binary sequences of different lengths, with the most frequently occurring
letters (such as e, i, t) represented by the shortest possible sequences. For example, consider
S = {a, e, n, r, t}, a subset of the alphabet. Represent the elements of S by the binary
sequences

a: 01        e:0         n: 101     r: 10          t: 1.

If the message “ata” is to be transmitted, the binary sequence 01101 is sent. Unfortunately,
                         this sequence is also transmitted for the messages “etn”,            “atet”, and “an”.
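
The ambiguity of this first scheme can be checked mechanically. The following is a minimal sketch, not part of the original text; the function name `all_decodings` and the dictionary `first_scheme` are illustrative names chosen here. It lists every message whose encoding equals a given bit string by trying each code word as a possible first symbol.

```python
def all_decodings(bits, code):
    """Return every message whose encoding under `code` is exactly `bits`."""
    if bits == "":
        return [""]  # exactly one way to decode nothing: the empty message
    messages = []
    for symbol, word in code.items():
        if bits.startswith(word):
            # Strip the matched code word and decode the remainder recursively.
            for rest in all_decodings(bits[len(word):], code):
                messages.append(symbol + rest)
    return messages


# The first encoding scheme from the text.
first_scheme = {"a": "01", "e": "0", "n": "101", "r": "10", "t": "1"}

for message in sorted(all_decodings("01101", first_scheme)):
    print(message)
```

The printed list contains the four messages named above (ata, etn, atet, an) together with any further readings of 01101 that the scheme allows.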
                            Consider a second encoding scheme, one given by

a: 111       e: 0         n: 1100     r: 1101           t: 10.

Here the message “ata” is represented by the sequence 11110111 and there are no other
                         possibilities to confuse the situation. What’s more, the labeled complete binary tree shown
                         in Fig. 12.34 can be used to decode the sequence 11110111. Starting at the root, traverse the
                         edge labeled 1 to the right child (of the root). Continuing along the next two edges labeled
                         with 1, we arrive at the leaf labeled a. Hence the unique path from the root to the vertex
                         at a is unambiguously determined by the first three 1’s in the sequence 11110111. After
                         we return to the root, the next two symbols in the sequence — namely, 10 — determine the
                         unique path along the edge from the root to its right child, followed by the edge from that
                         child to its left child. This terminates at the vertex labeled t. Again returning to the root,
                         the final three bits of the sequence determine the letter a for a second time. Hence the tree
                         “decodes” 11110111 as ata.
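
The root-to-leaf walk just described can be sketched in code. This is an illustration under stated assumptions rather than the book's own procedure: the tree is stored as nested dictionaries keyed by the edge labels "0" and "1", and the names `build_code_tree` and `decode` are hypothetical.

```python
def build_code_tree(code):
    """Build a binary tree (nested dicts) whose leaves carry the symbols."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word:
            node = node.setdefault(bit, {})  # follow or create the edge labeled `bit`
        node["leaf"] = symbol
    return root


def decode(bits, root):
    """Walk from the root along the edges named by `bits`; restart at each leaf."""
    message, node = [], root
    for bit in bits:
        node = node[bit]  # raises KeyError if the tree has no such edge
        if "leaf" in node:
            message.append(node["leaf"])
            node = root  # return to the root for the next symbol
    if node is not root:
        raise ValueError("sequence ends in the middle of a code word")
    return "".join(message)


# The second encoding scheme from the text.
second_scheme = {"a": "111", "e": "0", "n": "1100", "r": "1101", "t": "10"}
tree = build_code_tree(second_scheme)
print(decode("11110111", tree))  # ata
```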

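Figure 12.34  [labeled complete binary tree with leaves a: 111, e: 0, n: 1100, r: 1101, t: 10; an edge to a left child is labeled 0 and an edge to a right child is labeled 1]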

Why did the second encoding scheme work out so readily when the first led to ambiguities?
In the first scheme, r is represented as 10 and n as 101. If we encounter the symbols
10, how can we determine whether the symbols represent r or the first two symbols of 101,
which represent n? The problem is that the sequence for r is a prefix of the sequence for
n. This ambiguity does not occur in the second encoding scheme, suggesting the following
definition.

Definition 12.7   A set P of binary sequences (representing a set of symbols) is called a prefix code if no
                  sequence in P is the prefix of any other sequence in P.
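
Definition 12.7 translates directly into a pairwise test. The sketch below is only an illustration (the function name `is_prefix_code` is chosen here, not taken from the text); it compares every ordered pair of distinct sequences.

```python
from itertools import permutations


def is_prefix_code(words):
    """True if no sequence in `words` is a prefix of another sequence in `words`."""
    return not any(longer.startswith(shorter)
                   for shorter, longer in permutations(words, 2))


first_scheme = ["01", "0", "101", "10", "1"]
second_scheme = ["111", "0", "1100", "1101", "10"]

print(is_prefix_code(first_scheme))   # False: 10 is a prefix of 101, for example
print(is_prefix_code(second_scheme))  # True
```

The pairwise comparison takes time quadratic in the number of sequences, which is acceptable for small alphabets such as these.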

Consequently, the binary sequences      111, 0, 1100, 1101,    10 constitute a prefix code for
                  the letters a, e, n, r, t, respectively. But how did the complete binary tree of Fig. 12.34
                  come about? To deal with this problem, we need the following concept.

Definition 12.8   If T is a complete binary tree of height h, then T is called a full binary tree if all the leaves
                  in T are at level h.

EXAMPLE 12.17     For the prefix code P = {111, 0, 1100, 1101, 10}, the longest binary sequence has length
                  4. Draw the labeled full binary tree of height 4, as shown in Fig. 12.35. The elements of P
                  are assigned to the vertices of this tree as follows. For example, the sequence 10 traces the
                  path from the root r to its right child $c_R$. Then it continues to the left child of $c_R$, where the
                  box (marked with the asterisk) indicates completion of the sequence. Returning to the root,
                  the other four sequences are traced out in similar fashion, resulting in the other four boxed
                  vertices. For each boxed vertex remove the subtree (except for the root) that it determines.
                  The resulting pruned tree is the complete binary tree of Fig. 12.34, where no “box” is an
                  ancestor of another “box.”

Figure 12.35  [the labeled full binary tree of height 4, with the five vertices for the sequences in P boxed]

We turn now to a method for determining a labeled tree that models a prefix code, where
                  the frequency of occurrence of each symbol in the average text is taken into account     — in
                  other words, a prefix code wherein the shorter sequences are used for the more frequently
                  occurring symbols. If there are many symbols, such as all 26 letters of the alphabet, a
                  trial-and-error method for constructing such a tree is not efficient. An elegant construction
                  developed by David A. Huffman (1925-1999) provides a technique for constructing such
                  trees.
                      The general problem of constructing an efficient tree can be described as follows.
Let $w_1, w_2, \ldots, w_n$ be a set of positive numbers called weights, where
$w_1 \le w_2 \le \cdots \le w_n$. If T = (V, E) is a complete binary tree with n leaves, assign these
weights (in any one-to-one manner) to the n leaves. The result is called a complete binary tree
for the weights $w_1, w_2, \ldots, w_n$. The weight of the tree, denoted $W(T)$, is defined as
$\sum_{i=1}^{n} w_i \ell(w_i)$ where, for each $1 \le i \le n$, $\ell(w_i)$ is the level number of the leaf
assigned the weight $w_i$. The objective is to assign the weights so that $W(T)$ is as small as
possible. A complete binary tree $T'$ for these weights is said to be an optimal tree if
$W(T') \le W(T)$ for any other complete binary tree T for the weights.
Figure 12.36 shows two complete binary trees for the weights 3, 5, 6, and 9. For tree $T_1$,
$W(T_1) = \sum_{i=1}^{4} w_i \ell(w_i) = (3 + 9 + 5 + 6) \cdot 2 = 46$ because each leaf has level number 2.
In the case of $T_2$, $W(T_2) = 3 \cdot 3 + 5 \cdot 3 + 6 \cdot 2 + 9 \cdot 1 = 45$, which we shall find is optimal.
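
The two weights can be recomputed with a short sketch. The representation is an assumption made here for illustration: a leaf is its weight, an internal vertex is a pair (left subtree, right subtree), and the left/right placement of the leaves is a guess, since only the level numbers matter for W(T).

```python
def tree_weight(tree, level=0):
    """Sum of (leaf weight) * (level number of that leaf) over all leaves."""
    if isinstance(tree, tuple):  # internal vertex: recurse one level down
        left, right = tree
        return tree_weight(left, level + 1) + tree_weight(right, level + 1)
    return tree * level  # leaf: its weight times its level number


# Every leaf of T1 is at level 2; in T2 the leaves 9, 6, 3, 5 are at levels 1, 2, 3, 3.
T1 = ((3, 9), (5, 6))
T2 = (9, (6, (3, 5)))

print(tree_weight(T1))  # 46
print(tree_weight(T2))  # 45
```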

Figure 12.36  [two complete binary trees, ($T_1$) and ($T_2$), for the weights 3, 5, 6, 9]
Figure 12.37  [a complete binary tree for the weights 1, 2, 5, 6, 9]

The major idea behind Huffman’s construction is that in order to obtain an optimal tree
T for the n weights $w_1, w_2, w_3, \ldots, w_n$, one considers an optimal tree $T'$ for the $n - 1$
weights $w_1 + w_2, w_3, \ldots, w_n$. (It cannot be assumed that $w_1 + w_2 \le w_3$.) In particular,
the tree $T'$ is transformed into T by replacing the leaf v having weight $w_1 + w_2$ by a tree
rooted at v of height 1 with left child of weight $w_1$ and right child of weight $w_2$. To illustrate,
if the tree $T_2$ in Fig. 12.36 is optimal for the four weights 1 + 2, 5, 6, 9, then the tree in
Fig. 12.37 will be optimal for the five weights 1, 2, 5, 6, 9.
   We need the following lemma to establish these claims.

LEMMA 12.2   If T is an optimal tree for the n weights $w_1 \le w_2 \le \cdots \le w_n$, then there exists an optimal
tree $T'$ in which the leaves of weights $w_1$ and $w_2$ are siblings at the maximal level (in $T'$).
Proof: Let v be an internal vertex of T where the level number of v is maximal for all
internal vertices. Let $w_x$ and $w_y$ be the weights assigned to the children x, y of vertex
v, with $w_x \le w_y$. By the choice of vertex v, $\ell(w_x) = \ell(w_y) \ge \ell(w_1), \ell(w_2)$. Consider the
case of $w_1 < w_x$. (If $w_1 = w_x$, then $w_1$ and $w_x$ can be interchanged and we would consider
the case of $w_2 < w_y$. Applying the following proof to this case, we would find that $w_y$ and
$w_2$ can be interchanged.)
   If $\ell(w_x) > \ell(w_1)$, let $\ell(w_x) = \ell(w_1) + j$, for some $j \in \mathbf{Z}^+$. Then
$w_1 \ell(w_1) + w_x \ell(w_x) = w_1 \ell(w_1) + w_x[\ell(w_1) + j] = w_1 \ell(w_1) + w_x j + w_x \ell(w_1)
> w_1 \ell(w_1) + w_1 j + w_x \ell(w_1) = w_1 \ell(w_x) + w_x \ell(w_1)$.
So $W(T) = w_1 \ell(w_1) + w_x \ell(w_x) + \sum_{i \ne 1, x} w_i \ell(w_i)
> w_1 \ell(w_x) + w_x \ell(w_1) + \sum_{i \ne 1, x} w_i \ell(w_i)$.
Consequently, by interchanging the locations of the weights $w_1$ and $w_x$, we obtain a tree of
smaller weight. But this contradicts the choice of T as an optimal tree. Therefore
$\ell(w_x) = \ell(w_1) = \ell(w_y)$. In a similar manner, it can be shown that $\ell(w_y) = \ell(w_2)$, so
$\ell(w_x) = \ell(w_y) = \ell(w_1) = \ell(w_2)$. Interchanging the locations of the pair $w_1$, $w_x$, and the
pair $w_2$, $w_y$, we obtain an optimal tree $T'$, where $w_1$, $w_2$ are siblings.

From this lemma we see that smaller weights will appear at the higher levels (and thus
                  have higher level numbers) in an optimal tree.

THEOREM 12.8   Let T be an optimal tree for the weights $w_1 + w_2, w_3, \ldots, w_n$, where
$w_1 \le w_2 \le w_3 \le \cdots \le w_n$. At the leaf with weight $w_1 + w_2$ place a (complete) binary tree of
height 1 and assign the weights $w_1$, $w_2$ to the children (leaves) of this former leaf. The new
binary tree $T_1$ so constructed is then optimal for the weights $w_1, w_2, w_3, \ldots, w_n$.
Proof: Let $T_2$ be an optimal tree for the weights $w_1, w_2, \ldots, w_n$, where the leaves for
weights $w_1$, $w_2$ are siblings. Remove the leaves of weights $w_1$, $w_2$ and assign the weight
$w_1 + w_2$ to their parent (now a leaf). This complete binary tree is denoted $T_3$ and
$W(T_2) = W(T_3) + w_1 + w_2$. Also, $W(T_1) = W(T) + w_1 + w_2$. Since T is optimal, $W(T) \le W(T_3)$.
If $W(T) < W(T_3)$, then $W(T_1) < W(T_2)$, contradicting the choice of $T_2$ as optimal. Hence
$W(T) = W(T_3)$ and, consequently, $W(T_1) = W(T_2)$. So $T_1$ is optimal for the weights
$w_1, w_2, \ldots, w_n$.

Remark. The preceding proof started with an optimal tree $T_2$, whose existence rests on the
                  fact that there is only a finite number of ways in which we can assign n weights to a complete
                  binary tree with n leaves. Consequently, with a finite number of assignments there is at least
                  one where W(T) is minimal. But finite numbers can be large. This proof establishes the
                  existence of an optimal tree for a set of weights and develops a way for constructing such
                  a tree. To construct such a (Huffman) tree we consider the following algorithm.

Given the n ($\ge 2$) weights $w_1, w_2, \ldots, w_n$, proceed as follows:

   Step 1: Assign the given weights, one each to a set S of n isolated vertices. [Each
           vertex is the root of a complete binary tree (of height 0) with a weight assigned
           to it.]
   Step 2: While $|S| > 1$ perform the following:
           a) Find two trees T, $T'$ in S with the smallest two root weights w, $w'$,
              respectively.
           b) Create the new (complete binary) tree $T^*$ with root weight $w^* = w + w'$
              and having T, $T'$ as its left and right subtrees, respectively.
           c) Place $T^*$ in S and delete T and $T'$. [Where $|S| = 1$, the one complete
              binary tree in S is a Huffman tree.]
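
Steps 1 and 2 can be sketched as follows. This is one possible rendering, not the text's own code: the set S is kept in a binary heap (Python's `heapq`) so that the two smallest root weights are easy to find, a running counter breaks ties between equal weights, and a tree is stored as either a leaf weight or a pair (left subtree, right subtree). All of these are choices made here for illustration.

```python
import heapq
from itertools import count


def huffman_tree(weights):
    """Return a Huffman tree for `weights`: a leaf is a weight, an internal vertex a pair."""
    tiebreak = count()  # keeps heap entries comparable when root weights are equal
    # Step 1: one isolated vertex (a tree of height 0) per weight.
    heap = [(w, next(tiebreak), w) for w in weights]
    heapq.heapify(heap)
    # Step 2: repeatedly merge the two trees with the smallest root weights.
    while len(heap) > 1:
        w, _, t = heapq.heappop(heap)
        w_prime, _, t_prime = heapq.heappop(heap)
        heapq.heappush(heap, (w + w_prime, next(tiebreak), (t, t_prime)))
    return heap[0][2]


def tree_weight(tree, level=0):
    """W(T): sum of (leaf weight) * (level number) over all leaves."""
    if isinstance(tree, tuple):
        return sum(tree_weight(sub, level + 1) for sub in tree)
    return tree * level


print(tree_weight(huffman_tree([3, 5, 6, 9])))     # 45
print(tree_weight(huffman_tree([1, 2, 5, 6, 9])))  # 48
```

For the weights 3, 5, 6, 9 the resulting weight agrees with the value W(T_2) = 45 computed for Fig. 12.36. A different tie-breaking rule may give a differently shaped tree of the same weight.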

We now use this algorithm in the following example.

EXAMPLE 12.18     Construct an optimal prefix code for the symbols a, o, q, u, y, z that occur (in a given
                  sample) with frequencies 20, 28, 4, 17, 12, 7, respectively.
                     Figure 12.38 shows the construction that follows Huffman’s procedure. In part (b)
                  weights 4 and 7 are combined so that we then consider the construction for the weights 11,
                  12, 17, 20, 28. At each step [in parts (c)-(f) of Fig. 12.38] we create a tree with subtrees
                  rooted at the two smallest weights. These two smallest weights belong to vertices each of
                  which is originally either isolated (a tree with just a root) or the root of a tree obtained
                  earlier in the construction. From the last result, a prefix code is determined as

a: 01       o: 11       q: 1000       u: 00        y: 101            z: 1001.
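
As a check on this code, the sketch below computes the number of bits needed for the whole sample, that is, the sum of frequency times code-word length, and compares it with a fixed-length encoding. The comparison with 3 bits per symbol is an illustration added here (six symbols need three bits each, since $2^2 < 6 \le 2^3$); the names `freq`, `code`, and `weighted_length` are likewise hypothetical.

```python
# Frequencies and code words as given in Example 12.18.
freq = {"a": 20, "o": 28, "q": 4, "u": 17, "y": 12, "z": 7}
code = {"a": "01", "o": "11", "q": "1000", "u": "00", "y": "101", "z": "1001"}


def weighted_length(freq, code):
    """Total number of bits used to encode a sample with the given frequencies."""
    return sum(freq[symbol] * len(code[symbol]) for symbol in freq)


total_bits = weighted_length(freq, code)
fixed_bits = 3 * sum(freq.values())  # assumed fixed-length alternative: 3 bits per symbol

print(total_bits)  # 210
print(fixed_bits)  # 264
```

The value 210 is also the sum 11 + 23 + 37 + 51 + 88 of the root weights created during the construction, as each merge adds one level to every leaf beneath the new root.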

Figure 12.38  [parts (a) through (f): the successive stages of Huffman's construction for the weights 4, 7, 12, 17, 20, 28, with new root weights 11, 23, 37, 51, and finally 88]

Different prefix codes may result from the way the trees T, $T'$ are selected and assigned as
the left or right subtree in steps 2(a) and 2(b) in our algorithm and from the assignment of
0 or 1 to the branches (edges) of our final (Huffman) tree.

1. For the prefix code given in Fig. 12.34, decode the sequences
   (a) 1001111101; (b) 10111100110001101; (c) 1101111110010.

2. A code for {a, b, c, d, e} is given by a: 00  b: 01  c: 101  d: x10  e: yz1, where
   x, y, z ∈ {0, 1}. Determine x, y, and z so that the given code is a prefix code.

3. Construct an optimal prefix code for the symbols a, b, c, ..., i, j that occur (in a given
   sample) with respective frequencies 78, 16, 30, 35, 125, 31, 20, 50, 80, 3.

4. How many leaves does a full binary tree have if its height is
   (a) 3? (b) 7? (c) 12? (d) h?

5. Let T = (V, E) be a complete m-ary tree of height h. This tree is called a full m-ary tree
   if all of its leaves are at level h. If T is a full m-ary tree with height 7 and 279,936
   leaves, how many internal vertices are there in T?

6. Let T be a full m-ary tree with height h and v vertices. Determine h in terms of m and v.

7. Using the weights 2, 3, 5, 10, 10, show that the height of a Huffman tree for a given set
   of weights is not unique. How would you modify the algorithm so as to always produce a
   Huffman tree of minimal height for the given weights?

8. Let $L_i$, for $1 \le i \le 4$, be four lists of numbers, each sorted in ascending order. The
   numbers of entries in these lists are 75, 40, 110, and 50, respectively.
   a) How many comparisons are needed to merge these four lists by merging $L_1$ and $L_2$,
      merging $L_3$ and $L_4$, and then merging the two resulting lists?
   b) How many comparisons are needed if we first merge $L_1$ and $L_2$, then merge the result
      with $L_3$, and finally merge this result with $L_4$?
   c) In order to minimize the total number of comparisons in this merging of the four lists,
      what order should the merging follow?
   d) Extend the result in part (c) to n sorted lists $L_1, L_2, \ldots, L_n$.

12.5
      Biconnected Components
        and Articulation Points
                    Let G = (V, E) be the loop-free connected undirected graph shown in Fig. 12.39(a), where
                    each vertex represents a communication center. Here an edge {x, y} indicates the existence
                    of a communication link between the centers at x and y.

Figure 12.39  [(a) the graph G; (b) the subgraphs obtained by splitting the vertices c and f]

By splitting the vertices at c and f, in the suggested fashion, we obtain the collection of
                    subgraphs in part (b) of the figure. These vertices are examples of the following.

Definition 12.9     A vertex v in a loop-free undirected graph G = (V, E) is called an articulation point
                    if $\kappa(G - v) > \kappa(G)$; that is, the subgraph G - v has more components than the given
                    graph G.
                       A loop-free connected undirected graph with no articulation points is called biconnected.
                       A biconnected component of a graph is a maximal biconnected subgraph, that is, a
                    biconnected subgraph that is not properly contained in a larger biconnected subgraph.

The graph shown in Fig. 12.39 has the two articulation points, c and f, and its four
                    biconnected components are shown in part (b) of the figure.
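
Definition 12.9 can be applied directly by deleting each vertex in turn and counting components. The sketch below is a brute-force illustration only. Because the graph of Fig. 12.39 is not reproduced here, it uses a small hypothetical graph (two triangles sharing the vertex c, plus a pendant vertex f attached to e), and the function names are chosen here.

```python
def count_components(adj, removed=None):
    """Number of connected components of the graph, ignoring the vertex `removed`."""
    seen = set() if removed is None else {removed}
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:  # explore everything reachable from `start`
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return components


def articulation_points(adj):
    """Vertices v with kappa(G - v) > kappa(G), found by trying every vertex."""
    base = count_components(adj)
    return {v for v in adj if count_components(adj, removed=v) > base}


# Hypothetical example graph (not the graph of Fig. 12.39).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("c", "e"), ("e", "f")]
adj = {}
for x, y in edges:
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set()).add(x)

print(sorted(articulation_points(adj)))  # ['c', 'e']
```

This approach runs one search per vertex; the depth-first spanning tree developed in the rest of this section leads to a more economical method.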
                        In terms of communication centers and links, the articulation points of the graph in-
                    dicate where the system is most vulnerable. Without articulation points, such a system is
                    more likely to survive disruptions at a communication center, regardless of whether these
                    disruptions are caused by the breakdown of a technical device or by external forces.
                        The problem of finding the articulation points in a connected graph provides an applica-
                    tion for the depth-first spanning tree. The objective here is the development of an algorithm
                    that determines the articulation points of a loop-free connected undirected graph. If no
                    such points exist, then the graph is biconnected. Should such vertices exist, the resulting
                    biconnected components can be used to provide information about such properties as the
                    planarity and chromatic number of the given graph.
                        The following preliminaries are needed for developing this algorithm.

Returning to Fig. 12.39(a), we see that there are four paths from a to e, namely,
(1) a → c → e; (2) a → c → d → e; (3) a → b → c → e; and (4) a → b → c → d → e.
Now what do these four paths have in common? They all pass through the vertex c, one of
the articulation points of G. This observation now motivates our first preliminary result.

LEMMA 12.3                  Let G = (V, E) be a loop-free connected undirected graph with z ∈ V. The vertex z is an
                            articulation point of G if and only if there exist distinct x, y ∈ V with x ≠ z, y ≠ z, and
                            such that every path in G connecting x and y contains the vertex z.
                            Proof: This result follows from Definition 12.9. A proof is requested of the reader in the
                            Section Exercises.

Our next lemma provides an important and useful property of the depth-first spanning
                            tree.

LEMMA 12.4                  Let G = (V, E) be a loop-free connected undirected graph with $T = (V, E')$ a depth-first
                            spanning tree of G. If {a, b} ∈ E but {a, b} ∉ $E'$, then a is either an ancestor or a descendant
                            of b in the tree T.
                            Proof: From the depth-first spanning tree T, we obtain a preorder listing for the vertices in
                            V. For all v ∈ V, let dfi(v) denote the depth-first index of vertex v, that is, the position
                            of v in the preorder listing. Assume that dfi(a) < dfi(b). Consequently, a is encountered
                            before b in the preorder traversal of T, so a cannot be a descendant of b. If, in addition,
                            vertex a is not an ancestor of b, then b is not in the subtree $T_a$ of T rooted at a. But when we
                            backtrack (through $T_a$) to a, we find that because {a, b} ∈ E, it should have been possible
                            for the depth-first search to go from a to b and to use the edge {a, b} in T. This contradiction
                            shows that b is in $T_a$, so a is an ancestor of b.

If G = (V, E) is a loop-free connected undirected graph, let $T = (V, E')$ be a depth-first
                            spanning tree for G, as shown in Fig. 12.40. By Lemma 12.4, the dotted edge {a, b}, which
                            is not part of T, indicates an edge that could exist in G. Such an edge is called a back edge
                            (relative to T), and here a is an ancestor of b. [Here dfi(a) = 3, whereas dfi(b) = 6.] The
                            dotted edge {b, d} in the figure cannot exist in G, also because of Lemma 12.4. Thus all
                            edges of G are either edges in T or back edges (relative to T).
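
The depth-first index and the classification of edges can be illustrated with a short sketch. It is written under assumptions made here: neighbors are visited in alphabetical order, the graph is a small hypothetical one (not the graph of Fig. 12.40), and the names `dfs_tree`, `back_edges`, and `is_ancestor` are illustrative. The final loop checks the conclusion of Lemma 12.4 on this one example.

```python
def dfs_tree(adj, root):
    """Depth-first search from `root`; returns (dfi, parent) dictionaries."""
    dfi, parent = {root: 1}, {root: None}

    def visit(u):
        for w in sorted(adj[u]):  # neighbors in alphabetical order
            if w not in dfi:
                dfi[w] = len(dfi) + 1  # position of w in the preorder listing
                parent[w] = u
                visit(w)

    visit(root)
    return dfi, parent


def back_edges(adj, dfi, parent):
    """Edges of G not in the tree, each written (a, b) with dfi(a) < dfi(b)."""
    return {(a, b) for a in adj for b in adj[a]
            if dfi[a] < dfi[b] and parent[b] != a}


def is_ancestor(parent, a, b):
    """True if a is a proper ancestor of b in the tree described by `parent`."""
    b = parent[b]
    while b is not None:
        if b == a:
            return True
        b = parent[b]
    return False


# Hypothetical example graph.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("c", "e"), ("e", "f")]
adj = {}
for x, y in edges:
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set()).add(x)

dfi, parent = dfs_tree(adj, "a")
for a, b in sorted(back_edges(adj, dfi, parent)):
    print(a, b, is_ancestor(parent, a, b))
```

The recursive `visit` is adequate for small examples; a very deep tree would need an explicit stack to stay within Python's recursion limit.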

Figure 12.40  [a depth-first spanning tree T with its root marked, showing the dotted edges {a, b} and {b, d}]

Our next example provides further insight into the relationship between the articulation
                   points of a graph G and a depth-first spanning tree of G.

EXAMPLE 12.19      In part (1) of Fig. 12.41 we have a loop-free connected undirected graph G = (V, E).
                   Applying Lemma 12.3 to vertex a, for example, we find that the only path in G from b
                   to c passes through a. In the case of vertex d, we apply the same lemma and consider the
                   vertices a and h. Now we find that although there are four paths from a to h, all four pass
                   through vertex d. Consequently, vertices a and d are two of the articulation points in G.
                   The vertex h is the only other articulation point. Can you find two vertices in G for which
                   all connecting paths (for these vertices) in G pass through h?

Figure 12.41  [(1) G = (V, E); (2) $T' = (V, E')$; (3) G = (V, E); (4) $T'' = (V, E'')$; (5) G = (V, E)]

Applying the depth-first search algorithm, with the vertices of G ordered alphabetically,
                   in part (2) of Fig. 12.41, we find the depth-first spanning tree $T' = (V, E')$ for G, where
                   a has been chosen as the root. The parenthesized integer next to each vertex indicates the
                   order in which that vertex is visited during the prescribed depth-first search. Part (3) of the
                   figure incorporates the three back edges (relative to $T'$, in G) that are missing from part (2).
                       For the tree $T'$, the root a, which is an articulation point in G, has more than one child.
                   The articulation point d has a child, namely g, with no back edge from g or any of its
                   descendants (h and j) to an ancestor of d [as we see in part (3) of Fig. 12.41]. The same is
                   true for the articulation point h. Its child j has (no children and) no back edge to an ancestor
                   of h.
                       In part (4) of the figure, $T'' = (V, E'')$ is the depth-first spanning tree for the vertices
                   ordered alphabetically once again, but this time vertex g has been chosen as the root. As
                   in part (2) of the figure, the parenthesized integer next to each vertex indicates the order in
                   which that vertex is visited during this depth-first search. The three back edges (relative to
                   $T''$, in G) that are missing from $T''$ are shown in part (5) of the figure.
                       The root g of $T''$ has only one child and g is not an articulation point in G. Further, for
                   each of the articulation points there is at least one child with no back edge from that child
                   or one of its descendants to an ancestor of the articulation point. To be more specific, from
                   part (5) of Fig. 12.41 we find that for the articulation point a we may use any of the children
                   b, c, or i, but not f; for d that child is a; and for h the child is j.

The observations made in Example 12.19 now lead us to the following.

LEMMA 12.5                  Let G = (V, E) be a loop-free connected undirected graph with $T = (V, E')$ a depth-first
                            spanning tree of G. If r is the root of T, then r is an articulation point of G if and only if r
                            has at least two children in T.
                            Proof: If r has only one child, say c, then all the other vertices of G are descendants of
                            c (and r) in T. So if x, y are two distinct vertices of T, neither of which is r, then in the
                            subtree $T_c$, rooted at c, there is a path from x to y. Since r is not a vertex in $T_c$, r is not
                            on this path. Consequently, r is not an articulation point in G, by virtue of Lemma 12.3.
                            Conversely, let r be the root of the depth-first spanning tree T and let $c_1$, $c_2$ be children of
                            r. Let x be a vertex in $T_{c_1}$, the subtree of T rooted at $c_1$. Similarly, let y be a vertex in $T_{c_2}$,
                            the subtree of T rooted at $c_2$. Could there be a path from x to y in G that avoids r? If so,
                            there is an edge $\{v_1, v_2\}$ in G with $v_1$ in $T_{c_1}$ and $v_2$ in $T_{c_2}$. But this contradicts Lemma 12.4.

Our final preliminary result settles the issue of when a vertex that is not the root of a
depth-first spanning tree is an articulation point of a graph.

LEMMA 12.6   Let G = (V, E) be a loop-free connected undirected graph with T = (V, E′) a depth-first
spanning tree for G. Let r be the root of T and let v ∈ V, v ≠ r. Then v is an articulation
point of G if and only if there exists a child c of v with no back edge (relative to T, in G)
from a vertex in T_c, the subtree rooted at c, to an ancestor of v.
Proof: Suppose that vertex v has a child c such that there is no back edge (relative to T,
in G) from a vertex in T_c to an ancestor of v. Then every path (in G) from r to c passes
through v. From Lemma 12.3 it then follows that v is an articulation point of G.
    To establish the converse, let the nonroot vertex v of T satisfy the following: For each
child c of v there is a back edge (relative to T, in G) from a vertex in T_c, the subtree rooted
at c, to an ancestor of v. Now let x, y ∈ V with x ≠ v, y ≠ v. We consider the following
three possibilities:

   1) If neither x nor y is a descendant of v, as in part (1) of Fig. 12.42, delete from T the
      subtree T_v rooted at v. The resulting subtree (of T) contains x, y and a path from x
      to y that does not pass through v, so v is not an articulation point of G.

Figure 12.42

   2) If one of x, y (say, x) is a descendant of v but y is not, then x is a child of v or a
      descendant of a child c of v [as in part (2) of Fig. 12.42]. From the hypothesis there
      is a back edge (relative to T, in G) from some z ∈ T_c to an ancestor w of v. Since
      x, z ∈ T_c, there is a path p_1 from x to z (that does not pass through v). Then, as neither
      w nor y is a descendant of v, from part (1) there is a path p_2 from w to y that does
      not pass through v. The edges in p_1, p_2 together with the edge {z, w} provide a path
      from x to y that does not pass through v, and once again, v is not an articulation
      point.
   3) Finally, suppose that both x, y are descendants of v, as in part (3) of Fig. 12.42. Here
      c_1, c_2 are children of v (perhaps with c_1 = c_2) and x is a vertex in T_{c_1}, the subtree
      rooted at c_1, while y is a vertex in T_{c_2}, the subtree rooted at c_2. From the hypothesis,
      there exist back edges {d_1, a_1} and {d_2, a_2} (relative to T, in G), where d_1, d_2 are
      descendants of v and a_1, a_2 are ancestors of v. Further, there is a path p_1 from x to
      d_1 in T_{c_1} and a path p_2 from y to d_2 in T_{c_2}. As neither a_1 nor a_2 is a descendant of v,
      from part (1) we have a path p (in T) from a_1 to a_2, where p avoids v. Now we can
      do the following: (i) Go from x to d_1 using path p_1; (ii) Go from d_1 to a_1 on the edge
      {d_1, a_1}; (iii) Continue to a_2 using path p; (iv) Go from a_2 to d_2 on the edge {a_2, d_2};
      and (v) Finish at y using the path p_2 from d_2 to y. This provides a path from x to y
      that avoids v, so v is not an articulation point of G, and this completes the proof.

Using the results from the preceding four lemmas, we once again start with a loop-free
connected undirected graph G = (V, E) with depth-first spanning tree T. For v ∈ V, where
v is not the root of T, we let T_{v,c} be the subtree consisting of edge {v, c} (c a child of v)
together with the tree T_c rooted at c. If there is no back edge from a descendant of v in
T_{v,c} to an ancestor of v (and v has at least one ancestor, the root of T), then the splitting
of vertex v results in the separation of T_{v,c} from G, and v is an articulation point. If no
other articulation points of G occur in T_{v,c}, then the addition to T_{v,c} of all other edges in G
determined by the vertices in T_{v,c} (the subgraph of G induced by the vertices in T_{v,c}) results
in a biconnected component of G. A root has no ancestors, and it is an articulation point if
and only if it has more than one child.
    The depth-first spanning tree preorders the vertices of G. For x ∈ V let dfi(x) denote the
depth-first index of x in that preorder. If y is a descendant of x, then dfi(x) < dfi(y). For y
an ancestor of x, dfi(x) > dfi(y). Define low(x) = min{dfi(y) | y is adjacent in G to either
x or a descendant of x}. If z is the parent of x (in T), then there are two possibilities to
consider:
   1) low(x) = dfi(z): In this case T_x, the subtree rooted at x, contains no vertex that is
      adjacent to an ancestor of z by means of a back edge of T. Hence z is an articulation
      point of G (unless z is the root of T with x as its only child). If T_x contains no
      articulation points, then T_x together with edge {z, x} spans a biconnected component
      of G (that is, the subgraph of G induced by vertex z and the vertices in T_x is a
      biconnected component of G). Now remove T_x and the edge {z, x} from T, and apply
      this idea to the remaining subtree of T.
   2) low(x) < dfi(z): Here there is a descendant of z in T_x that is joined [by a back edge
      (relative to T, in G)] to an ancestor of z.
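
The definitions of dfi and low can be checked on a small graph with a short Python sketch. It is only an illustration under stated assumptions: the function names, the adjacency-dictionary representation, and the four-vertex graph (a triangle a, b, c with a pendant edge {c, d}) are hypothetical and do not come from the text. Here low(x) is computed directly from its definition, by scanning every vertex in the subtree rooted at x, rather than by an efficient method.

```python
def depth_first_index(graph, root):
    """Depth-first search from root, taking neighbors in sorted order.

    Returns (dfi, parent): dfi[v] is the preorder number of v (starting at 1)
    and parent[v] is the parent of v in the depth-first spanning tree.
    """
    dfi, parent = {}, {root: None}

    def visit(v):
        dfi[v] = len(dfi) + 1
        for w in sorted(graph[v]):
            if w not in dfi:
                parent[w] = v
                visit(w)

    visit(root)
    return dfi, parent


def low_by_definition(graph, dfi, parent):
    """low(x) = min{dfi(y) : y is adjacent to x or to a descendant of x}."""
    children = {v: [] for v in graph}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)

    def subtree(x):
        # x together with all of its descendants in the spanning tree
        vertices = [x]
        for c in children[x]:
            vertices.extend(subtree(c))
        return vertices

    return {x: min(dfi[y] for u in subtree(x) for y in graph[u]) for x in graph}


# Hypothetical example (assumed, not from the text): triangle a, b, c plus edge {c, d}.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
dfi, parent = depth_first_index(graph, "a")
low = low_by_definition(graph, dfi, parent)
print(sorted(dfi.items()))  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
print(sorted(low.items()))  # [('a', 1), ('b', 1), ('c', 1), ('d', 3)]
```

In this example low(d) = 3 = dfi(c), which is case (1) with z = c, and c is indeed an articulation point. For x = c we have low(c) = 1 < dfi(b) = 2, which is case (2).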

To deal in an efficient manner with these ideas, we develop the following algorithm.
Let G = (V, E) be a loop-free connected undirected graph.

Step 1: Find the depth-first spanning tree T for G according to a prescribed order.
Let x_1, x_2, ..., x_n be the vertices of G preordered by T. Then dfi(x_j) = j for all
1 ≤ j ≤ n.
Step 2: Start with x_n and continue back to x_{n-1}, x_{n-2}, ..., x_3, x_2, x_1, determining
low(x_j), for j = n, n-1, n-2, ..., 3, 2, 1, recursively, as follows:
   a) low′(x_j) = min{dfi(z) | z is adjacent in G to x_j}.
   b) If c_1, c_2, ..., c_m are the children of x_j, then low(x_j) = min{low′(x_j),
      low(c_1), low(c_2), ..., low(c_m)}. [No problem arises here, for the vertices
      are examined in the reverse order to the given preorder. Consequently,
      if c is a child of p, then low(c) is determined before low(p).]
Step 3: Let w_j be the parent of x_j in T. If low(x_j) = dfi(w_j), then w_j is an articulation
point of G, unless w_j is the root of T and w_j has no child in T other than x_j. Moreover,
in either situation the subtree rooted at x_j together with the edge {w_j, x_j} is part of
a biconnected component of G.
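
The three steps can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name, the adjacency-dictionary input, and the six-vertex test graph (two triangles joined by the edge {3, 4}) are assumptions made for illustration. The sketch reports only the articulation points; it does not assemble the biconnected components.

```python
def articulation_points(graph, root):
    """Steps 1-3 for a loop-free connected undirected graph.

    graph maps each vertex to the set of its neighbors. Returns (dfi, low, points).
    """
    # Step 1: depth-first spanning tree, neighbors taken in sorted order.
    order, dfi, parent = [], {}, {root: None}

    def visit(v):
        order.append(v)
        dfi[v] = len(order)
        for w in sorted(graph[v]):
            if w not in dfi:
                parent[w] = v
                visit(w)

    visit(root)
    children = {v: [] for v in graph}
    for v in order[1:]:
        children[parent[v]].append(v)

    # Step 2: go through the vertices in reverse preorder, so that low(c)
    # is already known for every child c of the current vertex.
    low = {}
    for x in reversed(order):
        low_prime = min(dfi[z] for z in graph[x])
        low[x] = min([low_prime] + [low[c] for c in children[x]])

    # Step 3: the parent w of x is an articulation point when low(x) = dfi(w),
    # unless w is the root and x is its only child.
    points = set()
    for x in order[1:]:
        w = parent[x]
        if low[x] == dfi[w] and (w != root or len(children[w]) > 1):
            points.add(w)
    return dfi, low, points


# Hypothetical test graph (assumed): triangles {1, 2, 3} and {4, 5, 6} joined by {3, 4}.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
dfi, low, points = articulation_points(graph, 1)
print(sorted(low.items()))  # [(1, 1), (2, 1), (3, 1), (4, 3), (5, 4), (6, 4)]
print(sorted(points))       # [3, 4]
```

The recursive search is adequate for small examples; for large graphs an explicit stack would avoid Python's recursion limit.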

EXAMPLE 12.20   We apply this algorithm to the graph G = (V, E) shown in part (i) of Fig. 12.43.

Figure 12.43 [parts (i) through (v)]

In part (ii) of the figure we have the depth-first spanning tree T = (V, E′) for G with
d as the root. (Here the order followed for the vertices of G is alphabetic.) Next to each
vertex v of T [in part (ii)] is the dfi(v). These labels tell us the order in which the vertices
of G are first visited.
    For step (2) of the algorithm we go in the reverse order from the depth-first search
and start with vertex h (= x_8). Since {g, h} ∈ E and h is not adjacent to any other vertex
of G we have low′(h) = dfi(g) [= dfi(x_7)] = 7. Further, as h has no children, it follows
that low(h) = low′(h) = 7. This accounts for the label (7, 7) [= (low′(h), low(h))] next
to h in part (iii) of Fig. 12.43. Continuing next with g, and then f, we obtain the labels
(6, 6) for g, and (1, 1) for f, since low′(g) = low(g) = 6 and low′(f) = low(f) = 1.
Since {a, e}, {a, f} ∈ E with dfi(e) = 4 and dfi(f) = 6, for vertex a we have low′(a) =
min{4, 6} = 4. Then we find that low(a) = min{4, low(f)} = min{4, 1} = 1. Hence the
label (4, 1) for vertex a. Continuing back through e, c, b, and d, we obtain the labels
(low′(x_i), low(x_i)) for i = 4, 3, 2, 1. Consequently, by applying step (2) of the algorithm
we arrive at the tree in Fig. 12.43 (iii).
    In part (iv) of Fig. 12.43 the ordered pair next to each vertex v is (dfi(v), low(v)).
Applying step (3) of the algorithm to the tree in part (iv), at this point we go in reverse
order once again. First we deal with vertex h (= x_8). Since g is the parent of h (in T) and
low(h) = 7 = dfi(g), g is an articulation point of G and the edge {h, g} is a biconnected
component of G. Deleting the subtree rooted at h from T, we continue with vertex g
(= x_7). Here f is the parent of g (in the tree T - h) and low(g) = 6 = dfi(f), so f is
another articulation point, with edge {g, f} the corresponding biconnected component.
    Continuing now with the tree (T - h) - g, as we go from f to a to e, and then from c
to b, we find no new articulation points among the four vertices a, e, c, and b. Since vertex
d is the root of T and d has two children (namely, the vertices b and e), it then follows
from Lemma 12.5 that d is an articulation point of G. The vertices d, e, a, f induce the
biconnected component consisting of the tree edges {f, a}, {a, e}, {e, d} and the back edges
(relative to T, in G) {f, e} and {f, d}. Finally, the cycle induced (in G) by the vertices b, c,
and d provides the fourth biconnected component.
                                       Part (v) of Fig. 12.43 shows the three articulation points g, f, and d, and the four
                                    biconnected components of G.
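
As an independent check on this example, the articulation points can also be found by brute force: delete each vertex in turn and test whether the remaining graph is still connected. The Python sketch below does this. It is not the algorithm of this section, and its edge list is an assumption: the ten edges were inferred from the discussion above (the adjacencies, tree edges, and back edges that are named there), since Fig. 12.43 itself is not reproduced here. The function names are likewise hypothetical.

```python
def is_connected_without(graph, removed):
    """True if the graph stays connected after vertex `removed` is deleted."""
    vertices = [v for v in graph if v != removed]
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        v = stack.pop()
        for w in graph[v]:
            if w != removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def brute_force_articulation_points(graph):
    """Vertices whose removal disconnects the (connected) graph."""
    return {v for v in graph if not is_connected_without(graph, v)}


# Edge list inferred from the text of Example 12.20 (an assumption, since the
# figure is not available): the cycle on b, c, d; the component on d, e, a, f;
# and the edges {f, g} and {g, h}.
edges = [("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("a", "e"), ("a", "f"),
         ("f", "g"), ("g", "h")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

print(sorted(brute_force_articulation_points(graph)))  # ['d', 'f', 'g']
```

The result agrees with the three articulation points g, f, and d found above. The brute-force check performs one search per vertex, whereas the algorithm of this section needs only a single depth-first search.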

EXERCISES 12.5

1. Find the articulation points and biconnected components for the graph shown in Fig. 12.44.

Figure 12.44

2. Prove Lemma 12.3.
3. Let T = (V, E) be a tree with |V| = n ≥ 3.
   a) What are the smallest and the largest numbers of articulation points that T can have? Describe the trees for each of these cases.
   b) How many biconnected components does T have in each of the cases in part (a)?
4. a) Let T = (V, E) be a tree. If v ∈ V, prove that v is an articulation point of T if and only if deg(v) > 1.
   b) Let G = (V, E) be a loop-free connected undirected graph with |E| ≥ 1. Prove that G has at least two vertices that are not articulation points.
5. If B_1, B_2, ..., B_k are the biconnected components of a loop-free connected undirected graph G, how is χ(G) related to χ(B_i), 1 ≤ i ≤ k? [Recall that χ(G) denotes the chromatic number of G, as defined in Section 11.6.]
6. Let G = (V, E) be a loop-free connected undirected graph with biconnected components B_1, B_2, ..., B_8. For 1 ≤ i ≤ 8, the number of distinct spanning trees for B_i is n_i. How many distinct spanning trees exist for G?
7. Let G = (V, E) be a loop-free connected undirected graph with |V| ≥ 3. If G has no articulation points, prove that G has no pendant vertices.
8. For the loop-free connected undirected graph G in Fig. 12.43(i), order the vertices alphabetically.
   a) Determine the depth-first spanning tree T for G with e as the root.
   b) Apply the algorithm developed in this section to the tree T in part (a) to find the articulation points and biconnected components of G.
9. Answer the questions posed in the previous exercise but this time order the vertices as h, g, f, e, d, c, b, a and let c be the root of T.

10. Let G = (V, E) be a loop-free connected undirected graph, where V = {a, b, c, ..., h, i, j}. Ordering the vertices alphabetically, the depth-first spanning tree T for G, with a as the root, is given in Fig. 12.45(i). In part (ii) of the figure the ordered pair next to each vertex v provides (low′(v), low(v)). Determine the articulation points and the spanning trees for the biconnected components of G.

Figure 12.45

11. In step (2) of the algorithm for articulation points, is it really necessary to compute low(x_1) and low(x_2)?
12. Let G = (V, E) be a loop-free connected undirected graph with v ∈ V. Here \overline{G} denotes the complement of G.
   a) Prove that \overline{G - v} = \overline{G} - v.
   b) If v is an articulation point of G, prove that v cannot be an articulation point of \overline{G}.
13. If G = (V, E) is a loop-free undirected graph, we call G color-critical if χ(G - v) < χ(G) for all v ∈ V. (We examined such graphs earlier, in Exercise 19 of Section 11.6.) Prove that a color-critical graph has no articulation points.
14. Does the result in Lemma 12.4 remain true if T = (V, E′) is a breadth-first spanning tree for G = (V, E)?

12.6 Summary and Historical Review
                                    The structure now called a tree first appeared in 1847 in the work of Gustav Kirchhoff
                                    (1824-1887) on electrical networks. The concept also appeared at this time in Geometrie
                                    der Lage, by Karl von Staudt (1798-1867). In 1857 trees were rediscovered by Arthur
                                    Cayley (1821-1895), who was unaware of these earlier developments. The first to call the
                                    structure a “tree,” Cayley used it in applications dealing with chemical isomers. He also
                                    investigated the enumeration of certain classes of trees. In his first work on trees, Cayley
                                    enumerated unlabeled rooted trees. This was then followed by the enumeration of unlabeled
                                    ordered trees. Two of Cayley’s contemporaries who also studied trees were Carl Borchardt
                                    (1817-1880) and Marie Ennemond Jordan (1838-1922).

Arthur Cayley (1821-1895)

The formula n^(n-2) for the number of labeled trees on n vertices (Exercise 21 at the end of
Section 12.1) was discovered in 1860 by Carl Borchardt. Cayley later gave an independent
development of the formula, in 1889. Since then, there have been other derivations. These
are surveyed in the book by J. W. Moon [10].
   The paper by G. Polya [11] is a pioneering work on the enumeration of trees and other
combinatorial structures. Polya's theory of enumeration, which we shall see in Chapter 16,
was developed in this work. For more on the enumeration of trees, the reader should see
Chapter 15 of F. Harary [5]. The article by D. R. Shier [12] provides a labyrinth of several
different techniques for calculating the number of spanning trees for K_{2,n}.
    The high-speed digital computer has proved to be a constant impetus for the discovery of
new applications of trees. The first application of these structures was in the manipulation of
algebraic formulae. This dates back to 1951 in the work of Grace Murray Hopper. Since then,
computer applications of trees have been widely investigated. In the beginning, particular
results appeared only in the documentation of specific algorithms. The first general survey
of the applications of trees was made in 1961 by Kenneth Iverson as part of a broader
survey on data structures. Such ideas as preorder and postorder can be traced to the early
1960s, as evidenced in the work of Zdzislaw Pawlak, Lyle Johnson, and Kenneth Iverson.
At this time Kenneth Iverson also introduced the name and the notation, namely ⌈x⌉, for
the ceiling of a real number x. Additional material on these orders and the procedures for
their implementation on a computer can be found in Chapter 3 of the text by A. V. Aho,
J. E. Hopcroft, and J. D. Ullman [1]. In the article by J. E. Atkins, J. S. Dierckman, and
K. O'Bryant [2], the notion of preorder is used to develop an optimal route for snow removal.

Rear Admiral Grace Murray Hopper (1906-1992) salutes as she and Navy Secretary
                             John Lehman leave the U.S.S. Constitution.
                                        AP/World Wide Photos

If G = (V, E) is a loop-free undirected graph, then the depth-first search and the
breadth-first search (given in Section 12.2) provide ways to determine whether the given graph is
connected. The algorithms developed for these searching procedures are also important in
developing other algorithms. For example, the depth-first search arises in the algorithm
for finding the articulation points and biconnected components of a loop-free connected
undirected graph. If |V| = n and |E| = e, then it can be shown that both the depth-first
search and the breadth-first search have time-complexity O(max{n, e}). For most graphs
e > n, so the algorithms are generally considered to have time-complexity O(e). These
                         ideas are developed in great detail in Chapter 7 of S. Baase and A. Van Gelder [3], where
                         the coverage also includes an analysis of the time-complexity function for the algorithm (of
                         Section 12.5) that determines articulation points (and biconnected components). Chapter 6
                         of the text by A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] also deals with the depth-first
                         search, whereas Chapter 7 covers the breadth-first search and the algorithm for articulation
                         points.
                              More on the properties and computer applications of trees is given in Section 3 of Chapter
                         2 in the work by D. E. Knuth [7]. Sorting techniques and their use of trees can be further
                         studied in Chapter 11 of A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] and in Chapter 7
                         of T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein [4]. An extensive investigation
                         will warrant the coverage found in the text by D. E. Knuth [8].
                            The technique in Section 12.4 for designing prefix codes is based on methods developed
                         by D. A. Huffman [6].

David A. Huffman
                                        University of Florida, Department of Computer and Information Science and Engineering

Finally, Chapter 7 of C. L. Liu [9] deals with trees, cycles, cut-sets, and the vector spaces
                         associated with these ideas. The reader with a background in linear or abstract algebra
                         should find this material of interest.

REFERENCES
 1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms. Reading, Mass.: Addison-Wesley, 1983.
 2. Atkins, Joel E., Dierckman, Jeffrey S., and O'Bryant, Kevin. "A Real Snow Job." The UMAP Journal, Fall no. 3 (1990): pp. 231-239.
 3. Baase, Sara, and Van Gelder, Allen. Computer Algorithms: Introduction to Design and Analysis, 3rd ed. Reading, Mass.: Addison-Wesley, 2000.
 4. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction to Algorithms, 2nd ed. Boston, Mass.: McGraw-Hill, 2001.
 5. Harary, Frank. Graph Theory. Reading, Mass.: Addison-Wesley, 1969.
 6. Huffman, David A. "A Method for the Construction of Minimum Redundancy Codes." Proceedings of the IRE 40 (1952): pp. 1098-1101.
 7. Knuth, Donald E. The Art of Computer Programming, Vol. 1, 2nd ed. Reading, Mass.: Addison-Wesley, 1973.
 8. Knuth, Donald E. The Art of Computer Programming, Vol. 3. Reading, Mass.: Addison-Wesley, 1973.
 9. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
10. Moon, John Wesley. Counting Labelled Trees. Canadian Mathematical Congress, Montreal, Canada, 1970.
11. Polya, George. "Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und Chemische Verbindungen." Acta Mathematica 68 (1937): pp. 145-234.
12. Shier, Douglas R. "Spanning Trees: Let Me Count the Ways." Mathematics Magazine 73 (2000): pp. 376-381.

SUPPLEMENTARY EXERCISES

1. Let G = (V, E) be a loop-free undirected graph with |V| = n. Prove that G is a tree if and only if P(G, λ) = λ(λ - 1)^(n-1).
2. A telephone communication system is set up at a company where 125 executives are employed. The system is initialized by the president, who calls her four vice presidents. Each vice president then calls four other executives, some of whom in turn call four others, and so on. (Each executive who does make a call will actually make four calls.)
   a) How many calls are made in reaching all 125 executives?
   b) How many executives, aside from the president, are required to make calls?
3. Let T be a complete binary tree with the vertices of T ordered by a preorder traversal. This traversal assigns the label 1 to all internal vertices of T and the label 0 to each leaf. The sequence of 0's and 1's that results from the preorder traversal of T is called the tree's characteristic sequence.
   a) Find the characteristic sequence for the complete binary tree shown in Fig. 12.17.
   b) Determine the complete binary trees for the characteristic sequences
      i) 1011001010100 and
      ii) 1011110000101011000.
   c) What are the last two symbols in the characteristic sequence for all complete binary trees? Why?
4. For k ∈ Z^+, let n = 2^k, and consider the list L: a_1, a_2, a_3, ..., a_n. To sort L in ascending order, first compare the entries a_i and a_{i+(n/2)}, for each 1 ≤ i ≤ n/2. For the resulting 2^(k-1) ordered pairs, merge sort the ith and (i + (n/4))-th ordered pairs, for each 1 ≤ i ≤ n/4. Now do a merge sort on the ith and (i + (n/8))-th ordered quadruples, for each 1 ≤ i ≤ n/8. Continue the process until the elements of L are in ascending order.
   a) Apply this sorting procedure to the list
      L: 11, 3, 4, 6, -5, 7, 35, -2, 1, 23, 9, 15, 18, 2, -10, 5.
   b) If n = 2^k, how many comparisons at most does this procedure require?
5. Let G = (V, E) be a loop-free undirected graph. If deg(v) ≥ 2 for all v ∈ V, prove that G contains a cycle.
6. Let T = (V, E) be a rooted tree with root r. Define the relation ℛ on V by x ℛ y, for x, y ∈ V, if x = y or if x is on the path from r to y. Prove that ℛ is a partial order.
7. Let T = (V, E) be a tree with V = {v_1, v_2, ..., v_n}, for n ≥ 2. Prove that the number of pendant vertices in T is equal to
      2 + Σ_{deg(v_i) ≥ 3} (deg(v_i) - 2).
8. Let G = (V, E) be a loop-free undirected graph. Define the relation ℛ on E as follows: If e_1, e_2 ∈ E, then e_1 ℛ e_2 if e_1 = e_2 or if e_1 and e_2 are edges of a cycle C in G.
   a) Verify that ℛ is an equivalence relation on E.
   b) Describe the partition of E induced by ℛ.

Figure 12.46

9. If G = (V, E) is a loop-free connected undirected graph and a, b ∈ V, then we define the distance from a to b (or from b to a), denoted d(a, b), as the length of a shortest path (in G) connecting a and b. (This is the number of edges in a shortest path connecting a and b and is 0 when a = b.)
   For any loop-free connected undirected graph G = (V, E), the square of G, denoted G^2, is the graph with vertex set V (the same as G) and edge set defined as follows: For distinct a, b ∈ V, {a, b} is an edge in G^2 if d(a, b) ≤ 2 (in G). In parts (a) and (b) of Fig. 12.46, we have a graph G and its square.
   a) Find the square of the graph in part (c) of the figure.
   b) Find G^2 if G is the graph K_{1,3}.
   c) If G is the graph K_{1,n}, for n ≥ 4, how many edges are added to G in order to construct G^2?
   d) For any loop-free connected undirected graph G, prove that G^2 has no articulation points.
10. a) Let T = (V, E) be a complete 6-ary tree of height 8. If T is balanced, but not full, determine the minimum and maximum values for |V|.
   b) Answer part (a) if T = (V, E) is a complete m-ary tree of height h.
11. The rooted Fibonacci trees T_n, n ≥ 1, are defined recursively as follows:
   1) T_1 is the rooted tree consisting of only the root;
   2) T_2 is the same as T_1; it too is a rooted tree that consists of a single vertex; and
   3) For n ≥ 3, T_n is the rooted binary tree with T_{n-1} as its left subtree and T_{n-2} as its right subtree.
   The first six rooted Fibonacci trees are shown in Fig. 12.47.
   a) For n ≥ 1, let ℓ_n count the number of leaves in T_n. Find and solve a recurrence relation for ℓ_n.
   b) Let i_n count the number of internal vertices for the tree T_n, where n ≥ 1. Find and solve a recurrence relation for i_n.
   c) Determine a formula for v_n, the total number of vertices in T_n, where n ≥ 1.
12. a) The graph in part (a) of Fig. 12.48 has exactly one spanning tree, namely, the graph itself. The graph in Fig. 12.48(b) has four nonidentical, though isomorphic, spanning trees. In part (c) of the figure we find three of the nonidentical spanning trees for the graph in part (d). Note that T_1 and T_2 are isomorphic, but T_3 is not isomorphic to T_2 (or T_1). How many nonidentical spanning trees exist for the graph in Fig. 12.48(d)?
   b) In Fig. 12.48(e) we generalize the graphs in parts (a), (b), and (d) of the figure. For each n ∈ Z^+, the graph G_n is K_{2,n}. If t_n counts the number of nonidentical spanning trees for G_n, find and solve a recurrence relation for t_n.
13. Let G = (V, E) be the undirected connected "ladder graph" shown in Fig. 12.49. For n ≥ 0, let a_n count the number of spanning trees of G, whereas b_n counts the number of these spanning trees that contain the edge {x_1, y_1}.
   a) Explain why a_n = a_{n-1} + b_n.
   b) Find an equation that expresses b_n in terms of a_{n-1} and b_{n-1}.

T>           Ty

Figure 12.47

Figure 12.48 [parts (a), (b), (c), (d), and (e); part (c) shows the spanning trees $T_1$, $T_2$, $T_3$]

      c) Use the results in parts (a) and (b) to set up and solve a
      recurrence relation for $a_n$.

Figure 12.49 [vertex labels $x_1, x_2, x_3, \ldots$ and $y_1, y_2, y_3, \ldots, y_{n-1}, y_n$]

14. Let $T = (V, E)$ be a tree where $|V| = v$ and $|E| = e$. The
tree $T$ is called graceful if it is possible to assign the labels
$\{1, 2, 3, \ldots, v\}$ to the vertices of $T$ in such a manner that the
induced edge labeling, where each edge $\{i, j\}$ is assigned the
label $|i - j|$, for $i, j \in \{1, 2, 3, \ldots, v\}$, $i \ne j$, results in the
$e$ edges being labeled by $1, 2, 3, \ldots, e$.
      a) Prove that every path on $n$ vertices, $n \ge 2$, is graceful.
      b) For $n \in \mathbf{Z}^+$, $n \ge 2$, show that $K_{1,n}$ is graceful.
      c) If $T = (V, E)$ is a tree with $4 \le |V| \le 6$, show that $T$ is
      graceful. (It has been conjectured that every tree is graceful.)
15. For an undirected graph $G = (V, E)$ a subset $I$ of $V$ is
called independent when no two vertices in $I$ are adjacent. If,
in addition, $I \cup \{x\}$ is not independent for each $x \in V - I$, then
we say that $I$ is a maximal independent set (of vertices).
    The two graphs in Fig. 12.50 are examples of special kinds
of trees called caterpillars. In general, a tree $T = (V, E)$ is a
caterpillar when there is a (maximal) path $p$ such that, for all
$v \in V$, either $v$ is on the path $p$ or $v$ is adjacent to a vertex on
the path $p$. This path $p$ is called the spine of the caterpillar.
      a) How many maximal independent sets of vertices are
      there for each of the caterpillars in parts (i) and (ii) of
      Fig. 12.50?
      b) For $n \in \mathbf{Z}^+$, with $n \ge 3$, let $a_n$ count the number of maximal
      independent sets in a caterpillar $T$ whose spine contains
      $n$ vertices. Find and solve a recurrence relation for $a_n$. [The
      reader may wish to reexamine part (a) of Supplementary
      Exercise 21 in Chapter 11.]

Figure 12.50 [(i) spine $v_1, v_2, v_3, v_4$; (ii) spine $w_1, w_2, w_3, w_4, w_5$]

16. In part (i) of Fig. 12.51 we find a graceful labeling of the
caterpillar shown in part (i) of Fig. 12.50. Find a graceful labeling
for the caterpillars in part (ii) of Figs. 12.50 and 12.51.

Figure 12.51 [parts (i) and (ii)]

17. Develop an algorithm to gracefully label the vertices of a
caterpillar with at least two edges.
18. Consider the caterpillar in part (i) of Fig. 12.50. If we label
each edge of the spine with a 1 and each of the other edges
with a 0, the caterpillar can be represented by a binary string.
Here that binary string is 10001001 where the first 1 is for the
first (left-most) edge of the spine, the next three 0’s are for the
(nonspine) edges at $v_2$, the second 1 is for edge $\{v_2, v_3\}$, the two
0’s are for the (nonspine) leaves at $v_3$, and the final 1 accounts
for the third (right-most) edge of the spine.
     We also note that the reversal of the binary string
10001001, namely 10010001, corresponds with a second
caterpillar that is isomorphic to the one in part (i) of Fig. 12.50.
      a) Find the binary strings for each of the caterpillars in
      part (ii) of Figs. 12.50 and 12.51.
      b) Can a caterpillar have a binary string of all 1’s?
      c) Can the binary string for a caterpillar have only two 1’s?
      d) Draw all the nonisomorphic caterpillars on five vertices.
      For each caterpillar determine its binary string. How many
      of these binary strings are palindromes?
      e) Answer the question posed in part (d) upon replacing
      “five” by “six.”
      f) For $n \ge 3$, prove that the number of nonisomorphic
      caterpillars on $n$ vertices is
      $(1/2)(2^{n-3} + 2^{\lceil (n-3)/2 \rceil}) = 2^{n-4} + 2^{\lfloor (n-4)/2 \rfloor} = 2^{n-4} + 2^{\lfloor n/2 \rfloor - 2}$.
      (This was first established in 1973 by F. Harary and A. J. Schwenk.)
19. For $n \ge 0$, we want to count the number of ordered rooted
trees on $n + 1$ vertices. The five trees in Fig. 12.52(a) cover the
case for $n = 3$.
[Note: Although the two trees in Fig. 12.52(b) are distinct as
binary rooted trees, as ordered rooted trees they are considered
the same tree and each is accounted for by the fourth tree in
Fig. 12.52(a).]
      a) Performing a postorder traversal of each tree in
      Fig. 12.52(a), we traverse each edge twice: once going
      down and once coming back up. When we traverse an
      edge going down, we shall write “1” and when we traverse
      one coming back up, we shall write “−1.” Hence the postorder
      traversal for the first tree in Fig. 12.52(a) generates
      the list 1, 1, 1, −1, −1, −1. The list 1, 1, −1, −1, 1, −1
      arises for the second tree in part (a) of the figure. Find the
      corresponding lists for the other three trees in Fig. 12.52(a).
      b) Determine the ordered rooted trees on five vertices that
      generate the lists: (i) 1, −1, 1, 1, −1, 1, −1, −1; (ii) 1, 1,
      −1, −1, 1, 1, −1, −1; and (iii) 1, −1, 1, −1, 1, 1, −1, −1.
      How many such trees are there on five vertices?
      c) For $n \ge 0$, how many ordered rooted trees are there for
      $n + 1$ vertices?

Figure 12.52 [parts (a) and (b)]

20. For $n \ge 1$, let $t_n$ count the number of spanning trees for the
fan on $n + 1$ vertices. The fan for $n = 4$ is shown in Fig. 12.53.
      a) Show that $t_{n+1} = t_n + \sum_{i=0}^{n} t_i$, where $n \ge 1$ and $t_0 = 1$.
      b) For $n \ge 2$, show that $t_{n+1} = 3t_n - t_{n-1}$.
      c) Solve the recurrence relation in part (b) and show that
      for $n \ge 1$, $t_n = F_{2n}$, the $2n$th Fibonacci number.

Figure 12.53

21. a) Consider the subgraph of $G$ (in Fig. 12.54) induced by
      the vertices $a$, $b$, $c$, $d$. This graph is called a kite. How many
      nonidentical (though some may be isomorphic) spanning
      trees are there for this kite?

      b) How many nonidentical (though some may be isomorphic)
      spanning trees of $G$ do not contain the edge $\{c, h\}$?
      c) How many nonidentical (though some may be isomorphic)
      spanning trees of $G$ contain all four of the edges $\{c, h\}$,
      $\{g, k\}$, {, p}, and $\{d, o\}$?
      d) How many nonidentical (though some may be isomorphic)
      spanning trees exist for $G$?
      e) We generalize the graph $G$ as follows. For $n \ge 2$, start
      with a cycle on the $2n$ vertices $v_1, v_2, \ldots, v_{2n-1}, v_{2n}$.
      Replace each of the $n$ edges $\{v_1, v_2\}, \{v_3, v_4\}, \ldots,
      \{v_{2n-1}, v_{2n}\}$ with a (labeled) kite so that the resulting graph
      is 3-regular. (The case for $n = 4$ appears in Fig. 12.54.)
      How many nonidentical (though some may be isomorphic)
      spanning trees are there for this graph?

Figure 12.54 [the graph $G$]
               13
    Optimization
    and Matching

Using the structures of trees and graphs, the final chapter for this part of the text
introduces techniques that arise in the area of mathematics called operations research.
These optimization techniques can be applied to graphs and multigraphs that have a
positive real number (in Sections 13.1 and 13.2) or a nonnegative integer (in Section 13.3),
called a weight, associated with each edge of the graph or multigraph. These numbers relate
information such as the distance between the vertices that are the endpoints of the edge, or
perhaps the amount of material that can be shipped from one vertex to another along an edge
that represents a highway or air route. With the graphs providing the framework, the
optimization methods are developed in an algorithmic manner to facilitate their implementation
on a computer. Among the problems we analyze are the determinations of:

   1) The shortest distance between a designated vertex $v_0$ and each of the other vertices
      in a loop-free connected directed graph.
   2) A spanning tree for a given graph or multigraph, where the sum of the weights of the
      edges in the tree is minimal.
   3) The maximum amount of material that can be transported from a starting point (the
      source) to a terminating point (the sink), where the weight of an edge indicates its
      capacity for handling the material being transported.

13.1
Dijkstra’s Shortest-Path Algorithm
We start with a loop-free connected directed graph $G = (V, E)$. Now to each edge $e = (a, b)$
of this graph, we assign a positive real number called the weight of $e$. This is denoted by
$\mathrm{wt}(e)$, or $\mathrm{wt}(a, b)$. If $x, y \in V$ but $(x, y) \notin E$, we define $\mathrm{wt}(x, y) = \infty$.
   For each $e = (a, b) \in E$, $\mathrm{wt}(e)$ may represent (1) the length of a road from $a$ to $b$, (2) the
time it takes to travel on this road from $a$ to $b$, or (3) the cost of traveling from $a$ to $b$ on
this road.
   Whenever such a graph $G = (V, E)$ is given with the weight assignments described
here, the graph is referred to as a weighted graph.

EXAMPLE 13.1
In Fig. 13.1 the weighted graph $G = (V, E)$ represents travel routes between certain pairs
of cities. Here the weight of each edge $(x, y)$ indicates the approximate flying time for a
direct flight from city $x$ to city $y$.


Figure 13.1

In this directed graph there are situations where $\mathrm{wt}(x, y) \ne \mathrm{wt}(y, x)$ for certain edges
$(x, y)$ and $(y, x)$ in $G$. For example, $\mathrm{wt}(c, f) = 6 \ne 7 = \mathrm{wt}(f, c)$. Perhaps this is due to
tailwinds. As a plane flies from $c$ to $f$, the plane may be assisted by tailwinds that, in turn,
slow it down when it is flying in the opposite direction (from $f$ to $c$).
   We see that $c, g \in V$ but $(c, g), (g, c) \notin E$, so $\mathrm{wt}(g, c) = \mathrm{wt}(c, g) = \infty$. This is also
true for other pairs of vertices. On the other hand, for certain pairs of vertices such as $a$, $f$,
we have $\mathrm{wt}(a, f) = \infty$ whereas $\mathrm{wt}(f, a) = 11$, a finite number.
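
One way to hold such a weight function on a computer is sketched below in Python. The dictionary `WEIGHTS` and the helper `wt` are illustrative names, not part of the text, and only the three finite weights quoted in this example are listed; Fig. 13.1 contains further edges that the sketch does not attempt to reproduce.

```python
import math

# Finite weights quoted in Example 13.1 only (an incomplete, assumed
# subset of the edges drawn in Fig. 13.1).
WEIGHTS = {
    ("c", "f"): 6,
    ("f", "c"): 7,
    ("f", "a"): 11,
}

def wt(x, y, weights=WEIGHTS):
    """Weight of the directed edge (x, y); infinity when (x, y) is not an edge."""
    return weights.get((x, y), math.inf)

print(wt("c", "f"), wt("f", "c"))   # 6 7   (direction matters)
print(wt("a", "f"), wt("c", "g"))   # inf inf (no such edges)
```

Storing only the edges that exist, and returning infinity for every other ordered pair, mirrors the convention $\mathrm{wt}(x, y) = \infty$ for $(x, y) \notin E$.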

Our objective in this section has two parts. Given a weighted graph $G = (V, E)$, for
each $e = (x, y) \in E$, we shall interpret $\mathrm{wt}(e)$ as the length of a direct route (whether by
automobile, plane, or boat) from $x$ to $y$. For $a, b \in V$, suppose that $v_1, v_2, \ldots, v_n \in V$
and that the edges $(a, v_1), (v_1, v_2), \ldots, (v_n, b)$ provide a directed path (in $G$) from $a$
to $b$. The length of this path is defined as $\mathrm{wt}(a, v_1) + \mathrm{wt}(v_1, v_2) + \cdots + \mathrm{wt}(v_n, b)$. We
write $d(a, b)$ for the (shortest) distance from $a$ to $b$, that is, the length of a shortest
directed path (in $G$) from $a$ to $b$. If no such path exists (in $G$) from $a$ to $b$, then we define
$d(a, b) = \infty$. And for all $a \in V$, $d(a, a) = 0$. Consequently, we have the distance function
$d\colon V \times V \to \mathbf{R}^+ \cup \{0, \infty\}$.
   Now fix $v_0 \in V$. Then for all $v \in V$, we shall determine
   1) $d(v_0, v)$; and
   2) a directed path from $v_0$ to $v$ [of length $d(v_0, v)$] if $d(v_0, v)$ is finite.
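
The definition of the length of a directed path can be checked with a few lines of Python. This is only a sketch: the function name `path_length` is hypothetical, and the two weights used are the ones quoted in Example 13.1.

```python
import math

def path_length(vertices, weights):
    """Sum the edge weights along the directed path given as a vertex sequence.

    A pair of consecutive vertices that is not an edge contributes infinity,
    so a sequence that is not really a path has infinite length.
    """
    total = 0
    for x, y in zip(vertices, vertices[1:]):
        total += weights.get((x, y), math.inf)
    return total

weights = {("c", "f"): 6, ("f", "a"): 11}   # quoted in Example 13.1
print(path_length(["c", "f", "a"], weights))   # 17
print(path_length(["a", "f"], weights))        # inf, since wt(a, f) is infinite
print(path_length(["c"], weights))             # 0, the trivial path
```

Note that `path_length` measures one particular path; $d(a, b)$ is the minimum of this quantity over all directed paths from $a$ to $b$.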

To accomplish these objectives, we shall introduce a version of the algorithm that was
developed by Edsger Wybe Dijkstra (1930-2002) in 1959. This procedure is an example of
a greedy algorithm, for what we do to obtain the best result locally (for vertices “close” to
$v_0$) turns out to be the best result globally (for all vertices of the graph).
   Before we state the algorithm, we wish to examine some properties of the distance
function $d$. These properties will help us understand why the algorithm works.
   With $v_0 \in V$ fixed (as it was earlier), let $S \subset V$ with $v_0 \in S$, and $\bar{S} = V - S$. Then we
define the distance from $v_0$ to $\bar{S}$ by

$$d(v_0, \bar{S}) = \min_{v \in \bar{S}} \{d(v_0, v)\}.$$

When $d(v_0, \bar{S}) < \infty$, then $d(v_0, \bar{S})$ is the length of a shortest directed path from $v_0$ to a vertex
in $\bar{S}$. In this case there will exist at least one vertex $v_{m+1}$ in $\bar{S}$ with $d(v_0, \bar{S}) = d(v_0, v_{m+1})$.
Here $P$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{m-1}, v_m), (v_m, v_{m+1})$ is a shortest directed path (in $G$)
from $v_0$ to $v_{m+1}$. So, at this point, we claim that

   1) $v_0, v_1, v_2, \ldots, v_m \in S$; and
   2) $P'$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{k-1}, v_k)$ is a shortest directed path (in $G$) from $v_0$ to $v_k$,
      for each $1 \le k \le m$.

(The proofs for these two results are requested in the first exercise at the end of this section.)
   From these observations it follows that

$$d(v_0, \bar{S}) = \min \{d(v_0, u) + \mathrm{wt}(u, w)\},$$

where the minimum is evaluated over all $u \in S$, $w \in \bar{S}$. If a minimum occurs for $u = x$ and
$w = y$, then

$$d(v_0, y) = d(v_0, x) + \mathrm{wt}(x, y)$$

is the (shortest) distance from $v_0$ to $y$.
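
The minimum over pairs $u \in S$, $w \in \bar{S}$ can be computed directly once the distances to the vertices of $S$ are known. The Python sketch below assumes a hypothetical helper `closest_outside` and uses edge weights that can be read off from the calculations in Example 13.2; it is not a full listing of Fig. 13.1.

```python
import math

def closest_outside(dist, weights):
    """Return (value, u, w) minimizing dist[u] + wt(u, w) with u in S, w outside S.

    `dist` maps each vertex u of S to d(v0, u); any vertex that is not a key
    of `dist` is treated as a member of the complement of S.
    """
    best = (math.inf, None, None)
    for (u, w), weight in weights.items():
        if u in dist and w not in dist:
            candidate = dist[u] + weight
            if candidate < best[0]:
                best = (candidate, u, w)
    return best

# Assumed subset of the weights of Fig. 13.1 (taken from Example 13.2).
weights = {("c", "f"): 6, ("c", "h"): 11, ("f", "c"): 7,
           ("f", "a"): 11, ("f", "g"): 9, ("f", "h"): 4}
# S = {c, f} with d(c, c) = 0 and d(c, f) = 6.
print(closest_outside({"c": 0, "f": 6}, weights))   # (10, 'f', 'h')
```

Here the minimum 10 occurs for $u = f$ and $w = h$, so $d(c, h) = d(c, f) + \mathrm{wt}(f, h) = 6 + 4$.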
   The formula for $d(v_0, \bar{S})$ is the cornerstone of the algorithm. We start with the set
$S_0 = \{v_0\}$ and then determine

$$d(v_0, \bar{S}_0) = \min_{u \in S_0,\ w \in \bar{S}_0} \{d(v_0, u) + \mathrm{wt}(u, w)\}.$$

This gives us $d(v_0, \bar{S}_0) = \min_{w \in \bar{S}_0} \{\mathrm{wt}(v_0, w)\}$ since $S_0 = \{v_0\}$ and $d(v_0, v_0) = 0$. If $v_1 \in \bar{S}_0$
and $d(v_0, \bar{S}_0) = \mathrm{wt}(v_0, v_1)$, then we enlarge $S_0$ to $S_1 = S_0 \cup \{v_1\}$ and determine

$$d(v_0, \bar{S}_1) = \min_{u \in S_1,\ w \in \bar{S}_1} \{d(v_0, u) + \mathrm{wt}(u, w)\}.$$

This leads us to a vertex $v_2$ in $\bar{S}_1$ with $d(v_0, \bar{S}_1) = d(v_0, v_2)$. Continuing the process, if
$S_i = \{v_0, v_1, v_2, \ldots, v_i\}$ has been determined and $v_{i+1} \in \bar{S}_i$ with $d(v_0, v_{i+1}) = d(v_0, \bar{S}_i)$,
then we enlarge $S_i$ to $S_{i+1} = S_i \cup \{v_{i+1}\}$. We stop when we reach $\bar{S}_{n-1} = \emptyset$ (where $n = |V|$)
or when $d(v_0, \bar{S}_i) = \infty$ for some $0 \le i \le n - 2$.
   Throughout this process, various labels will be placed on each vertex $v \in V$. The final
set of labels appearing on the vertices will have the form $(L(v), u)$, where $L(v) = d(v_0, v)$,
the distance from $v_0$ to $v$, and $u$ is the vertex (if one exists) that precedes $v$ along a shortest
path from $v_0$ to $v$. That is, $(u, v)$ is the last edge in a directed path from $v_0$ to $v$, and this
path determines $d(v_0, v)$. At first we label $v_0$ with $(0, -)$ and all of the other vertices $v$
with the label $(\infty, -)$. As we apply the algorithm, the label on each $v \ne v_0$ will change
(sometimes more than once) from $(\infty, -)$ to the final label $(L(v), u) = (d(v_0, v), u)$, unless
$d(v_0, v) = \infty$.
   Now that these preliminaries are behind us, it is time to formally state the algorithm.
   Let $G = (V, E)$ be a weighted graph, with $|V| = n$. To find the shortest distance from a
fixed vertex $v_0$ to all other vertices in $G$, as well as a shortest directed path for each of these
vertices, we apply the following algorithm.

Dijkstra’s Shortest-Path Algorithm
   Step 1: Set the counter $i = 0$ and $S_0 = \{v_0\}$. Label $v_0$ with $(0, -)$ and each $v \ne v_0$
   with $(\infty, -)$.
      If $n = 1$, then $V = \{v_0\}$ and the problem is solved.
      If $n > 1$, continue to step (2).
   Step 2: For each $v \in \bar{S}_i$ replace, when possible, the label on $v$ by the new label
   $(L(v), y)$ where

   $$L(v) = \min_{u \in S_i} \{L(v),\ L(u) + \mathrm{wt}(u, v)\},$$

   and $y$ is a vertex in $S_i$ that produces the minimum $L(v)$. [When a replacement does
   take place, it is due to the fact that we can go from $v_0$ to $v$ and travel a shorter distance
   by going along a path that includes the edge $(y, v)$.]
   Step 3: If every vertex in $\bar{S}_i$ (for some $0 \le i \le n - 2$) has the label $(\infty, -)$, then the
   labeled graph contains the information we are seeking.
      If not, then there is at least one vertex $v \in \bar{S}_i$ that is not labeled by $(\infty, -)$, and
   we perform the following tasks:
      1) Select a vertex $v_{i+1}$ where $L(v_{i+1})$ is a minimum (for all such $v$).
         There may be more than one such vertex, in which case we are free to
         choose among the possible candidates. The vertex $v_{i+1}$ is an element
         of $\bar{S}_i$ that is closest to $v_0$.
      2) Assign $S_i \cup \{v_{i+1}\}$ to $S_{i+1}$.
      3) Increase the counter $i$ by 1.
         If $i = n - 1$, the labeled graph contains the information we want.
         If $i < n - 1$, return to step (2).
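
The three steps can be followed almost line for line in Python. The sketch below is an illustration under stated assumptions, not the text's own program: the names `dijkstra_labels` and `EXAMPLE_WEIGHTS` are hypothetical, `None` stands for the dash in a label, and the edge list contains only the weights that can be recovered from the calculations in this section (Fig. 13.1 may show additional edges).

```python
import math

def dijkstra_labels(vertices, weights, v0):
    """Return {v: (L(v), predecessor)} by following steps (1)-(3) as stated."""
    def wt(x, y):
        return weights.get((x, y), math.inf)

    # Step 1: S_0 = {v0}; v0 is labeled (0, -), every other vertex (inf, -).
    labels = {v: (math.inf, None) for v in vertices}
    labels[v0] = (0, None)
    in_s = [v0]                       # S_i, in the order vertices were added
    while len(in_s) < len(vertices):
        outside = [v for v in vertices if v not in in_s]   # complement of S_i
        # Step 2: try to shorten L(v) through every u already in S_i.
        for v in outside:
            for u in in_s:
                candidate = labels[u][0] + wt(u, v)
                if candidate < labels[v][0]:
                    labels[v] = (candidate, u)
        # Step 3: pick a vertex outside S_i with minimal label ...
        nearest = min(outside, key=lambda v: labels[v][0])
        if labels[nearest][0] == math.inf:
            break                     # ... unless every outside label is (inf, -)
        in_s.append(nearest)          # S_{i+1} = S_i together with v_{i+1}
    return labels

# Assumed subset of the weights in Fig. 13.1, read off from Example 13.2.
EXAMPLE_WEIGHTS = {
    ("c", "f"): 6, ("c", "h"): 11, ("f", "c"): 7, ("f", "a"): 11,
    ("f", "g"): 9, ("f", "h"): 4, ("h", "a"): 11, ("h", "g"): 4,
    ("a", "b"): 5,
}
result = dijkstra_labels(["a", "b", "c", "f", "g", "h"], EXAMPLE_WEIGHTS, "c")
print(result["b"], result["g"])       # (22, 'a') (14, 'h')
```

Because step (2) re-examines every $u \in S_i$ for every $v \in \bar{S}_i$, this direct transcription repeats many additions from one iteration to the next.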

We now apply this algorithm in the following example.

EXAMPLE 13.2
Apply Dijkstra’s algorithm to the weighted graph $G = (V, E)$ shown in Fig. 13.1 in order
to find the shortest distance from vertex $c$ $(= v_0)$ to each of the other five vertices in $G$.
   Initialization:     ($i = 0$). Set $S_0 = \{c\}$. Label $c$ with $(0, -)$ and all other vertices
                       in $G$ with $(\infty, -)$.
   First Iteration:    ($\bar{S}_0 = \{a, b, f, g, h\}$). Here $i = 0$ in step (2) and we find, for
                       example, that

                       $$L(a) = \min\{L(a),\ L(c) + \mathrm{wt}(c, a)\} = \min\{\infty,\ 0 + \infty\} = \infty,$$

                       whereas

                       $$L(f) = \min\{L(f),\ L(c) + \mathrm{wt}(c, f)\} = \min\{\infty,\ 0 + 6\} = 6.$$

                       Similar calculations yield $L(b) = L(g) = \infty$ and $L(h) = 11$. So
                       we label the vertex $f$ with $(6, c)$ and the vertex $h$ with $(11, c)$. The
                       other vertices in $\bar{S}_0$ remain labeled by $(\infty, -)$. [See Fig. 13.2(a).]
                       In step (3) we see that $f$ is the vertex $v_1$ in $\bar{S}_0$ closest to $v_0$, so we
                       assign to $S_1$ the set $S_0 \cup \{f\} = \{c, f\}$ and increase the counter $i$
                       to 1. Since $i = 1 < 5$ $(= 6 - 1)$, we return to step (2).
   Second Iteration:   ($\bar{S}_1 = \{a, b, g, h\}$). Now $i = 1$ in step (2), and for each $v \in \bar{S}_1$
                       we set

                       $$L(v) = \min_{u \in S_1} \{L(v),\ L(u) + \mathrm{wt}(u, v)\}.$$

Figure 13.2

                       This yields

                       $$L(a) = \min\{L(a),\ L(c) + \mathrm{wt}(c, a),\ L(f) + \mathrm{wt}(f, a)\} = \min\{\infty,\ 0 + \infty,\ 6 + 11\} = 17,$$

                       so vertex $a$ is labeled $(17, f)$. In a similar manner, we find

                       $$L(b) = \min\{\infty,\ 0 + \infty,\ 6 + \infty\} = \infty,$$
                       $$L(g) = \min\{\infty,\ 0 + \infty,\ 6 + 9\} = 15,$$
                       $$L(h) = \min\{11,\ 0 + 11,\ 6 + 4\} = 10.$$

                       [These results provide the labeling in Fig. 13.2(b).] In step (3) we
                       find that the vertex $v_2$ is $h$ because $h \in \bar{S}_1$ and $L(h)$ is a minimum.
                       Then $S_2$ is assigned $S_1 \cup \{h\} = \{c, f, h\}$, the counter is increased
                       to 2, and since $2 < 5$, the algorithm directs us back to step (2).
   Third Iteration:    ($\bar{S}_2 = \{a, b, g\}$). With $i = 2$ in step (2) the following are now
                       computed:

                       $$L(a) = \min_{u \in S_2} \{L(a),\ L(u) + \mathrm{wt}(u, a)\} = \min\{17,\ 0 + \infty,\ 6 + 11,\ 10 + 11\} = 17$$

                       (so the label on $a$ is not changed);

                       $$L(b) = \min\{\infty,\ 0 + \infty,\ 6 + \infty,\ 10 + \infty\} = \infty$$

                       (so the label on $b$ remains $\infty$); and

                       $$L(g) = \min\{15,\ 0 + \infty,\ 6 + 9,\ 10 + 4\} = 14 < 15,$$

                       so the label on $g$ is changed to $(14, h)$ because $14 = L(h) +
                       \mathrm{wt}(h, g)$. Among the vertices in $\bar{S}_2$, $g$ is the closest to $v_0$ since
                       $L(g)$ is a minimum. In step (3), vertex $v_3$ is defined as $g$ and
                       $S_3 = S_2 \cup \{g\} = \{c, f, h, g\}$. Then the counter $i$ is increased to
                       $3 < 5$, and we return to step (2).
   Fourth Iteration:   ($\bar{S}_3 = \{a, b\}$). With $i = 3$, the following are determined in step
                       (2): $L(a) = 17$; $L(b) = \infty$. (Thus no labels are changed during
                       this iteration.) We set $v_4 = a$ and $S_4 = S_3 \cup \{a\} = \{c, f, h, g, a\}$
                       in step (3). Then the counter $i$ is increased to 4 $(< 5)$, and we
                       return to step (2).
   Fifth Iteration:    ($\bar{S}_4 = \{b\}$). Here $i = 4$ in step (2), and we find $L(b) = L(a) +
                       \mathrm{wt}(a, b) = 17 + 5 = 22$. Now the label on $b$ is changed to $(22, a)$.
                       Then $v_5 = b$ in step (3), $S_5$ is set to $\{c, f, h, g, a, b\}$, and $i$ is
                       incremented to 5. But now that $i = 5 = |V| - 1$, the process terminates.
                       We reach the labeled graph shown in Fig. 13.3.

Figure 13.3

From the labels in Fig. 13.3 we have the following shortest distances from $c$ to the other
five vertices in $G$:
   1) $d(c, f) = L(f) = 6$.
   2) $d(c, h) = L(h) = 10$.
   3) $d(c, g) = L(g) = 14$.
   4) $d(c, a) = L(a) = 17$.
   5) $d(c, b) = L(b) = 22$.
                           To determine, for example, a shortest directed path from c to b, we start at vertex b,
                       which is labeled (22, a). Hence a is the predecessor of b on this shortest path. The label on
                       a is (17, f), so f precedes a on the path. Finally, the label on f is (6, c), so we are back
                       at vertex c, and the shortest directed path from c to b determined by the algorithm is given
                       by the edges (c, f), (f, a), and (a, b).
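
Tracing predecessors in this way is easy to mechanize. In the Python sketch below the function name `shortest_path` is hypothetical, the labels are the final ones reported in this example, and `None` plays the role of the dash in the label on $c$.

```python
def shortest_path(labels, target):
    """Follow predecessor entries back from `target`; return the vertex sequence.

    For a vertex still labeled (inf, None) the result is just [target],
    so callers should check the distance before trusting the path.
    """
    path = [target]
    while labels[path[-1]][1] is not None:
        path.append(labels[path[-1]][1])
    path.reverse()
    return path

# Final labels of Example 13.2, written as (L(v), predecessor).
labels = {"c": (0, None), "f": (6, "c"), "h": (10, "f"),
          "g": (14, "h"), "a": (17, "f"), "b": (22, "a")}
print(shortest_path(labels, "b"))   # ['c', 'f', 'a', 'b']
print(shortest_path(labels, "g"))   # ['c', 'f', 'h', 'g']
```

Consecutive vertices in the returned list give the edges of the path, here $(c, f)$, $(f, a)$, and $(a, b)$ for the target $b$.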

Now that we have demonstrated one application of this algorithm, our next concern is
the order of its worst-case time-complexity function $f(n)$, where $n = |V|$ in the weighted
graph $G = (V, E)$. We shall estimate the worst-case complexity in terms of the number
of additions and comparisons that are made in steps (2) and (3) during execution of the
algorithm.
   Following the initialization process in step (1), there are at most $n - 1$ iterations because
each iteration determines the next closest vertex to $v_0$ and $n - 1 = |V - \{v_0\}|$.
   If $0 \le i \le n - 2$, then in step (2) for that iteration [the $(i + 1)$st], we find that the
following takes place for each $v \in \bar{S}_i$:

   1) When $0 \le i \le n - 2$, we perform at most $n - 1$ additions to calculate

      $$L(v) = \min_{u \in S_i} \{L(v),\ L(u) + \mathrm{wt}(u, v)\},$$

      one addition for each $u \in S_i$.
   2) We compare the present value of $L(v)$ with each of the (possibly infinite) numbers
      $L(u) + \mathrm{wt}(u, v)$, one for each $u \in S_i$, where $|S_i| \le n - 1$, in order to determine
      the updated value of $L(v)$. This requires at most $n - 1$ comparisons.

   Therefore, before we get to step (3) we have performed at most $2(n - 1)$ steps for each
$v \in \bar{S}_i$, a total of at most $2(n - 1)^2$ steps for all $v \in \bar{S}_i$.
   Continuing to step (3), we now must select the minimum from among at most
$n - 1$ numbers $L(v)$, where $v \in \bar{S}_i$. This requires $n - 2$ additional comparisons in
the worst case.
   Consequently, each iteration needs no more than $2(n - 1)^2 + (n - 2)$ steps in all.
It is possible to have as many as $n - 1$ iterations, so it follows that

$$f(n) \le (n - 1)[2(n - 1)^2 + (n - 2)] \in O(n^3).$$
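
To see why the bound is cubic, it may help to expand it; the constant 3 below is just one convenient choice, not a value taken from the text.

```latex
\begin{align*}
f(n) &\le (n-1)\bigl[2(n-1)^2 + (n-2)\bigr] \\
     &= 2(n-1)^3 + (n-1)(n-2) \\
     &\le 2n^3 + n^2 \;\le\; 3n^3 \qquad (n \ge 1).
\end{align*}
```

For the six-vertex graph of Example 13.2 the bound evaluates to $5[2 \cdot 25 + 4] = 270$ steps.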
    We shall close this section with some observations that can be used to improve the worst-
case time-complexity of this algorithm. First we should observe that for 0 <i <n — 2, the
(i + 1)st iteration of our present algorithm generated the (i + 1)st closest vertex to vo. This
was the vertex v;4;. In our example we found v; = f, v2 = h, v3 = g, vg = a, and vs = b.
    Second, note how much duplication we had when computing L(v). This is seen quite
readily in the second and third iterations of Example 13.2. We should like to cut back on
such unnecessary calculations, so let us try a slightly different approach to our shortest-
path problem. Once again we start with a weighted graph G = (V, EF) with |V| =” and
vo € V. We shall now let v; denote the ith closest vertex to ug, where 0 <i <n —1, §; =
{vp, U1,..-, U;}, and S; = V — S;. At the start we assign to each v € V the number Lo(v)
as follows:
                        Lo(vo) = 0        because d{vg, U9) =O           and
                         Lo(v) =o,        forv F vo.

Then fori > O and v € Si,    we define

Lizi(v) = min{L;(v), Li (v;) + wt(v;, v)},
where v, is a vertex for which L;(v;) is minimal: a vertex that is ith closest to vp. We find
that

Li4i(v) = min {d (vo, vj) + wt(v;, v)}.

Now let us see what happens at each of the (at most) $n-1$ iterations when we employ
the definition of $L_{i+1}(v)$ that uses the vertex $v_i$.
    For each $v \in \overline{S}_i$ we need only one addition [namely,
$L_i(v_i) + \mathrm{wt}(v_i, v)$] and one comparison [between $L_i(v)$ and
$L_i(v_i) + \mathrm{wt}(v_i, v)$] in order to compute $L_{i+1}(v)$. Since there are at most
$n-1$ vertices in $\overline{S}_i$, this necessitates at most $2(n-1)$ steps to obtain
$L_{i+1}(v)$ for all $v \in \overline{S}_i$. Finding the minimum of
$\{L_{i+1}(v) \mid v \in \overline{S}_i\}$ requires at most $n-2$ comparisons, so at each
iteration we can obtain $v_{i+1}$ (a vertex $v \in \overline{S}_i$ where $L_{i+1}(v)$ is a
minimum) in at most $2(n-1) + (n-2) = 3n-4$ steps. We perform at most $n-1$ iterations, so
we find, for this version of Dijkstra’s algorithm, that the worst-case time-complexity is
$O(n^2)$.
    In order to find a shortest path from $v_0$ to each $v \in V$, $v \ne v_0$, we see that
whenever $L_{i+1}(v) < L_i(v)$, for any $0 \le i \le n-2$, we need to keep track of the vertex
$y \in S_i$ for which $L_{i+1}(v) = d(v_0, y) + \mathrm{wt}(y, v)$.
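
The revised labeling scheme can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the text's own program: the function names `dijkstra_quadratic` and `path_to`, the dictionary-of-dictionaries graph format, and the small sample graph are all hypothetical. The dictionary `L` plays the role of $L_i(v)$, and `pred[v]` records the vertex $y$ that last lowered the label of $v$.

```python
import math

def dijkstra_quadratic(weights, source):
    """weights[u][v] is wt(u, v); every vertex must appear as a key (assumed format)."""
    L = {v: math.inf for v in weights}      # L[v] plays the role of L_i(v)
    L[source] = 0
    pred = {v: None for v in weights}       # vertex y that last lowered L[v]
    unsettled = set(weights)                # vertices not yet chosen as some v_i
    while unsettled:
        # Choose v_i: an unsettled vertex whose label is minimal.
        v_i = min(unsettled, key=lambda v: L[v])
        if L[v_i] == math.inf:
            break                           # the remaining vertices are unreachable
        unsettled.remove(v_i)
        # One addition and one comparison for each remaining vertex.
        for v in unsettled:
            candidate = L[v_i] + weights[v_i].get(v, math.inf)
            if candidate < L[v]:
                L[v] = candidate
                pred[v] = v_i
    return L, pred

def path_to(pred, target):
    """Follow the recorded predecessors back to the source."""
    path = []
    while target is not None:
        path.append(target)
        target = pred[target]
    return path[::-1]

# Hypothetical sample graph (not one of the text's figures).
sample = {"s": {"a": 4, "b": 1}, "a": {"t": 1}, "b": {"a": 2, "t": 6}, "t": {}}
L, pred = dijkstra_quadratic(sample, "s")
```

Each pass of the `while` loop does a linear scan to select the next vertex and a linear scan to update labels, which mirrors the count of at most $3n-4$ steps per iteration.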
    Other implementations of Dijkstra’s algorithm use a data structure called a heap. For a
weighted graph $G = (V, E)$, where $|V| = n$ and $|E| = m$, we find, for example, that the
binary heap implementation of this algorithm has worst-case time-complexity $O(m \log_2 n)$.
(This, and much more, is discussed on pp. 108-122 of the text by R. K. Ahuja, T. L. Magnanti,
and J. B. Orlin [2]. The reader can also find more about various kinds of heaps on pp. 773-787
of this text. Another source for the implementation and running-time of Dijkstra’s algorithm
is Section 24.3 (pp. 595-601) of the text by T. H. Cormen, C. E. Leiserson, R. L. Rivest, and
C. Stein [7].)
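
As a rough illustration of the heap idea, the sketch below uses Python's standard `heapq` module, which provides a binary heap. Note the assumptions: `heapq` has no decrease-key operation, so this variant pushes a fresh entry whenever a label improves and skips stale entries when they are popped, which is a common substitute and not necessarily the implementation analyzed in the cited references. The function name `dijkstra_heap`, the adjacency-list format, and the sample graph are hypothetical.

```python
import heapq
import math

def dijkstra_heap(adj, source):
    """adj[u] is a list of (v, weight) pairs; every vertex must appear as a key."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0
    heap = [(0, source)]                    # entries are (tentative distance, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                        # stale entry: u already has a smaller label
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Hypothetical sample graph (not one of the text's figures).
sample_adj = {
    "s": [("a", 4), ("b", 1)],
    "a": [("t", 1)],
    "b": [("a", 2), ("t", 6)],
    "t": [],
}
distances = dijkstra_heap(sample_adj, "s")
```

At most one entry is pushed per successful edge relaxation, so the heap never holds more than $m + 1$ entries and each heap operation costs $O(\log_2 m)$, which is $O(\log_2 n)$ because $m \le n^2$.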

EXERCISES 13.1

1. Let $G = (V, E)$ be a weighted graph, where for each edge $e = (a, b)$ in $E$,
$\mathrm{wt}(a, b)$ equals the distance from $a$ to $b$ along edge $e$. If $(a, b) \notin E$,
then $\mathrm{wt}(a, b) = \infty$.
   Fix $v_0 \in V$ and let $S \subset V$, with $v_0 \in S$. Then for $\overline{S} = V - S$ we
define $d(v_0, \overline{S}) = \min_{v \in \overline{S}}\{d(v_0, v)\}$. If
$v_{m+1} \in \overline{S}$ and $d(v_0, \overline{S}) = d(v_0, v_{m+1})$, then
$P$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{m-1}, v_m), (v_m, v_{m+1})$ is a shortest directed
path (in $G$) from $v_0$ to $v_{m+1}$. Prove that
   a) $v_0, v_1, v_2, \ldots, v_{m-1}, v_m \in S$.
   b) $P'$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{k-1}, v_k)$ is a shortest directed path
   (in $G$) from $v_0$ to $v_k$, for each $1 \le k \le m$.

2. a) Apply Dijkstra’s algorithm to the weighted graph $G = (V, E)$ in Fig. 13.4, and
   determine the shortest distance from vertex $a$ to each of the other six vertices in $G$.
   Here $\mathrm{wt}(e) = \mathrm{wt}(x, y) = \mathrm{wt}(y, x)$ for each edge $e = \{x, y\}$
   in $E$.
   b) Determine a shortest path from vertex $a$ to each of the vertices $c$, $f$, and $i$.

Figure 13.4

3. a) Apply Dijkstra’s algorithm to the graph shown in Fig. 13.1 and determine the shortest
   distance from vertex $a$ to each of the other vertices in the graph.
   b) Find a shortest path from vertex $a$ to each of the vertices $f$, $g$, and $h$.

4. Use the ideas developed at the end of the section to confirm the result obtained in
(a) Example 13.2; and (b) part (a) of Exercise 2.

5. Prove or disprove the following for a weighted graph $G = (V, E)$, where
$V = \{v_0, v_1, v_2, \ldots, v_n\}$ and $e_1 \in E$ with $\mathrm{wt}(e_1) < \mathrm{wt}(e)$
for all $e \in E$, $e \ne e_1$. If Dijkstra’s algorithm is applied to $G$, and the shortest
distance $d(v_0, v_i)$ is computed for each vertex $v_i$, $1 \le i \le n$, then there exists a
vertex $v_j$, for some $1 \le j \le n$, where the edge $e_1$ is used in the shortest path from
$v_0$ to $v_j$.

13.2
             Minimal Spanning Trees:
        The Algorithms of Kruskal and Prim
A loosely coupled computer network is to be set up for a system of seven computers. The
graph G in Fig. 13.5 models the situation. The computers are represented by the vertices
in the graph; the edges represent transmission lines that are being considered for linking
certain pairs of computers. Associated with each edge e in G is a positive real number wt(e),
the weight of e. Here the weight of an edge indicates the projected cost for constructing that
particular transmission line. The objective is to link all the computers while minimizing
the total cost of construction. To do so requires a spanning tree T, where the sum of the
weights of the edges in T is minimal. The construction of such an optimal spanning tree can
be accomplished by using the algorithms that were developed by Joseph Bernard Kruskal
(1928- ) and Robert Clay Prim (1921- ).

Figure 13.5

    Like Dijkstra’s algorithm, these algorithms are greedy; when each is used, at each step
of the process an optimal (here minimal) choice is made from the remaining available data.
Once again, if what appears to be the best choice locally (for example, for a vertex c and
the vertices near c) turns out to be the best choice globally (for all vertices of the graph),
then the greedy algorithm will lead to an optimal solution.

We first consider Kruskal’s algorithm. This algorithm is given as follows.
    Let $G = (V, E)$ be a loop-free undirected connected graph, where $|V| = n$ and each
edge $e$ is assigned a positive real number wt($e$). To find an optimal (minimal) spanning tree
for $G$, apply the following algorithm.

Kruskal’s Algorithm
    Step 1: Set the counter $i = 1$ and select an edge $e_1$ in $G$, where wt($e_1$) is as
    small as possible.
    Step 2: For $1 \le i \le n-2$, if edges $e_1, e_2, \ldots, e_i$ have been selected, then
    select edge $e_{i+1}$ from the remaining edges in $G$ so that (a) wt($e_{i+1}$) is as small
    as possible and (b) the subgraph of $G$ determined by the edges
    $e_1, e_2, \ldots, e_i, e_{i+1}$ (and the vertices they are incident with) contains no
    cycles.
    Step 3: Replace $i$ by $i + 1$.
       If $i = n-1$, the subgraph of $G$ determined by edges $e_1, e_2, \ldots, e_{n-1}$ is
    connected with $n$ vertices and $n-1$ edges, and is an optimal spanning tree for $G$.
       If $i < n-1$, return to step (2).
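
The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a definitive implementation: the function name `kruskal`, the `(weight, u, v)` edge format with vertices numbered from 0, and the sample edge list are hypothetical, and the cycle test in condition (b) is carried out here with a simple union-find (disjoint-set) structure, which is a technique swapped in for illustration and is not prescribed by the algorithm statement.

```python
def kruskal(n_vertices, edges):
    """edges: list of (weight, u, v), with vertices numbered 0 .. n_vertices - 1."""
    parent = list(range(n_vertices))        # union-find forest (assumed helper)

    def find(x):
        # Return the representative of x's component, shortening paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):           # nondecreasing weight: condition (a)
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # different components: condition (b) holds
            parent[root_u] = root_v
            tree.append((u, v, w))
            if len(tree) == n_vertices - 1: # counter has reached n - 1
                break
    return tree

# Hypothetical sample graph on four vertices (not one of the text's figures).
sample_edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
spanning_tree = kruskal(4, sample_edges)
```

In the sample, the edge of weight 3 joins two vertices that are already connected, so it is rejected just as an edge that would close a cycle is rejected in step (2).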

Before establishing the validity of the algorithm, we consider the following example.

EXAMPLE 13.3        Apply Kruskal’s algorithm to the graph shown in Fig. 13.5.

    Initialization:     (i = 1). Since there is a unique edge, namely {e, g}, of smallest
                        weight 1, start with T = {{e, g}}. (T starts as a tree with one
                        edge, and after each iteration it grows into a larger tree or forest.
                        After the last iteration the subgraph T is an optimal spanning tree
                        for the given graph G.)
    First Iteration:    Among the remaining edges in G, three have the next smallest
                        weight 2. Select {d, f}, which satisfies the conditions in step (2).
                        Now T is the forest {{e, g}, {d, f}}, and i is increased to 2. With
                        i = 2 < 6, return to step (2).
    Second Iteration:   Two remaining edges have weight 2. Select {d, e}. Now T is the
                        tree {{e, g}, {d, f}, {d, e}}, and i increases to 3. But because
                        3 < 6, the algorithm directs us back to step (2).
    Third Iteration:    Among the edges of G that are not in T, edge {f, g} has minimal
                        weight 2. However, if this edge is added to T, the result
                        contains a cycle, which destroys the tree structure being sought.
                        Consequently, the edges {c, e}, {c, g}, and {d, g} are considered.
                        Edge {d, g} brings about a cycle, but either {c, e} or {c, g}
                        satisfies the conditions in step (2). Select {c, e}. T grows to
                        {{e, g}, {d, f}, {d, e}, {c, e}} and i is increased to 4. Returning
                        to step (2), we find that the fourth and fifth iterations provide the
                        following.
    Fourth Iteration:   T = {{e, g}, {d, f}, {d, e}, {c, e}, {b, e}}; i increases to 5.
    Fifth Iteration:    T = {{e, g}, {d, f}, {d, e}, {c, e}, {b, e}, {a, b}}. The counter i
                        now becomes 6 = (number of vertices in G) - 1. So T is an
                        optimal tree for graph G and has weight 1 + 2 + 2 + 3 + 4 + 5 = 17.

    Figure 13.6 shows this spanning tree of minimal weight.

Figure 13.6
    Example 13.3 demonstrates that Kruskal’s algorithm does generate a spanning tree.
This follows from parts (a) and (d) of Theorem 12.5 since the resulting subgraph has n
(= |V|) vertices and n - 1 edges and is connected. In general, if G = (V, E) is a loop-free
weighted connected undirected graph and T is the subgraph of G that is generated by
Kruskal’s algorithm, then T has no cycles. Furthermore, T is a spanning subgraph of G.
For if v ∈ V and v is not in T, then we can add an edge e of G to T where e is incident
with v, and the resulting subgraph of G still contains no cycles. Finally, T is connected.
Otherwise T has at least two components, say $T_1$ and $T_2$, and since G is connected we
could add to T an edge {x, y} from G where x is in $T_1$ and y is in $T_2$, and no cycle would
be present in this subgraph. Consequently, the subgraph T of G is a connected spanning
subgraph of G with no cycles (or loops), so T is a spanning tree of G.
    The algorithm is greedy; it selects from the remaining edges an edge of minimal weight
that doesn’t create a cycle. The following result guarantees that the spanning tree obtained
is optimal.

THEOREM 13.1        Let $G = (V, E)$ be a loop-free weighted connected undirected graph. Any
spanning tree for $G$ that is obtained by Kruskal’s algorithm is optimal.
Proof: Let $|V| = n$, and let $T$ be a spanning tree for $G$ obtained by Kruskal’s algorithm.
The edges in $T$ are labeled $e_1, e_2, \ldots, e_{n-1}$, according to the order in which they
are generated by the algorithm. For each optimal tree $T'$ of $G$, define $d(T') = k$ if $k$ is
the smallest positive integer such that $T$ and $T'$ both contain $e_1, e_2, \ldots, e_{k-1}$,
but $e_k \notin T'$.
    Let $T_1$ be an optimal tree for which $d(T_1) = r$ is maximal. If $r = n$, then
$T = T_1$ and the result follows. Otherwise, $r \le n-1$ and adding edge $e_r$ (of $T$) to
$T_1$ produces the cycle $C$, where there exists an edge $e'_r$ of $C$ that is in $T_1$ but
not in $T$.
    Start with tree $T_1$. Adding $e_r$ to $T_1$ and deleting $e'_r$, we obtain a connected
graph with $n$ vertices and $n-1$ edges. This graph is a spanning tree, $T_2$. The weights of
$T_1$ and $T_2$ satisfy $\mathrm{wt}(T_2) = \mathrm{wt}(T_1) + \mathrm{wt}(e_r) - \mathrm{wt}(e'_r)$.
    Following the selection of $e_1, e_2, \ldots, e_{r-1}$ in Kruskal’s algorithm, the edge
$e_r$ is chosen so that $\mathrm{wt}(e_r)$ is minimal and no cycle results when $e_r$ is added
to the subgraph $H$ of $G$ determined by $e_1, e_2, \ldots, e_{r-1}$. Since $e'_r$ produces no
cycle when added to the subgraph $H$, by the minimality of $\mathrm{wt}(e_r)$ it follows that
$\mathrm{wt}(e'_r) \ge \mathrm{wt}(e_r)$. Hence $\mathrm{wt}(e_r) - \mathrm{wt}(e'_r) \le 0$,
so $\mathrm{wt}(T_2) \le \mathrm{wt}(T_1)$. But with $T_1$ optimal, we must have
$\mathrm{wt}(T_2) = \mathrm{wt}(T_1)$, so $T_2$ is optimal.
    The tree $T_2$ is optimal and has the edges $e_1, e_2, \ldots, e_{r-1}, e_r$ in common
with $T$, so $d(T_2) \ge r + 1 > r = d(T_1)$, contradicting the choice of $T_1$. Consequently,
$T_1 = T$ and the tree $T$ produced by Kruskal’s algorithm is optimal.

We measure the worst-case time-complexity for Kruskal’s algorithm by making the following
observations. Given a loop-free weighted connected undirected graph $G = (V, E)$, where
$|V| = n$ and $|E| = m \ge 2$, we can use the merge sort of Section 12.3 to list (and relabel,
if necessary) the edges in $E$ as $e_1, e_2, \ldots, e_m$, where
$\mathrm{wt}(e_1) \le \mathrm{wt}(e_2) \le \cdots \le \mathrm{wt}(e_m)$. The number of
comparisons needed to do this is $O(m \log_2 m)$. Then once we have the edges of $G$ listed in
this order (of nondecreasing weights), step (2) of the algorithm is carried out at most $m-1$
times, once for each of the edges $e_2, e_3, \ldots, e_m$.
    For each edge $e_i$, $2 \le i \le m$, we must determine whether $e_i$ causes the formation
of a cycle in the tree, or forest, that we have developed (after considering the edges
$e_1, e_2, \ldots, e_{i-1}$). This can be done for each edge in a constant [that is, $O(1)$]
amount of time, if we use additional data structures, such as the component flag data
structure. Unfortunately, the updating of this data structure cannot be performed in a
constant amount of time. However, it does turn out that all of the work needed for cycle
detection can be carried out in at most $O(n \log_2 n)$ steps.†
    Consequently, we shall define the worst-case time-complexity function $f$, for $m \ge 2$,
as the sum of the following:
    1) The total number of comparisons needed to sort the edges of $G$ into nondecreasing
       order, and
    2) The total number of steps that are carried out in step (2) in order to detect the
       formation of a cycle.
    Unless $G$ is a tree, it follows that $|V| = n \le m = |E|$ because $G$ is connected. As a
result, $n \log_2 n \le m \log_2 m$ and $f \in O(m \log_2 m)$.
    A measure in terms of $n$, the number of vertices in $G$, can also be given. Here
$n - 1 \le m$ because the graph is connected, and $m \le \binom{n}{2} = (1/2)(n)(n-1)$, the
number of edges in $K_n$. Consequently $m \log_2 m \le n^2 \log_2 n^2 = 2n^2 \log_2 n$, and we
can express the worst-case time-complexity of Kruskal’s algorithm as $O(n^2 \log_2 n)$,
although this is less precise than $O(m \log_2 m)$.
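
One way to picture the cycle-detection bookkeeping is the following Python sketch of a component-label table. It is an assumption about how such a "component flag" structure might be organized, not a transcription of the cited references; the names `make_components`, `creates_cycle`, and `merge` are hypothetical. Each vertex stores the label of its current component, so the cycle test is a single comparison, and a merge relabels the smaller of the two components.

```python
def make_components(vertices):
    """Start with every vertex in its own component, labeled by itself."""
    label = {v: v for v in vertices}
    members = {v: [v] for v in vertices}
    return label, members

def creates_cycle(label, u, v):
    """Edge {u, v} closes a cycle exactly when u and v already share a label: O(1)."""
    return label[u] == label[v]

def merge(label, members, u, v):
    """Join the components of u and v by relabeling the smaller one."""
    a, b = label[u], label[v]
    if a == b:
        return
    if len(members[a]) < len(members[b]):
        a, b = b, a                         # make a the larger component
    for x in members[b]:
        label[x] = a
    members[a].extend(members.pop(b))

# Hypothetical usage on four vertices.
label, members = make_components("abcd")
merge(label, members, "a", "b")
merge(label, members, "c", "d")
separate_before = not creates_cycle(label, "a", "c")
merge(label, members, "b", "c")
joined_after = creates_cycle(label, "a", "d")
```

Whenever a vertex is relabeled, the component containing it at least doubles in size, so no vertex is relabeled more than $\log_2 n$ times. The total updating work is therefore at most $n \log_2 n$ relabelings, in line with the $O(n \log_2 n)$ bound quoted above.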

A second technique for constructing an optimal tree was developed by Robert Clay Prim.
In this greedy algorithm, the vertices in the graph are partitioned into two sets: processed
and not processed. At first only one vertex is in the set P of processed vertices, and all
other vertices are in the set N of vertices to be processed. Each iteration of the algorithm
increases the set P by one vertex while the size of set N decreases by one. The algorithm
is summarized as follows.
    Let G = (V, E) be a loop-free weighted connected undirected graph. To obtain an op-
timal tree T for G, apply the following procedure.

Prim’s Algorithm
    Step 1: Set the counter $i = 1$ and place an arbitrary vertex $v_1 \in V$ into set $P$.
    Define $N = V - \{v_1\}$ and $T = \emptyset$.
    Step 2: For $1 \le i \le n-1$, where $|V| = n$, let $P = \{v_1, v_2, \ldots, v_i\}$,
    $T = \{e_1, e_2, \ldots, e_{i-1}\}$, and $N = V - P$. Add to $T$ a shortest edge (an edge of
    minimal weight) in $G$ that connects a vertex $x$ in $P$ with a vertex $y$ ($= v_{i+1}$) in
    $N$. Place $y$ in $P$ and delete it from $N$.
    Step 3: Increase the counter by 1.
       If $i = n$, the subgraph of $G$ determined by the edges $e_1, e_2, \ldots, e_{n-1}$ is
    connected with $n$ vertices and $n-1$ edges and is an optimal tree for $G$.
       If $i < n$, return to step (2).

†For more on the analysis of the segment dealing with cycle detection, we refer the reader to
Chapter 8 of the text by S. Baase and A. Van Gelder [3] and to Chapter 4 of the text by
E. Horowitz and S. Sahni [17].
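
The steps of Prim’s algorithm can be sketched directly in Python, with sets named `P` and `N` as in the statement above. This is a plain illustrative sketch under stated assumptions: the function name `prim`, the symmetric dictionary-of-dictionaries graph format, and the sample graph are hypothetical, the graph is assumed to be connected, and no attempt is made to speed up the search for the shortest edge from `P` to `N`.

```python
def prim(weights, start):
    """weights[x][y] = weights[y][x] = wt({x, y}); the graph is assumed connected."""
    P = {start}                             # processed vertices
    N = set(weights) - P                    # vertices still to be processed
    T = []                                  # edges chosen so far
    while N:
        # Step 2: find a shortest edge {x, y} with x in P and y in N.
        best = None
        for x in P:
            for y, w in weights[x].items():
                if y in N and (best is None or w < best[0]):
                    best = (w, x, y)
        w, x, y = best
        T.append((x, y, w))
        P.add(y)                            # place y in P ...
        N.remove(y)                         # ... and delete it from N
    return T

# Hypothetical sample graph with distinct weights (not one of the text's figures).
sample = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 3, "b": 2, "d": 4},
    "d": {"b": 5, "c": 4},
}
tree = prim(sample, "a")
```

Because every vertex added to `P` is joined to a vertex already in `P`, the chosen edges form a tree after each pass of the loop.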

We use this algorithm to find an optimal tree for the graph in Fig. 13.5.

EXAMPLE 13.4        Prim’s algorithm generates an optimal tree as follows.

    Initialization:     i = 1; P = {a}; N = {b, c, d, e, f, g}; T = ∅.
    First Iteration:    T = {{a, b}}; P = {a, b}; N = {c, d, e, f, g}; i = 2.
    Second Iteration:   T = {{a, b}, {b, e}}; P = {a, b, e}; N = {c, d, f, g}; i = 3.
    Third Iteration:    T = {{a, b}, {b, e}, {e, g}}; P = {a, b, e, g};
                        N = {c, d, f}; i = 4.
    Fourth Iteration:   T = {{a, b}, {b, e}, {e, g}, {d, e}}; P = {a, b, e, g, d};
                        N = {c, f}; i = 5.
    Fifth Iteration:    T = {{a, b}, {b, e}, {e, g}, {d, e}, {f, g}}; P = {a, b, e, g, d, f};
                        N = {c}; i = 6.
    Sixth Iteration:    T = {{a, b}, {b, e}, {e, g}, {d, e}, {f, g}, {c, g}};
                        P = {a, b, e, g, d, f, c} = V; N = ∅; i = 7 = |V|. Hence T is
                        an optimal spanning tree of weight 17 for G, as seen in Fig. 13.7.

Figure 13.7

Note that the minimal spanning tree obtained here differs from that in Fig. 13.6. So this
                              type of spanning tree need not be unique.

We shall only state the following theorem,           which establishes the validity of Prim’s
                              algorithm. The proof is left for the reader.

THEOREM 13.2                  Let G = (V, E) be a loop-free weighted connected undirected graph. Any spanning tree
                              for G that is obtained by Prim’s algorithm is optimal.

Note that at each iteration Prim’s algorithm always grows a tree. Some iteration(s) of
                              Kruskal’s algorithm may grow a forest (which is not a tree). Also observe that Prim’s
                              algorithm can be started at any vertex in the graph.

We conclude this section with a few words and references about the worst-case
time-complexity for Prim’s algorithm. When the algorithm is applied to a loop-free weighted
connected undirected graph $G = (V, E)$, where $|V| = n$ and $|E| = m$, the typical
implementations require $O(n^2)$ steps. (This can be found in Chapter 7 of A. V. Aho,
J. E. Hopcroft, and J. D. Ullman [1]; in Chapter 8 of S. Baase and A. Van Gelder [3]; and in
Chapter 4 of E. Horowitz and S. Sahni [17].) Other implementations of the algorithm have
improved the situation so that it requires $O(m \log_2 n)$ steps. (This is discussed in the
articles by R. L. Graham and P. Hell [16]; by D. B. Johnson [18]; and by A. Kershenbaum and
R. Van Slyke [19].) The worst-case time-complexities for various heap implementations are
discussed in Section 13.5 of R. K. Ahuja, T. L. Magnanti, and J. B. Orlin [2] and Section 23.2
of T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein [7].

EXERCISES 13.2

1. Apply Kruskal’s and Prim’s algorithms to determine minimal spanning trees for the graph
shown in Fig. 13.8.

Figure 13.8

2. Let $G = W_4$, the wheel on four spokes. Assign the weights 1, 1, 2, 2, 3, 3, 4, 4 to the
edges of $G$ so that (a) $G$ has a unique minimal spanning tree; (b) $G$ has more than one
minimal spanning tree.

3. Let $G = (V, E)$ be a loop-free weighted connected undirected graph with $T = (V, E')$, a
minimal spanning tree for $G$. For $v, w \in V$, is the path from $v$ to $w$ in $T$ a path of
minimum weight in $G$?

4. Table 13.1 provides information on the distance (in miles) between pairs of cities in the
state of Indiana.
   A system of highways connecting these seven cities is to be constructed. Determine which
highways should be constructed so that the cost of construction is minimal. (Assume that the
cost of construction of a mile of highway is the same between every pair of cities.)

Table 13.1 (distances in miles)

                Bloomington   Evansville   Fort Wayne   Gary   Indianapolis   South Bend
Evansville          119           -            -          -         -             -
Fort Wayne          174          290           -          -         -             -
Gary                198          277          132         -         -             -
Indianapolis         51          168          121        153        -             -
South Bend          198          303           79         58       140            -
Terre Haute          58          113          201        164        71           196

5. a) Answer Exercise 4 under the additional requirement that the system includes a highway
   directly linking Evansville and Indianapolis.
   b) If there must be a direct link between Fort Wayne and Gary in addition to the one
   connecting Evansville and Indianapolis, find the minimum number of miles of highway that
   must be constructed.

6. Let $G = (V, E)$ be a loop-free weighted connected undirected graph. For $n \in Z^+$, let
$\{e_1, e_2, \ldots, e_n\}$ be a set of edges (from $E$) that includes no cycle in $G$. Modify
Kruskal’s algorithm in order to obtain a spanning tree of $G$ that is minimal among all the
spanning trees of $G$ that include the edges $e_1, e_2, \ldots, e_n$.

7. a) Modify Kruskal’s algorithm to determine an optimal tree of maximal weight.
   b) Interpret the information of Exercise 4 in terms of the number of calls that can be
   placed between pairs of cities via the adoption of certain new telephone transmission
   lines. (Cities that are not directly linked must communicate through one or more
   intermediate cities.) How can the seven cities be minimally connected and allow a maximum
   number of calls to be placed?

8. Prove Theorem 13.2.

9. Let $G = (V, E)$ be a loop-free weighted connected undirected graph, where for each pair of
distinct edges $e_1, e_2 \in E$, $\mathrm{wt}(e_1) \ne \mathrm{wt}(e_2)$. Prove that $G$ has
only one minimal spanning tree.

13.3
               Transport Networks:
       The Max-Flow Min-Cut Theorem
                             This section provides an application for weighted directed graphs to the flow of a commodity
                             from a source to a prescribed destination. Such commodities may be gallons of oil that flow
                             through pipelines or numbers of telephone calls transmitted in a communication system.
                             In modeling such situations, we interpret the weight of an edge in the directed graph as a
                             capacity that places an upper limit on, for example, the amount of oil that can flow through
                             a certain part of a system of pipelines. These ideas are expressed formally in the following
                             definition.

Definition 13.1        Let N = (V, E) be a loop-free connected directed graph. Then N is called a network, or
                             transport network, if the following conditions are satisfied:
                                  a) There exists a unique vertex a € V with id(a), the in degree of a, equal to 0. This
                                     vertex a is called the source.
                                  b) There is a unique vertex z € V, called the sink, where od(z), the out degree of z,
                                     equals 0.
                                  c) The graph N is weighted, so there is a function from E to the set of nonnegative integers
                                     that assigns to each edge e = (v, w) € E a capacity, denoted by c(e) = c(v, w).

EXAMPLE 13.5           The graph in Fig. 13.9 is a transport network. Here vertex a is the source, the sink is at
                              vertex z, and capacities are shown beside each edge. Since c(a, b) + c(a, g) = 5 + 7 =
                              12, the amount of the commodity being transported from a to z cannot exceed 12. With
                              c(d, z) + c(h, z) = 5 + 6 = 11, the amount is further restricted to be no greater than 11.
                              To determine the maximum amount that can be transported from a to z, we must consider
                              the capacities of all edges in the network.

Figure 13.9

The following definition is introduced to assist us in solving this problem.

Definition 13.2         If $N = (V, E)$ is a transport network, a function $f$ from $E$ to the nonnegative integers is
                              called a flow for $N$ if

                                  a) $f(e) \le c(e)$ for each edge $e \in E$; and
                                  b) for each $v \in V$, other than the source $a$ or the sink $z$,
                                     $\sum_{w \in V} f(w, v) = \sum_{w \in V} f(v, w)$. (If there is no edge $(v, w)$,
                                     then $f(v, w) = 0$.)

The first property specifies that the amount of material transported along a given edge
                  cannot exceed the capacity of that edge. Property (b) enforces a conservation condition:
                  The amount of material flowing into a vertex v must equal the amount that flows out from
                  this vertex. This is so for all vertices except the source and the sink.

EXAMPLE 13.6      For the networks in Fig. 13.10, the label x, y on each edge e is determined so that x = c(e)
                  and y is the value assigned for a possible flow f. The label on each edge e satisfies f(e) ≤
                  c(e). In part (a) of the figure, the “flow” into vertex g is 5, but the “flow” out from that
                  vertex is 2 + 2 = 4. Hence the function f is not a flow in this case. The function f for part
                  (b) does satisfy both properties, so it is a flow for the given network.
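
The two conditions of Definition 13.2 can be checked mechanically, as in the Python sketch below. The function name `is_flow` and the dictionary format are hypothetical. The edge data are an assumption as well: capacities and flow values are read from the edge labels of Fig. 13.10, and the direction of each edge (in particular, the edge between b and g being directed from g to b) is inferred from the flow totals quoted in the example.

```python
def is_flow(capacity, f, source, sink):
    """Check conditions (a) and (b) of Definition 13.2 for edge dictionaries."""
    # (a) 0 <= f(e) <= c(e) on every edge.
    if any(not (0 <= f[e] <= capacity[e]) for e in capacity):
        return False
    # (b) conservation at every vertex other than the source and the sink.
    vertices = {v for e in capacity for v in e}
    for v in vertices - {source, sink}:
        inflow = sum(f[(u, w)] for (u, w) in capacity if w == v)
        outflow = sum(f[(u, w)] for (u, w) in capacity if u == v)
        if inflow != outflow:
            return False
    return True

# Edge data as read from Fig. 13.10 (edge directions are an assumption).
capacity = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("g", "b"): 5, ("b", "h"): 6,
            ("h", "d"): 2, ("g", "h"): 5, ("d", "z"): 5, ("h", "z"): 6}
f_part_a = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 1, ("g", "b"): 2, ("b", "h"): 4,
            ("h", "d"): 1, ("g", "h"): 2, ("d", "z"): 2, ("h", "z"): 5}
f_part_b = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 2, ("g", "b"): 2, ("b", "h"): 3,
            ("h", "d"): 2, ("g", "h"): 3, ("d", "z"): 4, ("h", "z"): 4}

part_a_ok = is_flow(capacity, f_part_a, "a", "z")
part_b_ok = is_flow(capacity, f_part_b, "a", "z")
```

With these values the check fails for part (a) at vertex g, where 5 units enter and only 4 leave, and succeeds for part (b).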

Figure 13.10 [parts (a) and (b)]

Definition 13.3   Let f be a flow for a transport network N = (V, E).

a) An edge e of the network is called saturated if f(e) = c(e). When f(e) < c(e), the
                       edge is called unsaturated.
                    b) Ifa is the source of N, then val(f) = Sev                   f (a, v) is called the value of the flow.

EXAMPLE 13.7
For the network in Fig. 13.10(b), only the edge (h, d) is saturated. All other edges are
unsaturated. The value of the flow in this network is

$$\mathrm{val}(f) = \sum_{v \in V} f(a, v) = f(a, b) + f(a, g) = 3 + 5 = 8.$$

But is there another flow $f_1$ such that $\mathrm{val}(f_1) > 8$? The determination of a maximal flow
(a flow that achieves the greatest possible value) is the objective of the remainder of this
section. To accomplish this, we observe that in the network of Fig. 13.10(b),

$$\sum_{v \in V} f(a, v) = 3 + 5 = 8 = 4 + 4 = f(d, z) + f(h, z) = \sum_{v \in V} f(v, z).$$

Consequently, the total flow leaving the source a equals the total flow into the sink z.
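The quantities in Example 13.7 can be recomputed with a few lines of Python. As a hedge: the capacities and flows below are read off the labels of Fig. 13.10(b) and the numbers quoted in the example, so the edge list is an assumption, and the variable names are illustrative.

```python
# Fig. 13.10(b), reconstructed from the figure labels (an assumption).
CAP = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("g", "b"): 5, ("b", "h"): 6,
       ("h", "d"): 2, ("g", "h"): 5, ("d", "z"): 5, ("h", "z"): 6}
F_B = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 2, ("g", "b"): 2, ("b", "h"): 3,
       ("h", "d"): 2, ("g", "h"): 3, ("d", "z"): 4, ("h", "z"): 4}

# Definition 13.3(a): an edge is saturated when f(e) = c(e).
saturated = [e for e in CAP if F_B[e] == CAP[e]]

# Definition 13.3(b): val(f) is the total flow on the edges leaving the source a.
out_of_a = sum(x for (u, _), x in F_B.items() if u == "a")
into_z = sum(x for (_, v), x in F_B.items() if v == "z")

print(saturated, out_of_a, into_z)   # [('h', 'd')] 8 8
```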

The last remark in Example 13.7 seems like a reasonable circumstance, but will it occur
                  in general? To prove the result for every network, we need the following special type of
                  cut-set.

Definition 13.4   If N = (V, E) is a transport network and C is a cut-set for the undirected graph associated
                  with N, then C is called a cut, or an a-z cut, if the removal of the edges in C from the
                  network results in the separation of a and z.

EXAMPLE 13.8
Each of the dotted curves in Fig. 13.11 indicates a cut for the given network. The cut $C_1$
consists of the undirected edges {a, g}, {b, d}, {b, g}, and {b, h}. This cut partitions the
vertices of the network into the two sets $P = \{a, b\}$ and its complement $\overline{P} = \{d, g, h, z\}$,
so $C_1$ is denoted as $(P, \overline{P})$. The capacity of a cut, denoted $c(P, \overline{P})$, is defined by

$$c(P, \overline{P}) = \sum_{v \in P,\; w \in \overline{P}} c(v, w),$$

the sum of the capacities of all edges (v, w), where $v \in P$ and $w \in \overline{P}$. In this example,
$c(P, \overline{P}) = c(a, g) + c(b, d) + c(b, h) = 7 + 4 + 6 = 17$. [Considering the directed edges
(from $P$ to $\overline{P}$) in the cut $C_1 = (P, \overline{P})$, namely (a, g), (b, d), (b, h), we find that the
removal of these edges does not result in a subgraph with two components. However, the
removal of these three edges eliminates all possible directed paths from a to z, and no proper
subset of {(a, g), (b, d), (b, h)} has this separating property.]

Figure 13.11 [network diagram with the cuts $C_1$ and $C_2$ drawn as dotted curves]

The cut $C_2$ induces the vertex partition $Q = \{a, b, g\}$, $\overline{Q} = \{d, h, z\}$ and has capacity
$c(Q, \overline{Q}) = c(b, d) + c(b, h) + c(g, h) = 4 + 6 + 5 = 15$.
   A third cut of interest is the one that induces the vertex partition $S = \{a, b, d, g, h\}$,
$\overline{S} = \{z\}$. (What are the edges in this cut?) Its capacity is 11.
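The three capacities in Example 13.8 can be reproduced with a short Python sketch. The capacity table is an assumption: it is reconstructed from the edge capacities quoted in the example and from the labels of Fig. 13.10, which shows the same network. The function name `cut_capacity` is illustrative.

```python
# Edge capacities for the network of Fig. 13.11 (reconstructed; an assumption).
CAP = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("g", "b"): 5, ("b", "h"): 6,
       ("h", "d"): 2, ("g", "h"): 5, ("d", "z"): 5, ("h", "z"): 6}


def cut_capacity(cap, P):
    """Sum c(v, w) over the edges directed from P into its complement."""
    return sum(c for (v, w), c in cap.items() if v in P and w not in P)


print(cut_capacity(CAP, {"a", "b"}),                  # 17
      cut_capacity(CAP, {"a", "b", "g"}),             # 15
      cut_capacity(CAP, {"a", "b", "d", "g", "h"}))   # 11
```

Only edges directed from P to its complement are counted, which is why the edge joining b and g does not contribute to the first capacity.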

Using the idea of the capacity of a cut, this next result provides an upper bound for the
                             value of a flow in a network.

THEOREM 13.3   Let f be a flow in a network N = (V, E). If $C = (P, \overline{P})$ is any cut in N, then val(f) cannot
exceed $c(P, \overline{P})$.
Proof: Let vertex a be the source in N and vertex z the sink. Since id(a) = 0, it follows that
for all $w \in V$, f(w, a) = 0. Consequently,

$$\mathrm{val}(f) = \sum_{v \in V} f(a, v) = \sum_{v \in V} f(a, v) - \sum_{w \in V} f(w, a).$$

By property (b) in the definition of a flow, for all $x \in P$, $x \ne a$,
$\sum_{v \in V} f(x, v) - \sum_{w \in V} f(w, x) = 0$.

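Adding the results in the above equations yields

$$\begin{aligned}
\mathrm{val}(f) &= \Bigl[\sum_{v \in V} f(a, v) - \sum_{w \in V} f(w, a)\Bigr] + \sum_{x \in P,\, x \ne a} \Bigl[\sum_{v \in V} f(x, v) - \sum_{w \in V} f(w, x)\Bigr] \\
&= \sum_{x \in P,\, v \in V} f(x, v) - \sum_{x \in P,\, w \in V} f(w, x) \\
&= \Bigl[\sum_{x \in P,\, v \in P} f(x, v) + \sum_{x \in P,\, v \in \overline{P}} f(x, v)\Bigr] - \Bigl[\sum_{x \in P,\, w \in P} f(w, x) + \sum_{x \in P,\, w \in \overline{P}} f(w, x)\Bigr].
\end{aligned}$$

Since

$$\sum_{x \in P,\, v \in P} f(x, v) \quad\text{and}\quad \sum_{x \in P,\, w \in P} f(w, x)$$

are summed over the same set of all ordered pairs in $P \times P$, these summations are equal.
Consequently,

$$\mathrm{val}(f) = \sum_{x \in P,\, v \in \overline{P}} f(x, v) - \sum_{x \in P,\, w \in \overline{P}} f(w, x).$$

For all $x, w \in V$, $f(w, x) \ge 0$, so

$$\sum_{x \in P,\, w \in \overline{P}} f(w, x) \ge 0 \quad\text{and}\quad \mathrm{val}(f) \le \sum_{x \in P,\, v \in \overline{P}} f(x, v) \le \sum_{x \in P,\, v \in \overline{P}} c(x, v) = c(P, \overline{P}).$$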

From Theorem 13.3 we find that in a network N, the value for any flow is less than or
equal to the capacity of any cut in that network. Hence the value of the maximum flow cannot
exceed the minimum capacity over all cuts in a network. For the network in Fig. 13.11, it
can be shown that the cut consisting of edges (d, z) and (h, z) has minimum capacity 11.
Consequently, the maximum flow f for the network satisfies $\mathrm{val}(f) \le 11$. It will turn out
that the value of the maximum flow is 11. How to construct such a flow and why its value
equals the minimum capacity among all cuts will be dealt with in this section.
   However, before we deal with this construction, let us note that in the proof of Theo-
rem 13.3, the value of a flow is given by

$$\mathrm{val}(f) = \sum_{x \in P,\, v \in \overline{P}} f(x, v) - \sum_{x \in P,\, w \in \overline{P}} f(w, x),$$

where $(P, \overline{P})$ is any cut in N. Therefore, once a flow is constructed in a network, then for
any cut $(P, \overline{P})$ in the network, the value of the flow equals the sum of the flows in the
edges directed from the vertices in $P$ to those in $\overline{P}$ minus the sum of the flows in the edges
directed from the vertices in $\overline{P}$ to those in $P$.
   This observation leads to the following result.

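COROLLARY 13.1   If f is a flow in a transport network N = (V, E), then the value of the flow from the source
a is equal to the value of the flow into the sink z.
Proof: Let $P = \{a\}$, $\overline{P} = V - \{a\}$, and $Q = V - \{z\}$, $\overline{Q} = \{z\}$. From the above observation,

$$\sum_{x \in P,\, v \in \overline{P}} f(x, v) - \sum_{x \in P,\, w \in \overline{P}} f(w, x) = \mathrm{val}(f) = \sum_{y \in Q,\, v \in \overline{Q}} f(y, v) - \sum_{y \in Q,\, w \in \overline{Q}} f(w, y).$$

With $P = \{a\}$ and id(a) = 0, we find that $\sum_{x \in P,\, w \in \overline{P}} f(w, x) = \sum_{w \in \overline{P}} f(w, a) = 0$. Similarly,
for $\overline{Q} = \{z\}$ and od(z) = 0, it follows that $\sum_{y \in Q,\, w \in \overline{Q}} f(w, y) = \sum_{y \in Q} f(z, y) = 0$.
Consequently,

$$\sum_{x \in P,\, v \in \overline{P}} f(x, v) = \sum_{v \in \overline{P}} f(a, v) = \mathrm{val}(f) = \sum_{y \in Q,\, v \in \overline{Q}} f(y, v) = \sum_{y \in Q} f(y, z),$$

and this establishes the corollary.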
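Both the observation used in this proof and the bound of Theorem 13.3 can be checked numerically on the flow of Fig. 13.10(b). The sketch below is an illustration under stated assumptions: the edge list is reconstructed from the figure labels, the helper names are hypothetical, and the loop runs over every vertex set P that contains a but not z, a slightly larger family than the cuts of Definition 13.4.

```python
from itertools import combinations

# Fig. 13.10(b), reconstructed from the figure labels (an assumption).
CAP = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("g", "b"): 5, ("b", "h"): 6,
       ("h", "d"): 2, ("g", "h"): 5, ("d", "z"): 5, ("h", "z"): 6}
F_B = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 2, ("g", "b"): 2, ("b", "h"): 3,
       ("h", "d"): 2, ("g", "h"): 3, ("d", "z"): 4, ("h", "z"): 4}


def cut_capacity(cap, P):
    """Total capacity of the edges directed from P into its complement."""
    return sum(c for (v, w), c in cap.items() if v in P and w not in P)


def net_flow(f, P):
    """Flow from P into its complement minus flow from the complement into P."""
    forward = sum(x for (v, w), x in f.items() if v in P and w not in P)
    backward = sum(x for (v, w), x in f.items() if v not in P and w in P)
    return forward - backward


# Every vertex set that contains the source a and omits the sink z.
sides = [{"a", *extra} for r in range(5) for extra in combinations("bdgh", r)]

net_values = {net_flow(F_B, P) for P in sides}
smallest_capacity = min(cut_capacity(CAP, P) for P in sides)
print(net_values, smallest_capacity)   # {8} 11
```

Every partition gives the same net flow, 8 = val(f), and no capacity falls below that value; the smallest capacity found, 11, agrees with the minimum quoted for Fig. 13.11.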

Additional properties of flows and cuts in a network are given in the following corollaries.

COROLLARY 13.2   Let f be a flow in a transport network N = (V, E) and let $(P, \overline{P})$ be a cut, where
$\mathrm{val}(f) = c(P, \overline{P})$. Then f is a maximum flow for the network N and $(P, \overline{P})$ is a minimum cut [that
is, $(P, \overline{P})$ has minimum capacity in N].
Proof: If $f_1$ is any flow in N, then from Theorem 13.3 it follows that

$$\mathrm{val}(f_1) \le c(P, \overline{P}) = \mathrm{val}(f),$$

so f is a maximum flow. Likewise, for any cut $(Q, \overline{Q})$ in N we have

$$c(P, \overline{P}) = \mathrm{val}(f) \le c(Q, \overline{Q}),$$

so $(P, \overline{P})$ is a minimum cut, again by Theorem 13.3.

COROLLARY 13.3   If f is a maximum flow in a transport network N = (V, E) and $(P, \overline{P})$ is a minimum cut,
then $\mathrm{val}(f) \le c(P, \overline{P})$.
                          Proof: The proof of this corollary is requested in the Section Exercises.

COROLLARY 13.4   For a transport network N = (V, E), let f be a flow in N and let $(P, \overline{P})$ be a cut. Then
$\mathrm{val}(f) = c(P, \overline{P})$ if and only if

a) f(e) = c(e) for each edge e = (x, y), where $x \in P$ and $y \in \overline{P}$, and
b) f(e) = 0 for each edge e = (v, w), where $v \in \overline{P}$ and $w \in P$.

Furthermore, under these circumstances, f is a maximum flow and $(P, \overline{P})$ is a minimum
cut.
Proof: The proof of this corollary is requested in the Section Exercises.

We turn now to the main results of the section— namely, (1) developing an efficient
                     algorithm to solve the Maximum Flow-Minimum Cut (Max-Flow Min-Cut) problem, and
                     (2) establishing the Max-Flow Min-Cut Theorem. The algorithm we introduce was initially
                     presented in the work of Lester R. Ford, Jr., and Delbert Ray Fulkerson. Basically, it is
designed to increase the flow in a transport network N iteratively, until no further increase
                     is possible.

In order to motivate the concepts we shall need here, we start by considering the following
                     example.

EXAMPLE 13.9
Let N = (V, E) be the transport network shown in part (i) of Fig. 13.12. Examining the
edges (b, z) and (g, z), we see that the value of the flow is 6 + 2 = 8. But neither of
these two edges is saturated, nor is any other edge in N, so we shall try to increase the
present flow. To do so, consider a directed path from a to z, for example, the path p
made up of the edges (a, b) and (b, z) [as in part (ii) of the figure]. For this path we
define $\Delta_p = \min_{e \in p}\{c(e) - f(e)\} = \min\{8 - 4, 8 - 6\} = \min\{4, 2\} = 2$. This tells us that
the flow in each of these two edges can be increased by 2, with the conservation of flow still
maintained. The resulting network, in part (iii) of the figure, now has flow value 8 + 2 = 10.

Figure 13.12 [parts (i) through (vii): the network of Example 13.9 and the three successive augmentations]

So far, so good. Now let us try to increase the flow again. This time we use the directed
path $p_1$ from a to z as shown in part (iv) of Fig. 13.12. This path comprises the
edges (a, d), (d, g), and (g, z), and here
$\Delta_{p_1} = \min_{e \in p_1}\{c(e) - f(e)\} = \min\{6 - 4, 6 - 3, 5 - 2\} = \min\{2, 3, 3\} = 2$.
The resulting network, with the adjustment $\Delta_{p_1} = 2$, is shown
in Fig. 13.12(v) and it has flow value 12.
   Now, at this point, any possible directed a-z path in N [of Fig. 13.12(v)] must use either
edge (a, d) or edge (b, z), both of which are saturated, that is, c(e) = f(e). Consequently,
it may seem that the current flow of 12 is the maximum flow possible.
   If, however, we disregard the directions on the edges of the network, it is possible to find
other paths from a to z. Consider one such path: the path $p_2$ shown in part (vi) of the figure.
This undirected path comprises the edges {a, b}, {b, d}, {d, g}, and {g, z}. Here we define

$\Delta_{p_2} = \min_{e \in p_2}\{\Delta_e\}$, where $\Delta_e = c(e) - f(e)$ for the forward edges (a, b), (d, g), (g, z),
and $\Delta_e = f(e)$ for the backward edge going from b to d [the opposite of the direction for
edge (d, b) in N]. So $\Delta_{p_2} = \min[\{8 - 6, 6 - 5, 5 - 4\} \cup \{1\}] = 1$. This increase of one
unit of flow is added to the flow for each of the three forward edges and subtracted from
the flow for the one backward edge. The resulting final network appears in part (vii) of
Fig. 13.12, where we see that by decreasing the flow from d to b by one unit (of flow) we
have been able to redirect this one unit from d to g and then from g to z. So now the flow
value for N is 12 + 1 = 13 and this is the maximum flow value possible, for the edges
(b, z) and (g, z) are saturated.

What has taken place in Example 13.9 now leads us to the following.

Definition 13.5          Let N = (V, E) be a transport network and let

$$a = v_0,\ e_1,\ v_1,\ e_2,\ v_2,\ \ldots,\ v_{n-1},\ e_n,\ v_n = z$$

be an alternating sequence of vertices and edges, where the edges are taken from the undirected
graph associated with N. This sequence is called a semipath.*
   For $2 \le i \le n - 1$, if $e_i = (v_{i-1}, v_i)$, that is, $e_i$ is the directed edge in N from $v_{i-1}$ to
$v_i$, then $e_i$ is called a forward edge. In the case where $2 \le j \le n - 1$ and $e_j = (v_j, v_{j-1})$,
that is, $(v_j, v_{j-1})$ is the actual directed edge in N, then $e_j$ is called a backward edge.

When all of the edges in a semipath are forward edges (in N), then we have a directed
                               path from a to z in N. It is only when there is at least one backward edge (from N) that the
                               path in the associated undirected graph is a semipath.
                                   Our next idea takes the notion of the semipath one step further.

Definition 13.6          Let f be a flow in a transport network N = (V, E). An f-augmenting path p is a semipath
                               (from a to z) where for each edge e on p we have
    f(e) < c(e),   for e a forward edge
    f(e) > 0,      for e a backward edge.

From Definition 13.6 we see that along an f-augmenting path p the flow on a forward
                               edge can be increased, for no such forward edge is saturated. [Note that here we could
have f(e) = 0.] For each backward edge the flow is positive, so it can be decreased (and
redirected elsewhere). The maximum possible increase or decrease is given in terms of $\Delta_e$,
the tolerance on an edge e, as we learn in the following.

Definition 13.7          Let p be an f-augmenting path in a transport network N = (V, E). For each edge e on the
                               semipath p,
$$\Delta_e = \begin{cases} c(e) - f(e), & \text{for } e \text{ a forward edge} \\ f(e), & \text{for } e \text{ a backward edge.} \end{cases}$$

The quantity $\Delta_e$ is often called the tolerance on edge e.

* Some authors use the term chain or quasi-path in place of semipath.

Note that in Definition 13.7 we have $\Delta_e > 0$ for each edge e on p. Further, we find that
$\Delta_p = \min_{e \in p}\{\Delta_e\}$ is the maximum increase (for the forward edges) and maximum decrease
(for the backward edges) that we can have and still maintain the conservation condition in
part (b) of Definition 13.2.
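As a small worked check of Definition 13.7, the following Python sketch recomputes the tolerance for the semipath $p_2$ of Example 13.9 from the capacity and flow values quoted there. The tuple layout and the function name `tolerance` are illustrative choices, and the capacity 4 listed for the backward edge (d, b) is an assumption read from the figure residue; it does not enter the computation.

```python
def tolerance(capacity, flow, forward):
    """Tolerance of one edge: c(e) - f(e) if forward, f(e) if backward."""
    return capacity - flow if forward else flow


# (edge, c(e), f(e), is_forward) along the semipath p2 of Example 13.9.
p2 = [
    (("a", "b"), 8, 6, True),
    (("d", "b"), 4, 1, False),   # traversed from b to d, against its direction
    (("d", "g"), 6, 5, True),
    (("g", "z"), 5, 4, True),
]

delta_p2 = min(tolerance(c, f, fwd) for _, c, f, fwd in p2)
print(delta_p2)   # 1
```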

Our next result formally establishes what was described in Definition 13.7 and the para-
                     graph that followed.

THEOREM 13.4   Let f be a flow in a transport network N = (V, E) and let p be an f-augmenting path in
N with $\Delta_p = \min_{e \in p}\{\Delta_e\}$. Define $f_1\colon E \to \mathbf{N}$ by

$$f_1(e) = \begin{cases} f(e) + \Delta_p, & e \in p,\ e \text{ a forward edge} \\ f(e) - \Delta_p, & e \in p,\ e \text{ a backward edge} \\ f(e), & e \in E,\ e \notin p. \end{cases}$$

Then $f_1$ is a flow in N with $\mathrm{val}(f_1) = \mathrm{val}(f) + \Delta_p$.
Proof: From the definition of $\Delta_p$ we have $0 \le f_1(e) \le c(e)$, for each $e \in E$. So $f_1$ satisfies
condition (a) of Definition 13.2. To establish condition (b) of Definition 13.2 for $f_1$, we only
need to consider those $v \in V$ where v is on the semipath p and $v \ne a, z$. So let $\{v_i, v\}$ and
$\{v, v_{i+2}\}$ be the two edges in p that are incident with v. When we consider the net change
at v, we see in the four cases of Fig. 13.13 that this change is 0. Consequently, $f_1$ satisfies
condition (b) and is a flow.

Figure 13.13 [four cases for the vertex v lying between $v_i$ and $v_{i+2}$ on the semipath p]
   Case 1: The $\Delta_p$ additional units of flow that come into v along $\{v_i, v\}$ are
           counterbalanced by the $\Delta_p$ units that leave v along $\{v, v_{i+2}\}$.
   Case 2: The $\Delta_p$ additional units of flow that come into v along $\{v_i, v\}$ are
           counterbalanced by the $\Delta_p$ units from $v_{i+2}$ that are redirected away from v.
   Case 3: The $\Delta_p$ units of flow redirected from $v_i$ into v are counterbalanced by
           the $\Delta_p$ units from $v_{i+2}$ that are redirected away from v.
   Case 4: The $\Delta_p$ units of flow redirected from $v_i$ into v are counterbalanced by
           the $\Delta_p$ units that leave v along $\{v, v_{i+2}\}$.

To determine $\mathrm{val}(f_1)$ we consider $e_1 = (v_0, v_1) = (a, v_1)$, the first edge on the
f-augmenting path p. Then $e_1$ is adjacent from the source a and it follows from part (b) of
Definition 13.3 that

$$\mathrm{val}(f_1) = \sum_{v \in V} f_1(a, v) = \sum_{v \in V - \{v_1\}} f_1(a, v) + f_1(a, v_1) = \sum_{v \in V - \{v_1\}} f(a, v) + f(a, v_1) + \Delta_p = \sum_{v \in V} f(a, v) + \Delta_p = \mathrm{val}(f) + \Delta_p.$$
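The construction of $f_1$ in Theorem 13.4 translates directly into code. The sketch below is an illustration, not part of the text: it applies the construction to the flow of Fig. 13.10(b), whose edge list is reconstructed from the figure labels (an assumption), along a semipath chosen here for illustration, a, b, g, h, z, in which the edge (g, b) is traversed backward. The function names are hypothetical.

```python
# Fig. 13.10(b), reconstructed from the figure labels (an assumption).
CAP = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("g", "b"): 5, ("b", "h"): 6,
       ("h", "d"): 2, ("g", "h"): 5, ("d", "z"): 5, ("h", "z"): 6}
F_B = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 2, ("g", "b"): 2, ("b", "h"): 3,
       ("h", "d"): 2, ("g", "h"): 3, ("d", "z"): 4, ("h", "z"): 4}


def augment(cap, f, path):
    """Return (f1, delta_p) for an f-augmenting path given as (edge, is_forward) pairs."""
    delta = min(cap[e] - f[e] if forward else f[e] for e, forward in path)
    f1 = dict(f)                       # edges off the path keep their flow
    for e, forward in path:
        f1[e] += delta if forward else -delta
    return f1, delta


def val(f, source="a"):
    """Value of a flow: total flow on the edges leaving the source."""
    return sum(x for (u, _), x in f.items() if u == source)


semipath = [(("a", "b"), True), (("g", "b"), False),
            (("g", "h"), True), (("h", "z"), True)]
f1, delta = augment(CAP, F_B, semipath)
print(delta, val(F_B), val(f1))   # 2 8 10
```

The value rises by exactly the tolerance of the path, from 8 to 10, and the flow on the backward edge (g, b) drops from 2 to 0.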

The result of Theorem 13.4 now helps us in characterizing a maximum flow in a transport
network.

THEOREM 13.5   Let N = (V, E) be a transport network with flow f. The flow f is a maximum flow in N
if and only if there exists no f-augmenting path in N.
Proof: If f is a maximum flow in N, then it follows from Theorem 13.4 that there is no
f-augmenting path in N.
   Conversely, if there is no f-augmenting path in N, consider the set of all partial semipaths
in N that start at a. We call each of these edge sets a partial semipath because it cannot
reach z without contradicting the hypothesis. Let P be the union of the vertices in these
partial semipaths. Then $a \in P$, and $P \ne V$ as $z \in \overline{P}$. Further, $(P, \overline{P})$ is a cut for N and,
    i) if $e = (v, w) \in E$ with $v \in P$, $w \in \overline{P}$, then f(e) = c(e); otherwise, $w \in P$;
   ii) if $e = (v, w) \in E$ with $w \in P$, $v \in \overline{P}$, then f(e) = 0; otherwise, f(e) > 0 and
       $v \in P$.

Consequently, from Corollary 13.4, it follows that f is a maximum flow.

We now turn to the main result of the section.

THEOREM 13.6   The Max-Flow Min-Cut Theorem. For a transport network N = (V, E), the maximum flow
value that can be attained in N is equal to the minimum capacity over all cuts in the network.
Proof: Let f be a flow for which val(f) is a maximum. Then let $(P, \overline{P})$ be the cut constructed
as in Theorem 13.5. We know from Corollary 13.4 that $\mathrm{val}(f) = c(P, \overline{P})$. And
then Corollary 13.2 shows us that $(P, \overline{P})$ is a minimum cut.

Now that we have dispensed with the necessary theory, it is time to develop an efficient
way of determining a maximum flow and minimum cut for a given transport network N.
The discussion in Example 13.9 might suggest that we should simply find f-augmenting
paths and use them to continue increasing the existing flow in N. However, this may prove
to be tedious and inefficient, as our next example demonstrates.

EXAMPLE 13.10
Consider the transport network N = (V, E) in Fig. 13.14(i), where the initial flow is
given as f(e) = 0 for each $e \in E$. The capacities for the edges are c(a, b) = c(b, z) =
c(a, d) = c(d, z) = 10 and c(d, b) = 1. If we use the directed paths (a, b), (b, z) and then
(a, d), (d, z) as successive f-augmenting paths, we attain the flow in part (ii) of the figure
after two iterations. Here we find that val(f) = 20 and this is a maximum flow since
$20 = c(P, \overline{P})$ for P = {a}. If, instead, we start with the directed path (a, d), (d, b), (b, z)
and then the semipath {a, b}, {b, d}, {d, z} as our first two successive f-augmenting paths,
we attain the flow in Fig. 13.14(iii) where val(f) = 2. Should we continue to alternately
use these two f-augmenting paths, we will have to perform 20 iterations in total before we
attain the flow in part (ii) of the figure.

Figure 13.14 [parts (i), (ii), and (iii): the network of Example 13.10 and two of its flows]

What do we observe here? The directed paths (a, b), (b, z) and (a, d), (d, z) each have
two edges, while the directed path (a, d), (d, b), (b, z) and the semipath {a, b}, {b, d},
{d, z} each have three edges. Further, note how the first iteration in Example 13.9 used a
directed path with two edges, the second iteration a directed path with three edges, and the
third iteration a semipath of four edges.
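The count of 20 iterations in Example 13.10 can be confirmed by simulating the poor choice of paths. The Python sketch below uses the capacities stated in the example; the variable names and the loop structure are illustrative. It deliberately alternates the two three-edge paths and is not the Edmonds-Karp selection rule.

```python
# Capacities stated in Example 13.10.
CAP = {("a", "b"): 10, ("b", "z"): 10, ("a", "d"): 10, ("d", "z"): 10, ("d", "b"): 1}

# The two three-edge paths, as (edge, is_forward) pairs.
PATH_1 = [(("a", "d"), True), (("d", "b"), True), (("b", "z"), True)]
PATH_2 = [(("a", "b"), True), (("d", "b"), False), (("d", "z"), True)]

f = dict.fromkeys(CAP, 0)      # initial flow f(e) = 0
iterations = 0
while True:
    path = PATH_1 if iterations % 2 == 0 else PATH_2
    delta = min(CAP[e] - f[e] if fwd else f[e] for e, fwd in path)
    if delta == 0:             # the chosen path is no longer f-augmenting
        break
    for e, fwd in path:
        f[e] += delta if fwd else -delta
    iterations += 1

value = f[("a", "b")] + f[("a", "d")]
print(iterations, value)       # 20 20
```

Each pass adds a single unit of flow because the edge (d, b) of capacity 1 lies on both paths, whereas the two-edge paths reach the same flow in two iterations.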

The observations made in Example 13.10 suggest that for each iteration it is more efficient
to use an f-augmenting path with the least number of edges. This idea was used
by Jack Edmonds and Richard M. Karp in the development of an algorithm to find such
f-augmenting paths. Their approach uses a breadth-first search and, as in Prim's algorithm,
the vertex set V is partitioned as $P \cup \overline{P}$, where P accounts for the processed vertices.
However, before we can deal with this algorithm we need one additional idea.

Definition 13.8   Let N = (V, E) be a transport network with flow f. Start to construct a breadth-first span-
                  ning tree T for N (as an undirected graph) using the source a as the root, and a prescribed
                  order for the other vertices in V. While the sink z is not a vertex in T, let e = {v, w} be the
                  newest edge appended in the construction of 7, with v in the present tree and w the new
                  vertex. The edge e is called usable if

    e = (v, w) with f(e) < c(e),   or
    e = (w, v) with f(e) > 0.

Now we are ready to deal with the following algorithm. Here the input is a transport
network N = (V, E) with flow f. The output is an f-augmenting path p, with a minimum
number of edges, if one exists; otherwise, the output is a minimum cut $(P, \overline{P})$ with
$c(P, \overline{P}) = \mathrm{val}(f)$.

The Edmonds-Karp Algorithm
   Step 1: Place the source a into set P (thus initializing the set of processed vertices).
           Assign the label ( , 1) to a and set the counter i = 2.
   Step 2: While the sink z is not in P
              If there is a usable edge in N
                 Let e = {v, w} be usable with labeled vertex v having
                    the smallest counter assignment
                 If w is unlabeled
                    Label w with (v, i)
                    Place w in P
                    Increase the counter i by 1.
              Else
                 Return the minimum cut $(P, \overline{P})$.
   Step 3: If z is in P, start with z and backtrack to a using the first component of the
           vertex labels. (This provides an f-augmenting path p with the smallest number of
           edges.)
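One way to realize this search in code is a queue-based breadth-first search over usable edges. The Python sketch below is an illustration under stated assumptions rather than the text's own implementation: the labels store the predecessor and the edge used instead of the pair (v, i), the queue order plays the role of the counter, ties are broken by the order in which the edges are listed, and the test network is the flow of Fig. 13.10(b) reconstructed from the figure labels.

```python
from collections import deque

# Fig. 13.10(b), reconstructed from the figure labels (an assumption).
CAP = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("g", "b"): 5, ("b", "h"): 6,
       ("h", "d"): 2, ("g", "h"): 5, ("d", "z"): 5, ("h", "z"): 6}
F_B = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 2, ("g", "b"): 2, ("b", "h"): 3,
       ("h", "d"): 2, ("g", "h"): 3, ("d", "z"): 4, ("h", "z"): 4}


def edmonds_karp_search(cap, f, source="a", sink="z"):
    """Return ("path", [(edge, is_forward), ...]) or ("cut", P)."""
    label = {source: None}             # vertex -> (predecessor, edge, is_forward)
    queue = deque([source])            # first labeled, first processed
    while queue and sink not in label:
        v = queue.popleft()
        for (x, y) in cap:
            if x == v and y not in label and f[(x, y)] < cap[(x, y)]:
                label[y] = (v, (x, y), True)     # usable forward edge
                queue.append(y)
            elif y == v and x not in label and f[(x, y)] > 0:
                label[x] = (v, (x, y), False)    # usable backward edge
                queue.append(x)
    if sink not in label:
        return "cut", set(label)       # P; its complement is the rest of V
    path, v = [], sink
    while label[v] is not None:        # backtrack from z to a
        v, edge, forward = label[v]
        path.append((edge, forward))
    return "path", path[::-1]


kind, result = edmonds_karp_search(CAP, F_B)
print(kind, result)
```

For this flow the search reports the three-edge directed path (a, b), (b, d), (d, z), which has the fewest edges among the f-augmenting paths.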

At this point we have finally arrived at the algorithm for determining a maximum flow
and minimum cut for a transport network N = (V, E). The original version of this algorithm
                  was developed by Lester R. Ford, Jr., and Delbert Ray Fulkerson. Here we shall incorporate
                  the previous algorithm by Jack Edmonds and Richard M. Karp in order to improve the
                  efficiency of the original algorithm.

As with the preceding algorithm, the input is again a transport network N = (V, E). The
output is a maximum flow and minimum cut for N.

The Ford-Fulkerson Algorithm
   Step 1: Define the initial flow f on the edges of N by f(e) = 0 for each $e \in E$.
   Step 2: Repeat
              Apply the Edmonds-Karp algorithm to determine
                 an f-augmenting path p.
              Let $\Delta_p = \min_{e \in p}\{\Delta_e\}$.
              For each $e \in p$
                 If e is a forward edge
                    $f(e) := f(e) + \Delta_p$
                 Else (e is a backward edge)
                    $f(e) := f(e) - \Delta_p$
           Until no f-augmenting path p can be found in N.
           Return the maximum flow f.
   Step 3: Return the minimum cut $(P, \overline{P})$ (from the last application of the Edmonds-Karp
           algorithm, where no further f-augmenting path could be constructed).
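The three steps can be assembled into a short program. The following Python sketch is one possible arrangement, not the authors' code: the helper `shortest_augmenting_path` is a compact breadth-first search standing in for the Edmonds-Karp step, the function names are hypothetical, and the test network is the one of Fig. 13.11 with capacities reconstructed from the text and figure labels (an assumption).

```python
from collections import deque

# Capacities for the network of Fig. 13.11 (reconstructed; an assumption).
CAP = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("g", "b"): 5, ("b", "h"): 6,
       ("h", "d"): 2, ("g", "h"): 5, ("d", "z"): 5, ("h", "z"): 6}


def shortest_augmenting_path(cap, f, s, t):
    """Breadth-first search over usable edges; returns (path or None, labeled set P)."""
    pred = {s: None}
    queue = deque([s])
    while queue and t not in pred:
        v = queue.popleft()
        for e in cap:
            x, y = e
            if x == v and y not in pred and f[e] < cap[e]:
                pred[y] = (v, e, True)
                queue.append(y)
            elif y == v and x not in pred and f[e] > 0:
                pred[x] = (v, e, False)
                queue.append(x)
    if t not in pred:
        return None, set(pred)
    path, v = [], t
    while pred[v] is not None:
        v, e, forward = pred[v]
        path.append((e, forward))
    return path, set(pred)


def max_flow(cap, s="a", t="z"):
    f = dict.fromkeys(cap, 0)                       # Step 1: zero flow
    while True:                                     # Step 2: repeat
        path, P = shortest_augmenting_path(cap, f, s, t)
        if path is None:
            return f, P                             # Step 3: P defines the minimum cut
        delta = min(cap[e] - f[e] if fwd else f[e] for e, fwd in path)
        for e, fwd in path:
            f[e] += delta if fwd else -delta


flow, P = max_flow(CAP)
value = sum(x for (u, _), x in flow.items() if u == "a")
print(value, sorted(P))   # 11 ['a', 'b', 'd', 'g', 'h']
```

The result agrees with the earlier remark that the maximum flow for Fig. 13.11 has value 11 and that the cut made up of the edges (d, z) and (h, z) is a minimum cut.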

Before demonstrating the use of the Ford-Fulkerson and Edmonds-Karp algorithms we
                            state one last corollary and some related comments. The proof of the corollary is left as an
                            exercise.

COROLLARY 13.5               Let N = (V, E) be a transport network where for each e € E, c(e) is a positive integer.
Then there is a maximum flow f for N, where f(e) is a nonnegative integer for each
                            edge e.

The definition of transport network and flow (in a transport network) may be modified to
                             allow nonnegative real-valued capacity and flow functions. If the capacities in a transport
                             network are rational numbers, then the Ford-Fulkerson algorithm will terminate and attain
a maximum flow and minimum cut. When some capacities are irrational, however, the
original algorithm developed by L. R. Ford, Jr., and D. R. Fulkerson may not terminate
correctly. Furthermore, Ford and Fulkerson [14] showed that their algorithm could result
in a flow, but that the flow need not be a maximum flow. When irrational capacities do
arise, the modification given by Edmonds and Karp [11] terminates and attains a maximum
flow. Further, the Edmonds-Karp algorithm can be implemented so that its worst-case
time-complexity is $O(nm^2)$, where n = |V|, m = |E|, for N = (V, E). (For more on the
                             time-complexity of this algorithm one should examine Section 6.5 of Ahuja, Magnanti, and
                             Orlin [2] and Chapter 26 of Cormen, Leiserson, Rivest, and Stein [7].)

EXAMPLE 13.11
Use the Ford-Fulkerson and Edmonds-Karp algorithms to find a maximum flow for the
transport network in Fig. 13.15(i).
   In the transport network N = (V, E) [of Fig. 13.15(i)], each edge is labeled with a pair
of nonnegative integers x, y, where x is the capacity of the edge and y = 0 indicates an
initial flow. This follows from step (1) of the Ford-Fulkerson algorithm.

[Figure 13.15, parts (i)-(xii), not reproduced. Parts (i), (iv), (vii), and (x) show the network with val(f) = 0, 3, 5, and 6, respectively; parts (ii), (v), (viii), and (xi) show the partial breadth-first spanning tree for each iteration; parts (iii), (vi), (ix), and (xii) show the corresponding f-augmenting paths, with Δₚ = 3, 2, 1, and 2.]

Figure 13.15

When applying the Edmonds-Karp algorithm, the prescribed order for the vertices V − {a} will be alphabetic. Applying this algorithm for the first time, in step (1) we label a with ( , 1), place a in P, and set the counter i to 2. In step (2) we find there are three usable (forward) edges: (a, b), (a, d), and (a, g). Following the prescribed order, we select (a, b), label b with (a, 2), place b in P, and increase the counter to 3. Executing step (2) a second time, we select (a, d), label d with (a, 3), place d in P, and increase the counter to 4. At this point, step (2) is executed a third time, for edge (a, g). So we label g with (a, 4), place g in P, and increase the counter to 5.
The edge (b, j) is usable with b having the smallest counter label. [None of the edges (a, b), (a, d), (a, g) is usable at this stage.] Now in step (2) the vertex j is labeled with (b, 5), j is placed in P, and the counter is increased to 6. For the vertex d in P, the edge
(k, d) is not usable because the flow in this edge is 0. The next application of step (2), consequently, results in the label (d, 6) on z, places z in P, and increases the counter to 7. But with z in P we are finished with step (2), and so we arrive at the partial breadth-first spanning tree (for the undirected graph associated with N) rooted at a, as shown in Fig. 13.15(ii). Backtracking in step (3) of the Edmonds-Karp algorithm now provides the f-augmenting path p: (a, d), (d, z), where Δₚ = min{3 − 0, 5 − 0} = 3, as shown in Fig. 13.15(iii).
At this point, we go to step (2) of the Ford-Fulkerson algorithm and increase the flow on (a, d) from 0 to 3 and that on (d, z) from 0 to 3. The result is the transport network in Fig. 13.15(iv), where val(f) = 3.
We now return to the Edmonds-Karp algorithm to determine the next f-augmenting path. The resulting partial breadth-first spanning tree for this is shown in part (v) of the figure. The corresponding f-augmenting path p in Fig. 13.15(vi) has tolerance Δₚ = min{3 − 0, 6 − 0, 4 − 0, 5 − 3} = 2. Step (2) of the Ford-Fulkerson algorithm then provides the network in Fig. 13.15(vii), where val(f) = 3 + Δₚ = 5. The next (similar) iteration takes us from this transport network to the one in Fig. 13.15(x), where the flow is now 6. When the Edmonds-Karp algorithm is invoked at this stage, the resulting breadth-first spanning tree is shown in Fig. 13.15(xi). In this application of the algorithm, after we label d with (k, 5), we next label h because we now have the usable (backward) edge (h, d), for the flow from h to d is 2 (> 0). Backtracking from z to a in the tree in part (xi) results in the f-augmenting path p in part (xii) with Δₚ = min{4 − 0, 6 − 0, 5 − 0, 4 − 0, 2, 4 − 1, 8 − 1, 7 − 1} = 2.
This now brings us to the transport network in Fig. 13.16(i), where val(f) = 8. If we try to apply the Edmonds-Karp algorithm to find the next f-augmenting path, we obtain the partial breadth-first spanning tree in Fig. 13.16(ii). At this point, P = {a, b, j, k, d} so z ∉ P, and there are no other usable edges. Consequently, the last line of step (2) provides the minimum cut (P, P̄), where P̄ = {g, h, m, n, z}, as shown in Fig. 13.16(iii). Further, from the edges that are crossed by the dotted curve, we have val(f) = f((a, g)) + f((d, z)) − f((h, d)) = 3 + 5 − 0 = 8 = c(P, P̄).
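
As a check on the arithmetic of this example, the four tolerances found above can be summed. The subscripts p_1, ..., p_4 are labels introduced here for the four successive f-augmenting paths; the text itself calls each of them p.

```latex
\operatorname{val}(f)
  = \Delta_{p_1} + \Delta_{p_2} + \Delta_{p_3} + \Delta_{p_4}
  = 3 + 2 + 1 + 2
  = 8
  = c(P, \overline{P})
```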

[Figure 13.16, parts (i)-(iii), not reproduced: (i) the network with val(f) = 8; (ii) the partial breadth-first spanning tree, with P = {a, b, j, k, d}; (iii) the minimum cut (P, P̄).]

Figure 13.16

We close this section with three examples that are modeled with the concept of the
                                 transport network. After setting up the models, the final solution of each example is left to
                                 the Section Exercises.

EXAMPLE 13.12   Computer chips are manufactured (in units of a thousand) at three companies, c₁, c₂, and c₃. These chips are then distributed to two computer manufacturers, m₁ and m₂, through the "transport network" in Fig. 13.17(a), where there are the three sources (c₁, c₂, and c₃) and the two sinks, m₁ and m₂. Company c₁ can produce up to 15 units, company c₂ up to 20 units, and company c₃ up to 25 units. If each manufacturer needs 25 units, how
many units should each company produce so that together they can meet the demand of
                each manufacturer or at least supply them with as many units as the network will allow?

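[Figure 13.17, not reproduced: (a) the distribution network with sources c₁, c₂, c₃ and sinks m₁, m₂; (b) the same network with the added source a and sink z.]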
                Figure 13.17

In order to model this example with a transport network, we introduce a source a and a sink z, as shown in Fig. 13.17(b). The manufacturing capabilities of the three companies are then used to define capacities for the edges (a, c₁), (a, c₂), and (a, c₃). For the edges (m₁, z) and (m₂, z) the demands are used as capacities. To answer the question posed here, one applies the Edmonds-Karp and Ford-Fulkerson algorithms to this network to find the value of a maximum flow.
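
The construction just described (one new source, one new sink, production limits and demands as capacities) can be sketched in Python. The supply and demand figures are those of this example; the function name `add_super_source_and_sink` and the interior edge capacities are hypothetical, since the capacities of Fig. 13.17 are not reproduced here.

```python
def add_super_source_and_sink(capacity, supply, demand, source="a", sink="z"):
    """Return a new capacity map with a single source and a single sink added."""
    network = dict(capacity)  # leave the caller's dictionary unchanged
    for producer, limit in supply.items():
        network[(source, producer)] = limit   # edge (a, c_i) with capacity = supply
    for consumer, need in demand.items():
        network[(consumer, sink)] = need      # edge (m_j, z) with capacity = demand
    return network

supply = {"c1": 15, "c2": 20, "c3": 25}   # from Example 13.12
demand = {"m1": 25, "m2": 25}             # from Example 13.12
# Hypothetical interior capacities, for illustration only.
interior = {("c1", "m1"): 10, ("c2", "m1"): 15,
            ("c2", "m2"): 10, ("c3", "m2"): 20}

network = add_super_source_and_sink(interior, supply, demand)
print(sorted(network.items()))
```

The resulting dictionary has a single source and a single sink, so any maximum-flow routine for ordinary transport networks can be applied to it.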

EXAMPLE 13.13   The transport network shown in Fig. 13.18(a) has an added restriction, for now there are capacities assigned to vertices other than the source and sink. Such a capacity places an upper limit on the amount of the commodity in question that may pass through a given vertex. Part (b) of the figure shows how to redraw the network in order to obtain one where the Edmonds-Karp and Ford-Fulkerson algorithms can be applied. For each vertex v other than a or z, split v into vertices v₁ and v₂. Draw an edge from v₁ to v₂ and label it with the capacity originally assigned to v. An edge of the form (v, w), where v ≠ a, w ≠ z, then becomes the edge (v₂, w₁), maintaining the capacity of (v, w). Edges of the form (a, v) become (a, v₁) with capacity c(a, v). An edge such as (w, z) is replaced by the edge (w₂, z), with capacity c(w, z).
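
The vertex-splitting rules above translate directly into a short Python sketch. The function name `split_vertices`, the string naming scheme (v becomes "v1" and "v2"), and the three-vertex network are assumptions made for illustration; as in the text, every vertex other than a and z is assumed to carry a capacity.

```python
def split_vertices(capacity, vertex_capacity, source="a", sink="z"):
    """Replace each capacitated vertex v by an edge (v1, v2) of that capacity."""
    def leaving(v):    # endpoint used by edges that start at v
        return v if v in (source, sink) else v + "2"

    def entering(v):   # endpoint used by edges that end at v
        return v if v in (source, sink) else v + "1"

    network = {}
    for v, c in vertex_capacity.items():
        network[(v + "1", v + "2")] = c              # the new edge (v1, v2)
    for (v, w), c in capacity.items():
        network[(leaving(v), entering(w))] = c       # (v, w) becomes (v2, w1)
    return network

# Hypothetical network: a -> b -> d -> z with vertex capacities on b and d.
edges = {("a", "b"): 10, ("b", "d"): 10, ("d", "z"): 10}
limits = {"b": 4, "d": 7}
print(split_vertices(edges, limits))
```

In this small illustration the vertex capacity 4 on b becomes the bottleneck edge (b1, b2), so no flow through the redrawn network can exceed 4.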

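[Figure 13.18, not reproduced: (a) a transport network in which the vertices b, d, g, and h carry capacities; (b) the equivalent network obtained by splitting each such vertex v into v₁ and v₂.]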
                  Figure 13.18

The maximum flow for the given network is now determined by applying the Edmonds-
                Karp and Ford-Fulkerson algorithms to the network shown in Fig. 13.18(b).

EXAMPLE 13.14   During the practice of war games, messengers must deliver information from headquarters (vertex a) to a field command station (vertex z). Since certain roads may be blocked or
destroyed, how many messengers should be sent out so that each travels along a path that
                                  has no edge in common with any other path taken?
                                      Since the distances between vertices are not relevant here, the graph shown in Fig. 13.19
                                  has no capacities assigned to its edges. The problem here is to determine the maximum
                                  number of edge-disjoint paths from a to z. Assigning each edge a capacity of 1 converts the
                                  problem into a maximum-flow problem, where the number of edge-disjoint paths (from a
                                  to z) equals the value of a maximum flow for the network.
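
With every capacity equal to 1, each augmentation adds exactly one path, so the count of successful augmentations is the number of edge-disjoint paths. The Python sketch below uses a depth-first search for the augmenting paths (the basic Ford-Fulkerson idea rather than the breadth-first rule). The function name and the two small directed graphs are hypothetical; they are not the graph of Fig. 13.19, and edges are treated as directed.

```python
def count_edge_disjoint_paths(edges, source, sink):
    """Maximum number of edge-disjoint directed paths from source to sink."""
    # Every edge gets capacity 1; residual[(u, v)] is the remaining capacity.
    residual, neighbors = {}, {}
    for u, v in edges:
        residual[(u, v)] = residual.get((u, v), 0) + 1
        residual.setdefault((v, u), 0)
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    def augment(u, visited):
        # Depth-first search for one augmenting path; pushes one unit along it.
        if u == sink:
            return True
        visited.add(u)
        for v in neighbors.get(u, ()):
            if v not in visited and residual[(u, v)] > 0 and augment(v, visited):
                residual[(u, v)] -= 1
                residual[(v, u)] += 1
                return True
        return False

    count = 0
    while augment(source, set()):
        count += 1
    return count

# Two hypothetical road maps (directed edges).
two_routes = [("a", "b"), ("a", "g"), ("b", "z"), ("g", "z"), ("b", "g")]
bottleneck = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "z")]
print(count_edge_disjoint_paths(two_routes, "a", "z"))
print(count_edge_disjoint_paths(bottleneck, "a", "z"))
```

In the second map every route must use the single edge (d, z), so only one messenger can be sent even though two roads leave a.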

[Figure 13.19, not reproduced: the graph from a to z, drawn without edge capacities.]
                                                   Figure 13.19

EXERCISES 13.3

1. a) For the network shown in Fig. 13.20, let the capacity of each edge be 10. If each edge e in the figure is labeled by a function f, as shown, determine the values of s, t, w, x, and y so that f is a flow in the network.
   b) What is the value of this flow?
   c) Find three cuts (P, P̄) in this network that have capacity 30.

[Figure 13.20, not reproduced.]

2. Prove Corollaries 13.3 and 13.4.

3. Find a maximum flow and the corresponding minimum cut for each transport network shown in Fig. 13.21.

[Figure 13.21, not reproduced.]

4. Apply the Edmonds-Karp and Ford-Fulkerson algorithms to find a maximum flow in Examples 13.12, 13.13, and 13.14.

5. Prove Corollary 13.5.

6. In each of the following "transport networks" two companies, c₁ and c₂, produce a certain product that is used by two manufacturers, m₁ and m₂. For the network shown in part (a) of Fig. 13.22, company c₁ can produce 8 units and company c₂ can produce 7 units; manufacturer m₁ requires 7 units and manufacturer m₂ needs 6 units. In the network shown in Fig. 13.22(b), each company can produce 7 units and each manufacturer needs 6 units. In which situation(s) can the producers meet the manufacturers' demands?

[Figure 13.22, parts (a) and (b), not reproduced.]

7. Find a maximum flow for the network shown in Fig. 13.23. The capacities on the undirected edges indicate that the capacity is the same in either direction. [However, for an undirected edge a flow can go in only one direction at a time as opposed to the situation for vertices b, g in Fig. 13.18(a).]

[Figure 13.23, not reproduced.]

13.4 Matching Theory

The Villa school district must hire four teachers to teach classes in the following subjects: mathematics (s₁), computer science (s₂), chemistry (s₃), physics (s₄), and biology (s₅). Four candidates who are interested in teaching in this district are Miss Carelli (c₁), Mr. Ritter (c₂), Ms. Camille (c₃), and Mrs. Lewis (c₄). Miss Carelli is certified in mathematics and computer science; Mr. Ritter in mathematics and physics; Ms. Camille in biology; and Mrs. Lewis in chemistry, physics, and computer science. If the district hires all four candidates, can each teacher be assigned to teach a (different) subject in which he or she is certified?

[Figure 13.24, not reproduced: a bipartite graph with the vertices c₁, c₂, c₃, c₄ on one side and s₁, s₂, s₃, s₄, s₅ on the other.]
Figure 13.24

This problem is an example of a general situation called the assignment problem. Using the Principle of Inclusion and Exclusion in conjunction with the rook polynomial (see Sections 8.4 and 8.5), one can determine in how many ways, if any, the four teachers may be assigned so that each teaches a different subject for which he or she is qualified. However, these techniques do not provide a means of setting up any of these assignments. In Fig. 13.24 the problem is modeled by means of a bipartite graph G = (V, E), where V is partitioned as X ∪ Y with X = {c₁, c₂, c₃, c₄} and Y = {s₁, s₂, s₃, s₄, s₅}, and the edges of G represent the qualifications for the individual teachers. The edges {c₁, s₂}, {c₂, s₄}, {c₃, s₅}, {c₄, s₃} demonstrate such an assignment of X into Y.
To examine this idea further, the following concepts are introduced.
                                     To examine this idea further, the following concepts are introduced.

Definition 13.9   Let G = (V, E) be a bipartite graph with V partitioned as X ∪ Y. (Each edge of E has the form {x, y} with x ∈ X and y ∈ Y.)

a) A matching in G is a subset of E such that no two edges share a common vertex in X or Y.

b) A complete matching of X into Y is a matching in G such that every x ∈ X is the endpoint of an edge.

In terms of functions, a matching is a function that establishes a one-to-one correspon-
                                dence between a subset of X and a subset of Y. When the matching is complete, a one-to-one
                                function from X into Y is defined. The example in Fig. 13.24 contains such a function and
                                a complete matching.

For a bipartite graph G = (V, E) with V partitioned as X ∪ Y, a complete matching of X into Y requires |X| ≤ |Y|. If |X| is large, then the construction of such a matching cannot be accomplished just by observation or trial and error. The following theorem, due to the English mathematician Philip Hall (1935), provides a necessary and sufficient condition for the existence of such a matching. The proof of the theorem, however, is not that given by Hall. A constructive proof that uses the material developed on transport networks is given.

THEOREM 13.7   Let G = (V, E) be bipartite with V partitioned as X ∪ Y. A complete matching of X into Y exists if and only if for every subset A of X, |A| ≤ |R(A)|, where R(A) is the subset of Y consisting of those vertices each of which is adjacent to at least one vertex in A.

Before proving the theorem, we illustrate its use in the following example.

EXAMPLE 13.15
a) The bipartite graph shown in Fig. 13.25(a) has no complete matching. Any attempt to construct such a matching must include {x₁, y₁} and either {x₂, y₃} or {x₃, y₃}. If {x₂, y₃} is included, there is no match for x₃. Likewise, if {x₃, y₃} is included, we are not able to match x₂. If A = {x₁, x₂, x₃} ⊆ X, then R(A) = {y₁, y₃}. With |A| = 3 > 2 = |R(A)|, it follows from Theorem 13.7 that no complete matching can exist.

[Figure 13.25, not reproduced: (a) and (b) are bipartite graphs, each with the vertices x₁, x₂, x₃, x₄ on one side and y₁, y₂, y₃, y₄, y₅ on the other.]
Figure 13.25

Table 13.2

A                  R(A)                  |A|   |R(A)|
∅                  ∅                     0     0
{x₁}               {y₁, y₂, y₃}          1     3
{x₂}               {y₂}                  1     1
{x₃}               {y₂, y₃, y₅}          1     3
{x₄}               {y₄, y₅}              1     2
{x₁, x₂}           {y₁, y₂, y₃}          2     3
{x₁, x₃}           {y₁, y₂, y₃, y₅}      2     4
{x₁, x₄}           Y                     2     5
{x₂, x₃}           {y₂, y₃, y₅}          2     3
{x₂, x₄}           {y₂, y₄, y₅}          2     3
{x₃, x₄}           {y₂, y₃, y₄, y₅}      2     4
{x₁, x₂, x₃}       {y₁, y₂, y₃, y₅}      3     4
{x₁, x₂, x₄}       Y                     3     5
{x₁, x₃, x₄}       Y                     3     5
{x₂, x₃, x₄}       {y₂, y₃, y₄, y₅}      3     4
X                  Y                     4     5

b) For the graph in part (b) of the figure, consider the exhaustive listing in Table 13.2. Assuming the validity of Theorem 13.7, this listing indicates that the graph contains a complete matching.
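
An exhaustive check of the condition in Theorem 13.7, in the spirit of Table 13.2, can be sketched in Python. The adjacency sets for graph (b) are read off the one-element rows of the table. The function name `hall_violation` is introduced here, and the three-vertex adjacency used for graph (a) is an assumption chosen only to agree with R(A) = {y₁, y₃} in part (a). The search examines every nonempty subset of X, so it is practical only for small X.

```python
from itertools import combinations

def hall_violation(adjacent):
    """Return a subset A of X with |A| > |R(A)|, or None if there is none."""
    xs = sorted(adjacent)
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            # R(A): all vertices of Y adjacent to at least one vertex of A.
            reach = set().union(*(adjacent[x] for x in subset))
            if len(subset) > len(reach):
                return set(subset)
    return None

# Graph (b), taken from the rows {x1}, {x2}, {x3}, {x4} of Table 13.2.
graph_b = {"x1": {"y1", "y2", "y3"}, "x2": {"y2"},
           "x3": {"y2", "y3", "y5"}, "x4": {"y4", "y5"}}
# Assumed adjacency for x1, x2, x3 of graph (a); the figure is not reproduced.
part_of_graph_a = {"x1": {"y1"}, "x2": {"y1", "y3"}, "x3": {"y3"}}

print(hall_violation(graph_b))
print(hall_violation(part_of_graph_a))
```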

We turn now to a proof of the theorem.
Proof: With V partitioned as X ∪ Y, let X = {x₁, x₂, ..., xₘ} and Y = {y₁, y₂, ..., yₙ}. Construct a transport network N that extends graph G by introducing two new vertices a (the source) and z (the sink). For each vertex xᵢ, 1 ≤ i ≤ m, draw edge (a, xᵢ); for each vertex yⱼ, 1 ≤ j ≤ n, draw edge (yⱼ, z). Each new edge is given a capacity of 1. Let M be any positive integer that exceeds |X|. Assign each edge in G the capacity M. The original graph G and its associated network N appear as shown in Fig. 13.26. It follows that a complete matching exists in G if and only if there is a maximum flow in N that uses all edges (a, xᵢ), 1 ≤ i ≤ m. Then the value of such a maximum flow is m = |X|.

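[Figure 13.26, not reproduced: the bipartite graph (G), with vertex sets X and Y, beside its associated network (N), with source a and sink z.]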

Figure 13.26

Assume first that |A| ≤ |R(A)| for every subset A of X. We shall prove that there is a complete matching in G by showing that c(P, P̄) ≥ |X| for each cut (P, P̄) in N. So if (P, P̄) is an arbitrary cut in the transport network N, let us define A = X ∩ P and B = Y ∩ P. Then A ⊆ X, where we shall write A = {x₁, x₂, ..., xᵢ} for some 0 ≤ i ≤ m. (The elements of X are relabeled, if necessary, so that the subscripts on the elements of A are consecutive. When i = 0, A = ∅.) Now P consists of the source a together with the vertices in A and the set B ⊆ Y, as shown in Fig. 13.27(a). (Elements of Y are also relabeled if necessary.) In addition, P̄ = (X − A) ∪ (Y − B) ∪ {z}. If there is an edge {x, y} with x ∈ A and y ∈ (Y − B), then the capacity of that edge is a summand in c(P, P̄) and c(P, P̄) ≥ M > |X|. Should no such edge exist, then c(P, P̄) is determined by the capacities of (1) the edges from the source a to the vertices in X − A and (2) the edges from the vertices in B to the sink z. Since each of these edges has capacity 1, c(P, P̄) = |X − A| + |B| = |X| − |A| + |B|. With B ⊇ R(A), we have |B| ≥ |R(A)|, and since |R(A)| ≥ |A|, it follows that |B| ≥ |A|. Consequently, c(P, P̄) = |X| + (|B| − |A|) ≥ |X|. Therefore, since every cut in network N has capacity at least |X| and the cut ({a}, V − {a}) achieves a capacity of |X|, by Theorem 13.6 any maximum flow for N has value |X|. Such a flow will result in exactly |X| edges from X to Y having flow 1, and this flow provides a complete matching of X into Y.

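[Figure 13.27, not reproduced: (a) an arbitrary cut (P, P̄) with A = X ∩ P and B = Y ∩ P; (b) the cut with P = {a} ∪ A ∪ R(A).]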
                              Figure 13.27

Conversely, suppose that there exists a subset A of X where |A| > |R(A)|. Let (P, P̄) be the cut shown for the network in Fig. 13.27(b), with P = {a} ∪ A ∪ R(A) and P̄ = (X − A) ∪ (Y − R(A)) ∪ {z}. Then c(P, P̄) is determined by (1) the edges from the source a to the vertices in X − A and (2) the edges from the vertices in R(A) to the sink z. Hence c(P, P̄) = |X − A| + |R(A)| = |X| − (|A| − |R(A)|) < |X|, since |A| > |R(A)|. The network has a cut of capacity less than |X|, so once again by Theorem 13.6 it follows that any maximum flow in the network has value smaller than |X|. Therefore there is no complete matching from X into Y for the given bipartite graph G.
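
Because the proof is constructive, it suggests a way to build a matching: repeatedly look for an augmenting path in N. In the unit-capacity network N such a path alternates between unmatched and matched edges of G, and the Python sketch below searches for these paths directly in G (a specialization often called Kuhn's algorithm, which is not the labeling procedure of Section 13.3). The function name is introduced here; the data are the teacher certifications from the opening of this section.

```python
def maximum_matching(adjacent):
    """Return a dict y -> x giving a matching with as many edges as possible."""
    matched_to = {}  # y -> the x currently matched with y

    def try_assign(x, seen):
        # Search for an augmenting path that starts at the unmatched vertex x.
        for y in sorted(adjacent[x]):
            if y in seen:
                continue
            seen.add(y)
            # Either y is free, or the partner of y can be rematched elsewhere.
            if y not in matched_to or try_assign(matched_to[y], seen):
                matched_to[y] = x
                return True
        return False

    for x in sorted(adjacent):
        try_assign(x, set())
    return matched_to

# Certifications of the four candidates (Carelli, Ritter, Camille, Lewis).
teachers = {"c1": {"s1", "s2"}, "c2": {"s1", "s4"},
            "c3": {"s5"}, "c4": {"s3", "s4", "s2"}}
matching = maximum_matching(teachers)
print(matching)
```

The matching returned need not be the one listed in the text; any set of four edges that covers every candidate is a complete matching of X into Y.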

EXAMPLE 13.16   Five students, s₁, s₂, s₃, s₄, and s₅, are members of three committees, c₁, c₂, and c₃. The bipartite graph shown in Fig. 13.28(a) indicates the committee memberships. Each committee is to select a student representative to meet with the school president. Can a selection be made in such a way that each committee has a distinct representative?

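[Figure 13.28, not reproduced: (a) the bipartite graph on the committees c₁, c₂, c₃ and the students s₁, s₂, s₃, s₄, s₅; (b) the associated network with source a and sink z.]
Figure 13.28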

Although this problem is small enough to solve by inspection, we use the ideas developed in Section 13.3. Figure 13.28(b) provides the network for the given bipartite graph. Here we consider the vertices, other than the source a, ordered as c₁, c₂, c₃, s₁, s₂, s₃, s₄, s₅, z. In Fig. 13.29(a), the Edmonds-Karp algorithm is applied for the first time and provides the f-augmenting path p: (a, c₁), (c₁, s₃), (s₃, z) with Δₚ = 1. Applying the Ford-Fulkerson algorithm results in the network in part (b) of the figure, and this network indicates the edge (c₁, s₃) as the start for a possible complete matching. [Many edge labels are omitted in parts (b) and (c) of the figure in order to simplify the diagrams. Every unlabeled edge that starts at a or terminates at z should have the label 1, 0 to indicate a capacity of 1 and a flow of 0; all other unlabeled edges should bear the label M, 0.] The next application of these two algorithms provides the f-augmenting path (a, c₂), (c₂, s₁), (s₁, z) and the edge (c₂, s₁) to extend the matching. Finally, the last application of the Edmonds-Karp and Ford-Fulkerson algorithms gives us the f-augmenting path (a, c₃), (c₃, s₂), (s₂, z) and the final edge, namely (c₃, s₂), for the complete matching. This is indicated by the maximum flow in part (c) of Fig. 13.29.

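[Figure 13.29, not reproduced: (a) the labels from the first application of the Edmonds-Karp algorithm, including s₄(c₂, 7), s₅(c₂, 8), and z(s₃, 10); (b) the network after the first augmentation; (c) the maximum flow.]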

Figure 13.29

This example is a particular instance of a problem studied by Philip Hall. He considered a collection of sets A₁, A₂, ..., Aₙ, where the elements a₁, a₂, ..., aₙ were called a system of distinct representatives for the collection if (a) aᵢ ∈ Aᵢ, for all 1 ≤ i ≤ n; and (b) aᵢ ≠ aⱼ, whenever 1 ≤ i < j ≤ n. Rewording Theorem 13.7 in this context, we find that the collection A₁, A₂, ..., Aₙ has a system of distinct representatives if and only if, for all 1 ≤ i ≤ n, the union of any i of the sets A₁, A₂, ..., Aₙ contains at least i elements.

Although the condition in Theorem 13.7 may be very tedious to check, the following
                 corollary provides a sufficient condition for the existence of a complete matching.

COROLLARY 13.6   Let G = (V, E) be a bipartite graph with V partitioned as X ∪ Y. There is a complete matching of X into Y if, for some k ∈ Z⁺, deg(x) ≥ k ≥ deg(y) for all vertices x ∈ X and y ∈ Y.
Proof: This proof is left for the Section Exercises.

EXAMPLE 13.17
a) Corollary 13.6 is applicable to the graph shown in Fig. 13.28(a). Here the appropriate value of k is 2.
b) There are 50 students (25 females and 25 males) in the senior class at Bell High School. If each female in the class is appreciated by exactly five of the males, and each male enjoys the company of exactly five of the females in the class, then it is possible for each male to go to the class party with a female he likes and each female will attend with a male who likes her. (As a result of problems of this type, the condition in Theorem 13.7 has often been referred to in the literature as Hall's Marriage Condition.)

For problems such as the one in Example 13.15(a), where a complete matching does not
                              exist, the following type of matching is often of interest.

Definition 13.10        If G = (V, E) is a bipartite graph with V partitioned as X ∪ Y, a maximal matching in G
                              is one that matches as many vertices in X as possible with the vertices in Y.

To investigate the existence and construction of a maximal matching, the following new
                              idea is presented.

Definition 13.11        Let G = (V, E) be a bipartite graph, where V is partitioned as X ∪ Y. If A ⊆ X, then
                              δ(A) = |A| − |R(A)| is called the deficiency of A. The deficiency of graph G, denoted
                              δ(G), is given by δ(G) = max{δ(A) | A ⊆ X}.

For ∅ ⊆ X, we have R(∅) = ∅, so δ(∅) = 0 and δ(G) ≥ 0. If δ(G) > 0, there is a subset
                              A of X with |A| − |R(A)| > 0, so |A| > |R(A)| and from Theorem 13.7 we know that there
                              is no complete matching of X into Y.
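For a small graph the deficiency can be computed straight from Definition 13.11 by examining every subset of X. The sketch below is an illustration under stated assumptions: the function name deficiency and the adjacency dictionary adj are hypothetical, and the sample graph is not taken from any figure in the text.

```python
from itertools import combinations


def deficiency(X, adj):
    """Return (delta(G), A), where A is a subset of X attaining max |A| - |R(A)|."""
    best, best_A = 0, frozenset()      # the empty set has deficiency 0
    for r in range(1, len(X) + 1):
        for A in combinations(X, r):
            R = set()                  # R(A): vertices of Y adjacent to some vertex of A
            for x in A:
                R |= adj[x]
            d = len(A) - len(R)
            if d > best:
                best, best_A = d, frozenset(A)
    return best, best_A


# Hypothetical bipartite graph: adj[x] is the set of neighbours of x in Y.
X = ["x1", "x2", "x3", "x4"]
adj = {
    "x1": {"y1"},
    "x2": {"y1", "y3"},
    "x3": {"y3"},
    "x4": {"y2", "y4", "y5"},
}
delta, A = deficiency(X, adj)
print(delta, sorted(A))   # 1 ['x1', 'x2', 'x3']
```

Here the subset {x1, x2, x3} has only the two neighbours y1 and y3, so its deficiency is 3 − 2 = 1, and no subset does worse. The search visits all 2^|X| subsets, so it is meant for checking small examples by machine, not as an efficient method.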

EXAMPLE 13.18           The graph in Fig. 13.30(a) has no complete matching. [See Example 13.15(a).] For A =
                              {x_1, x_2, x_3}, we find that R(A) = {y_1, y_3} and δ(A) = 3 − 2 = 1. As a result of this subset
                              A we find that δ(G) ≥ 1. Removing one of the vertices from A (and the edges incident
                              with it), we obtain the subgraph shown in part (b) of the figure. This (bipartite) subgraph
                              contains a complete matching from X_1 = {x_2, x_3, x_4} into Y. The edges {x_2, y_1}, {x_3, y_3},
                              and {x_4, y_4} indicate one such matching that is also a maximal matching of X into Y.

Figure 13.30 [(a) the bipartite graph G with X = {x_1, x_2, x_3, x_4} and Y = {y_1, y_2, y_3, y_4, y_5}; (b) the subgraph on X_1 and Y]

The ideas developed in Example 13.18 lead to the following theorem.

THEOREM 13.8                    Let G = (V, E) be bipartite with V partitioned as X ∪ Y. The maximum number of vertices
                                in X that can be matched with those in Y is |X| − δ(G).
                                Proof: We provide a constructive proof, using transport networks as in the proof of
                                Theorem 13.7. As in Figure 13.26, let N be the network associated with the bipartite graph
                                G. The result will follow when we show that (a) the capacity of every cut (P, P̄) in N is
                                greater than or equal to |X| − δ(G), and (b) there exists a cut with capacity |X| − δ(G).
                                    Let (P, P̄) be a cut in N, where P is made up of the source a, the vertices in A =
                                P ∩ X ⊆ X, and the vertices in B = P ∩ Y ⊆ Y. [See Fig. 13.27(a).] As in the proof of
                                Theorem 13.7, the subsets A, B may be ∅.

                                   1) If edge (x, y) is in N with x ∈ A and y ∈ Y − B, then c(x, y) is a summand in
                                      c(P, P̄). Since c(x, y) = M > |X|, it follows that c(P, P̄) > |X| ≥ |X| − δ(G).
                                   2) If no such edge as in (1) exists, then c(P, P̄) is determined by the |X − A| edges from
                                      a to X − A and the |B| edges from B to z. Since each of these edges has capacity 1, we
                                      find that c(P, P̄) = |X − A| + |B| = |X| − |A| + |B|. No edge connects a vertex in
                                      A with a vertex in Y − B, so R(A) ⊆ B and |R(A)| ≤ |B|. Consequently, c(P, P̄) =
                                      (|X| − |A|) + |B| ≥ (|X| − |A|) + |R(A)| = |X| − (|A| − |R(A)|) = |X| − δ(A) ≥
                                      |X| − δ(G).

Therefore, in either case, c(P, P̄) ≥ |X| − δ(G) for every cut (P, P̄) in N.

To complete the proof, we must establish the existence of a cut with capacity |X| − δ(G).
                                Since δ(G) = max{δ(A) | A ⊆ X}, we can select a subset A of X with δ(G) = δ(A).
                                Examining Fig. 13.27(b), we let P = {a} ∪ A ∪ R(A). Then P̄ = (X − A) ∪ (Y − R(A)) ∪
                                {z}. There is no edge between the vertices in A and those in Y − R(A), so c(P, P̄) =
                                |X − A| + |R(A)| = |X| − (|A| − |R(A)|) = |X| − δ(A) = |X| − δ(G).
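The statement of Theorem 13.8 can be checked numerically on a small graph. The sketch below is a sanity check under stated assumptions, not the construction used in the proof: the function names max_matching and deficiency and the sample graph are hypothetical, and the matching is found by an augmenting-path search on the bipartite graph itself in place of a maximum flow in the associated network N.

```python
from itertools import combinations


def max_matching(X, adj):
    """Augmenting-path search; returns a dict x -> y for the matched vertices of X."""
    match = {}  # y -> x currently matched with y

    def augment(x, seen):
        for y in adj[x]:
            if y in seen:
                continue
            seen.add(y)
            # Use y if it is free, or if its current partner can be rematched.
            if y not in match or augment(match[y], seen):
                match[y] = x
                return True
        return False

    for x in X:
        augment(x, set())
    return {x: y for y, x in match.items()}


def deficiency(X, adj):
    """delta(G) = max over subsets A of X of |A| - |R(A)| (brute force)."""
    best = 0  # contribution of the empty set
    for r in range(1, len(X) + 1):
        for A in combinations(X, r):
            R = set().union(*(adj[x] for x in A))
            best = max(best, len(A) - len(R))
    return best


# Hypothetical bipartite graph: adj[x] is the set of neighbours of x in Y.
X = ["x1", "x2", "x3", "x4"]
adj = {
    "x1": {"y1"},
    "x2": {"y1", "y3"},
    "x3": {"y3"},
    "x4": {"y2", "y4", "y5"},
}
M = max_matching(X, adj)
print(len(M), len(X) - deficiency(X, adj))   # 3 3
```

For this graph both quantities equal 3, in agreement with the theorem: the three vertices x1, x2, x3 compete for the two vertices y1 and y3, so one of them must remain unmatched.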

We close this section with an example that deals with these concepts.

EXAMPLE 13.19          Let G = (V, E) be bipartite with V partitioned as X ∪ Y. For each x ∈ X, deg(x) ≥ 4 and,
                               for each y ∈ Y, deg(y) ≤ 5. If |X| ≤ 15, find an upper bound (as small as possible) for δ(G).
                                   Let ∅ ≠ A ⊆ X and let E_1 ⊆ E, where E_1 = {{a, b} | a ∈ A, b ∈ R(A)}. Since deg(a) ≥
                               4 for all a ∈ A, |E_1| ≥ 4|A|. With deg(b) ≤ 5 for all b ∈ R(A), |E_1| ≤ 5|R(A)|. Hence
                               4|A| ≤ 5|R(A)| and δ(A) = |A| − |R(A)| ≤ |A| − (4/5)|A| = (1/5)|A|. Since A ⊆ X, we
                               have |A| ≤ 15, so δ(A) ≤ (1/5)(15) = 3. Consequently, δ(G) = max{δ(A) | A ⊆ X} ≤ 3, so
                               there exists a maximal matching M of X into Y such that |M| ≥ |X| − 3.

EXERCISES 13.4

1. For the graph shown in Fig. 13.24, if four edges are selected
at random, what is the probability that they provide a complete
matching of X into Y?

2. Cathy is liked by Albert, Joseph, and Robert; Janice by
Joseph and Dennis; Theresa by Albert and Joseph; Nettie by
Dennis, Joseph, and Frank; and Karen by Albert, Joseph, and
Robert. (a) Set up a bipartite graph to model the matching
problem where each man is paired with a woman he likes. (b) Draw
the associated network for the graph in part (a) and determine
a maximum flow for this network. What complete matching
does this determine? (c) Is there a complete matching that pairs
Janice with Dennis and Nettie with Frank? (d) Is it possible to
determine two complete matchings where each man is paired
with two different women?

3. At Rydell High School the senior class is represented on six
school committees by Annemarie (A), Gary (G), Jill (J), Kenneth
(K), Michael (M), Norma (N), Paul (P), and Rosemary
(R). The senior members of these committees are {A, G, J, P},
{G, J, K, R}, {A, M, N, P}, {A, G, M, N, P}, {A, G, K, N, R},
and {G, K, N, R}. (a) The student government calls a meeting
that requires the presence of exactly one senior member from
each committee. Find a selection that maximizes the number of
seniors involved. (b) Before the meeting, the finances of each
committee are to be reviewed by a senior who is not on that
committee. Can this be accomplished so that six different seniors
are involved in this review process? If so, how?

4. Let G = (V, E) be a bipartite graph with V partitioned as
X ∪ Y, where X = {x_1, x_2, ..., x_m} and Y = {y_1, y_2, ..., y_n}.
How many complete matchings of X into Y are there if
      a) m = 2, n = 4, and G = K_{m,n}?
      b) m = 4, n = 4, and G = K_{m,n}?
      c) m = 5, n = 9, and G = K_{m,n}?
      d) m ≤ n and G = K_{m,n}?

5. If G = (V, E) is an undirected graph, a spanning subgraph
H of G in which each vertex has degree 1 is called a one-factor
(or perfect matching) for G.
      a) If G has a one-factor, prove that |V| is even.
      b) Does the Petersen graph have a one-factor? (The Petersen
      graph was first introduced in Example 11.19.)
      c) In Fig. 13.31 we find the graph K_4 in part (a), while part
      (b) provides the three possible one-factors for K_4. How
      many one-factors are there for the graph K_6?
      d) For n ∈ Z⁺, let a_n count the number of one-factors that
      exist for the graph K_{2n}. Find and solve a recurrence relation
      for a_n.

Figure 13.31 [(a) the graph K_4 on the vertices a, b, c, d; (b) its three one-factors]

6. Prove Corollary 13.6.

7. Fritz is in charge of assigning students to part-time jobs at
the college where he works. He has 25 student applications, and
there are 25 different part-time jobs available on the campus.
Each applicant is qualified for at least four of the jobs, but each
job can be performed by at most four of the applicants. Can
Fritz assign all the students to jobs for which they are qualified?
Explain.

8. For each of the following collections of sets, determine, if
possible, a system of distinct representatives. If no such system
exists, explain why.
      a) A_1 = {2, 3, 4}, A_2 = {3, 4}, A_3 = {1}, A_4 = {2, 3}
      b) A_1 = A_2 = A_3 = {2, 4, 5}, A_4 = A_5 = {1, 2, 3, 4, 5}
      c) A_1 = {1, 2}, A_2 = {2, 3, 4}, A_3 = {2, 3}, A_4 = {1, 3},
      A_5 = {2, 4}

9. a) Determine all systems of distinct representatives for the
      collection of sets A_1 = {1, 2}, A_2 = {2, 3}, A_3 = {3, 4},
      A_4 = {4, 1}.
      b) Given the collection of sets A_1 = {1, 2}, A_2 =
      {2, 3}, ..., A_n = {n, 1}, determine how many different
      systems of distinct representatives exist for the collection.

10. Let A_1, A_2, ..., A_n be a collection of sets, where A_1 =
A_2 = ··· = A_n and |A_i| = k > 0 for all 1 ≤ i ≤ n. (a) Prove
that the given collection has a system of distinct representatives
if and only if n ≤ k. (b) When n ≤ k, how many different
systems exist for the collection?

11. Let G = (V, E) be a bipartite graph, where V is partitioned
as X ∪ Y. If deg(x) ≥ 4 for all x ∈ X and deg(y) ≤ 5 for all
y ∈ Y, prove that if |X| ≤ 10 then δ(G) ≤ 2.

12. Let G = (V, E) be bipartite with V partitioned as X ∪ Y.
For all x ∈ X, deg(x) ≥ 3, and for all y ∈ Y, deg(y) ≤ 7. If
|X| ≤ 50, find an upper bound (that is as small as possible) on
δ(G).

13. a) Let G = (V, E) be the bipartite graph shown in
      Fig. 13.32, with V partitioned as X ∪ Y. Determine δ(G)
      and a maximal matching of X into Y.

Figure 13.32 [bipartite graph with X = {x_1, x_2, x_3, x_4, x_5} and Y = {y_1, y_2, y_3, y_4}]

      b) For any bipartite graph G = (V, E), with V partitioned
      as X ∪ Y, if β(G) denotes the independence number of G,
      show that |Y| = β(G) − δ(G). (The independence number
      of an undirected graph is defined in Exercise 25 for
      Section 11.5.)
      c) Determine a largest maximal independent set of vertices
      for the graphs shown in Fig. 13.30(a) and Fig. 13.32.

14. For n ≥ 2, prove that the hypercube Q_n has at least 2^(2^(n−2))
perfect matchings (as defined above in Exercise 5).

13.5
   Summary and Historical Review
                         This chapter has provided us with a sample of the ways in which graph theory enters into an
                         area of mathematics called operations research. Each topic was presented in an algorithmic
                         manner that can be used in the computer implementation needed for solving each type of
                         problem. Comparable coverage of this material can be found in Chapters 10 and 11 of the
                         text by C. L. Liu [22]. Chapters 4 and 5 of E. Lawler [21] offer an extensive coverage of
                         many other developments on networks and matching. This text provides a wide variety of
                         applications and includes references for additional reading.
                             In Section 13.1 we examined a shortest-path algorithm for weighted graphs. The full
                         development of the algorithm is given in the article by E. W. Dijkstra [10].

Edsger W. Dijkstra (1930-2002)                           Joseph B. Kruskal (1928- )

Section 13.2 provided two techniques for finding a minimal spanning tree in a weighted
                         loop-free connected undirected graph. These techniques were developed in the late 1950s
                         by J. B. Kruskal [20] and R. C. Prim [25]. Actually, however, methods for constructing
                         minimal spanning trees can be traced back to 1926, to the work of Otakar Borůvka
                         dealing with the construction of an electric power network. Even before this (1909-1911) the
                         anthropologist Jan Czekanowski, in his work on various classification schemes, was very
                         close to recognizing the minimal spanning tree problem and to providing a greedy algorithm
                         for its solution. The survey paper by R. L. Graham and P. Hell [16] mentions the
                         contributions made by Borůvka and Czekanowski and gives more information on the history and
                         applications of this structure.
                              The computer implementation of all the techniques given in the first two sections can be
                         found in Chapters 6 and 7 of A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1]; in Chapter 8
                         of S. Baase and A. Van Gelder [3]; in Chapters 23 and 24 of T. H. Cormen, C. E. Leiserson,

R. L. Rivest, and C. Stein [7]; and in Chapter 4 of E. Horowitz and S. Sahni [17]. These
                      references also discuss the efficiency and speed of these algorithms. Sections 4.5-4.9 of
                      the text by R. K. Ahuja, T. L. Magnanti, and J. B. Orlin [2] provide more on different
                      implementations of Dijkstra’s algorithm, along with discussions on their features and worst-
                      case time-complexities. Six applications of the algorithm are described in Section 4.2 of
                      this text. As we mentioned at the end of Section 13.2, the articles by R. L. Graham and
                      P. Hell [16], by D. B. Johnson [18], and by A. Kershenbaum and R. Van Slyke [19] discuss
                      other implementations of Prim’s algorithm. An interesting application of the concept of the
                      minimal spanning tree in a physical science setting is provided in the article by D. R. Shier
                      [27]. Other applications are discussed in Section 13.2 of R. K. Ahuja, T. L. Magnanti, and
                      J. B. Orlin [2].
                          As we noted in Section 13.3, problems dealing with the allocation of resources or the
                      shipment of goods can be modeled by means of transport networks. The fundamental work
                      by G. B. Dantzig, L. R. Ford, and D. R. Fulkerson can be found in their pioneering articles
                      [8, 9, 12, 13]. The classic text by L. R. Ford and D. R. Fulkerson [14] provides excellent
                      coverage of this topic. In addition, the reader may wish to examine Chapter 6 of R. K.
                      Ahuja, T. L. Magnanti, and J. B. Orlin [2], Chapter 8 of the text by C. Berge [4], Chapter
                      7 of the book by R. G. Busacker and T. L. Saaty [6], or Chapter 26 of T. H. Cormen, C.
                      E. Leiserson, R. L. Rivest, and C. Stein [7]. Chapter 10 in C. L. Liu [22] includes coverage
                      on an extension to networks wherein the flow in each edge is restricted by a lower as well
                      as an upper capacity. For more applications the reader should examine the article by D. R.
                      Fulkerson on pages 139-171 of [15]. Section 6.2 of R. K. Ahuja, T. L. Magnanti, and J. B.
                      Orlin [2] contains six additional applications.
                          The last topic discussed here dealt with matching in a bipartite graph. The theory behind
                      this was first developed by Philip Hall in 1935, but here the ideas on transport networks were
                      used to provide an algorithm for a solution. Chapter 7 of the text by O. Ore [24] provides a
                      very readable introduction to this topic, along with some applications. For more on systems
                      of representatives, the reader should examine Chapter 5 of the monograph by H. J. Ryser
                      [26]. A second method for finding a maximal matching in a bipartite graph is called the
                      Hungarian method. This is given in Chapter 5 of the text by J. A. Bondy and U. S. R. Murty
                      [5] and in Chapter 10 of the book by C. Berge [4]. In addition to its application in solving
                      the assignment problem, matching theory has many interesting combinatorial implications.
                      One may learn more about these in the survey article by L. Mirsky and H. Perfect [23].

REFERENCES
                           1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
                              Reading, Mass.: Addison-Wesley, 1983.
                          2. Ahuja, Ravindra K., Magnanti, Thomas L., and Orlin, James B. Network Flows. Englewood
                            Cliffs, N.J.: Prentice Hall, 1993.
                          3. Baase, Sara, and Van Gelder, Allen. Computer Algorithms, Introduction to Design and Analysis,
                             3rd ed. Reading Mass.: Addison-Wesley, 2000.
                          4. Berge, Claude. The Theory of Graphs and Its Applications. New York: Wiley, 1962.
                          5. Bondy, J. A., and Murty, U. S. R. Graph Theory with Applications. New York: Elsevier North
                             Holland, 1976.
                          6. Busacker, Robert G., and Saaty, Thomas L. Finite Graphs and Networks. New York: McGraw-
                             Hill, 1965.
                          7. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction
                             to Algorithms, 2nd ed. New York: McGraw-Hill, 2001.
                          8. Dantzig, George B., and Fulkerson, Delbert Ray. Computation of Maximal Flows in Networks.
                             The RAND Corporation, P-677, 1955.

9. Dantzig, George B., and Fulkerson, Delbert Ray. On the Max Flow Min Cut Theorem. The
                                            RAND Corporation, RM-1418-1, 1955.
                                       10. Dijkstra, Edsger W. “A Note on Two Problems in Connexion with Graphs.” Numerische Math-
                                            ematik 1 (1959): pp. 269-271.
                                       11. Edmonds, Jack, and Karp, Richard M. “Theoretical Improvements in Algorithmic Efficiency
                                            for Network Flow Problems.” J. Assoc. Comput. Mach. 19 (1972): pp. 248-264.
                                       12. Ford, Lester R., Jr. Network Flow Theory. The RAND Corporation, P-923, 1956.
                                       13. Ford, Lester R., Jr., and Fulkerson, Delbert Ray. “Maximal Flow Through a Network.” Canadian
                                            Journal of Mathematics 8 (1956): pp. 399-404.
                                       14. Ford, Lester R., Jr., and Fulkerson, Delbert Ray. Flows in Networks. Princeton, N.J.: Princeton
                                            University Press, 1962.
                                       15. Fulkerson, Delbert Ray, ed. Studies in Graph Theory, Part I. MAA Studies in Mathematics,
                                            Vol. 11, The Mathematical Association of America, 1975.
                                       16. Graham, Ronald L., and Hell, Pavol. “On the History of the Minimum Spanning Tree Problem.”
                                           Annals of the History of Computing 7, no. 1 (January 1985): pp. 43-57.
                                       17. Horowitz, Ellis, and Sahni, Sartaj. Fundamentals of Computer Algorithms. Potomac, Md.: Com-
                                           puter Science Press, 1978.
                                       18. Johnson, D. B. “Priority Queues with Update and Minimum Spanning Trees.” Information
                                           Processing Letters 4 (1975): pp. 53-57.
                                       19. Kershenbaum, A., and Van Slyke, R. “Computing Minimum Spanning Trees Efficiently.” Pro-
                                           ceedings of the Annual ACM Conference, 1972, pp. 518-527.
                                       20. Kruskal, Joseph B. “On the Shortest Spanning Subtree of a Graph and the Traveling Salesman
                                           Problem.” Proceedings of the AMS 7, no. 1 (1956): pp. 48-50.
                                       21. Lawler, Eugene. Combinatorial Optimization: Networks and Matroids. New York: Holt, 1976.
                                       22. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
                                       23. Mirsky, L., and Perfect, H. “Systems of Representatives.” Journal of Mathematical Analysis
                                           and Applications 3 (1966): pp. 520-568.
                                       24. Ore, Oystein. Theory of Graphs. Providence, R.I.: American Mathematical Society, 1962.
                                       25. Prim, Robert C. “Shortest Connection Networks and Some Generalizations.” Bell System Tech-
                                           nical Journal 36 (1957): pp. 1389-1401.
                                       26. Ryser, Herbert J. Combinatorial Mathematics. Carus Mathematical Monographs, Number 14,
                                           Mathematical Association of America, 1963.
                                       27. Shier, Douglas R. “Testing for Homogeneity Using Minimum Spanning Trees.” The UMAP
                                           Journal 3, no. 3 (1982): pp. 273-283.

SUPPLEMENTARY EXERCISES
1. Apply Dijkstra's algorithm to the weighted directed multigraph
shown in Fig. 13.33, and find the shortest distance from
vertex a to the other seven vertices in the graph.

Figure 13.33

2. For her class in the analysis of algorithms, Stacy writes
the following algorithm to determine the shortest distance
from a vertex a to a vertex b in a weighted directed graph
G = (V, E).

   Step 1: Set Distance equal to 0, assign vertex a to the
   variable v and let T = V.
   Step 2: If v = b, the value of Distance is the answer to the
   problem. If v ≠ b, then
          1) Replace T by T − {v} and select w ∈ T with wt(v, w)
             minimal.
          2) Set Distance equal to Distance + wt(v, w).
          3) Assign vertex w to the variable v and then return to
             step (2).

Is Stacy's algorithm correct? If so, prove it. If not, provide
a counterexample.

3. a) Let G = (V, E) be a loop-free weighted connected undirected
   graph. If e_1 ∈ E with wt(e_1) < wt(e) for all other
   edges e ∈ E, prove that edge e_1 is part of every minimal
   spanning tree for G.
   b) With G as in part (a), suppose that there are edges
   e_1, e_2 ∈ E with wt(e_1) < wt(e_2) < wt(e) for all other edges
   e ∈ E. Prove or disprove: Edge e_2 is part of every minimal
   spanning tree for G.

4. a) Let G = (V, E) be a loop-free weighted connected undirected
   graph where each edge e of G is part of a cycle. Prove
   that if e_1 ∈ E with wt(e_1) > wt(e) for all other edges e ∈ E,
   then no spanning tree for G that contains e_1 can be minimal.
   b) With G as in part (a), suppose that e_1, e_2 ∈ E with
   wt(e_1) > wt(e_2) > wt(e) for all other edges e ∈ E. Prove or
   disprove: Edge e_2 is not part of any minimal spanning tree
   for G.

5. Using the concept of flow in a transport network, construct
a directed multigraph G = (V, E), with V = {u, v, w, x, y}
and id(u) = 1, od(u) = 3; id(v) = 3, od(v) = 3; id(w) = 3,
od(w) = 4; id(x) = 5, od(x) = 4; and id(y) = 4, od(y) = 2.

6. A set of words {qs, tg, ut, pgr, srt} is to be transmitted using
a binary code for each letter. (a) Show that it is possible to
select one letter from each word as a system of distinct
representatives for these words. (b) If a letter is selected at random
from each of the five words, what is the probability that the
selection is a system of distinct representatives for the words?

7. For n ∈ Z⁺ and for each 1 ≤ i ≤ n, let A_i = {1, 2,
3, ..., n} − {i}. How many different systems of distinct
representatives exist for the collection A_1, A_2, A_3, ..., A_n?

8. This exercise outlines a proof of the Birkhoff-von Neumann
Theorem.
   a) For n ∈ Z⁺, an n × n matrix is called a permutation matrix
   if there is exactly one 1 in each row and column, and all
   other entries are 0. How many 5 × 5 permutation matrices
   are there? How many n × n?
   b) An n × n matrix B is called doubly stochastic if b_ij ≥ 0
   for all 1 ≤ i ≤ n, 1 ≤ j ≤ n, and the sum of the entries in
   each row or column is 1. If

            | 0.2   0.1   0.7 |
        B = | 0.4   0.5   0.1 |,
            | 0.4   0.4   0.2 |

   verify that B is doubly stochastic.
   c) Find four positive real numbers c_1, c_2, c_3, and c_4, and four
   permutation matrices P_1, P_2, P_3, and P_4, such that c_1 + c_2 +
   c_3 + c_4 = 1 and B = c_1 P_1 + c_2 P_2 + c_3 P_3 + c_4 P_4.
   d) Part (c) is a special case of the Birkhoff-von Neumann
   Theorem: If B is an n × n doubly stochastic matrix, then
   there exist positive real numbers c_1, c_2, ..., c_k and
   permutation matrices P_1, P_2, ..., P_k such that Σ_{i=1}^{k} c_i = 1
   and Σ_{i=1}^{k} c_i P_i = B. To prove this result, proceed as
   follows: Construct a bipartite graph G = (V, E) with V
   partitioned as X ∪ Y, where X = {x_1, x_2, ..., x_n} and Y =
   {y_1, y_2, ..., y_n}. The vertex x_i, for all 1 ≤ i ≤ n,
   corresponds with the ith row of B; the vertex y_j, for all 1 ≤ j ≤ n,
   corresponds with the jth column of B. The edges of G are of
   the form {x_i, y_j} if and only if b_ij > 0. We claim that there
   is a complete matching of X into Y.
       If not, there is a subset A of X with |A| > |R(A)|. That
   is, there is a set of r rows of B having positive entries in s
   columns and r > s. What is the sum of these r rows of B?
   Yet the sum of these same entries, when added column by
   column, is less than or equal to s. (Why?) Consequently, we
   have a contradiction.
       As a result of the complete matching of X into Y, there
   are n positive entries in B that occur so that no two are in
   the same row or column. (Why?) If c_1 is the smallest of
   these entries, then we may write B = c_1 P_1 + B_1, where P_1
   is an n × n permutation matrix wherein the 1's are located
   according to the positive entries in B that came about from
   the complete matching. What are the sums of the entries in
   the rows and columns of B_1?
   e) How is the proof completed?

9. Let G = (V, E) be a bipartite graph with V partitioned as
X ∪ Y. If E' ⊆ E, and E' determines a complete matching of
X into Y, what property do the vertices determined by E' in
the line graph L(G) have? [The line graph L(G) for a loop-free
undirected graph G is defined in Supplementary Exercise 18 of
Chapter 11.]
PART 4
MODERN APPLIED ALGEBRA

14
Rings and Modular Arithmetic

In this fourth and final part of the text, the emphasis will be on structure again as we begin the investigation of sets of elements that are closed under two binary operations. The concepts of structure and enumeration often reinforce each other. Here we will see this occur as ideas seen in Chapters 1, 4, 5, and 8 come to the forefront again.
   When we examined the set Z in Chapter 4, it was in conjunction with the closed binary operations of addition and multiplication. In this chapter we emphasize these operations by writing (Z, +, ·), instead of just Z. Patterned after some of the properties of (Z, +, ·), the algebraic structure called a ring will be defined. Without knowing it, we have been dealing with rings in many mathematical settings. Now we shall be concerned with finite rings that arise in number theory and computer science applications. Of particular interest in the study of computer science is the hashing function, which we find provides a means of identifying records stored in a table.

14.1
          The Ring Structure:
       Definition and Examples
                    We start by defining the ring structure, realizing as we do that most abstract definitions, like
                    theorems, come about from a study of many examples where one recognizes the common
                    idea or ideas present in what may seem to be a collection of unrelated objects.

Definition 14.1     Let R be a nonempty set on which we have two closed binary operations, denoted by +
                    and · (which may be quite different from the ordinary addition and multiplication to which
                    we are accustomed). Then (R, +, ·) is a ring if for all a, b, c ∈ R, the following conditions
                    are satisfied:

                    a)   a + b = b + a                             Commutative Law of +
                    b)   a + (b + c) = (a + b) + c                 Associative Law of +
                    c)   There exists z ∈ R such that              Existence of an identity for +
                         a + z = z + a = a for every a ∈ R.
                    d)   For each a ∈ R there is an element        Existence of inverses under +
                         b ∈ R with a + b = b + a = z.
                    e)   a · (b · c) = (a · b) · c                 Associative Law of ·


                    f)   a · (b + c) = a · b + a · c               Distributive Laws of · over +
                         (b + c) · a = b · a + c · a
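
For a finite set, conditions (a) through (f) can be checked mechanically by trying every choice of a, b, c. The following is a minimal sketch of such a check; the function name `is_ring` and the two test structures (the integers 0 through 5 under arithmetic modulo 6, and the same set with `max` used as "addition") are illustrative assumptions and do not come from the text.

```python
from itertools import product

def is_ring(elements, add, mul):
    """Brute-force check of conditions (a)-(f) of Definition 14.1 on a finite set."""
    R = list(elements)
    S = set(R)
    pairs = list(product(R, repeat=2))
    # Both operations must be closed on R.
    if any(add(a, b) not in S or mul(a, b) not in S for a, b in pairs):
        return False
    # (a) commutative law of +
    if any(add(a, b) != add(b, a) for a, b in pairs):
        return False
    # (c) existence of an additive identity z
    zeros = [z for z in R if all(add(a, z) == a and add(z, a) == a for a in R)]
    if not zeros:
        return False
    z = zeros[0]
    # (d) every element has an additive inverse (+ is already known to commute)
    if any(all(add(a, b) != z for b in R) for a in R):
        return False
    for a, b, c in product(R, repeat=3):
        # (b) and (e): the two associative laws
        if add(a, add(b, c)) != add(add(a, b), c):
            return False
        if mul(a, mul(b, c)) != mul(mul(a, b), c):
            return False
        # (f): both distributive laws
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
        if mul(add(b, c), a) != add(mul(b, a), mul(c, a)):
            return False
    return True

# Illustrative structure 1: integers 0..5 with addition and multiplication mod 6.
mod6_is_ring = is_ring(range(6), lambda a, b: (a + b) % 6, lambda a, b: (a * b) % 6)
# Illustrative structure 2: same set, but "addition" is max, so inverses fail.
max_is_ring = is_ring(range(6), max, lambda a, b: (a * b) % 6)
print(mod6_is_ring, max_is_ring)
```

The check is exhaustive, so it only applies to finite sets; for an infinite set such as Z it can at best test a finite sample.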

Since the closed binary operations of + (ring addition) and · (ring multiplication) are
                            both associative, no ambiguity will arise if we write a + b + c for either (a + b) + c or
                            a + (b + c), or a · b · c for either (a · b) · c or a · (b · c). When dealing with the (closed)
                            binary operation of ring multiplication, we shall often write ab for a · b. In addition, we
                            can extend the associative laws (given in the definition of a ring) as we did in Exercises 8
                            and 9 of Section 4.2. Using mathematical induction, it can be verified that for all r, n ∈ Z+,
                            with n ≥ 3 and 1 ≤ r < n,

\[ (a_1 + a_2 + \cdots + a_r) + (a_{r+1} + \cdots + a_n) = a_1 + a_2 + \cdots + a_r + a_{r+1} + \cdots + a_n \]

and

\[ (a_1 a_2 \cdots a_r)(a_{r+1} \cdots a_n) = a_1 a_2 \cdots a_r a_{r+1} \cdots a_n, \]

                            where a_1, a_2, ..., a_r, a_{r+1}, ..., a_n are elements of a given ring (R, +, ·). In a corresponding
                            way, the distributive laws generalize as follows:

\[ a(b_1 + b_2 + \cdots + b_n) = ab_1 + ab_2 + \cdots + ab_n, \]
\[ (b_1 + b_2 + \cdots + b_n)a = b_1 a + b_2 a + \cdots + b_n a, \]

for arbitrary ring elements a, b_1, b_2, ..., b_n and all n ∈ Z+ where n ≥ 3.

In the next section we shall learn that the additive identity (or zero element) is unique,
                            as is the additive inverse of each ring element. For now, let us consider some examples of
                            rings.

EXAMPLE   14.1        Under the (closed) binary operations of ordinary addition and multiplication, we find that
                            Z, Q, R, and C are rings. In all of these rings the additive identity z is the integer 0, and
                            the additive inverse of each number x is the familiar −x.

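EXAMPLE   14.2        Let M2(Z) denote the set of all 2 × 2 matrices with integer entries. [The sets M2(Q),
                            M2(R), and M2(C) are defined similarly.] In M2(Z) two matrices are equal if their
                            corresponding entries are equal in Z.
                                     Here we define + and · by

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a+e & b+f \\ c+g & d+h \end{bmatrix}, \qquad \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{bmatrix}. \]

                            Under these (closed) binary operations, M2(Z) is a ring. Here $z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and the additive
                            inverse of $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is $\begin{bmatrix} -a & -b \\ -c & -d \end{bmatrix}$.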

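A few things happen here, however, that do not occur in the rings of Example 14.1. For
                            example,

\[ \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} \neq \begin{bmatrix} 1 & 3 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \]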

shows that multiplication need not be commutative in a ring. That is why there are two
                  distributive laws. Also,

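\[ \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \]

even though neither $\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ nor $\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}$ is the additive identity. Hence a ring may
                  contain what are called proper divisors of zero, that is, nonzero elements whose product
                  is the zero element of the ring.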
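
Both phenomena are easy to confirm by direct computation. The sketch below multiplies 2 × 2 integer matrices stored as tuples of rows; the helper name `mat_mul` and the pair C, D used for noncommutativity are illustrative choices made here, while A and B are the two zero divisors displayed above.

```python
def mat_mul(A, B):
    """Product of two 2 x 2 matrices, each given as a tuple of two rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

ZERO = ((0, 0), (0, 0))

# Proper divisors of zero: neither factor is the zero matrix, yet the product is.
A = ((1, -1), (-1, 1))
B = ((2, 1), (2, 1))
product_is_zero = mat_mul(A, B) == ZERO

# An illustrative pair (chosen for this sketch) whose two products differ.
C = ((0, 1), (0, 0))
D = ((0, 0), (1, 0))
CD, DC = mat_mul(C, D), mat_mul(D, C)

print(product_is_zero, CD, DC)
```

Note that the order of the factors also matters for the zero divisors: here AB is the zero matrix, but BA is not.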

We extend our study of the ring structure in the following.

Definition 14.2   Let (R, +, ·) be a ring.

                    a) If ab = ba for all a, b ∈ R, then R is called a commutative ring.
                    b) The ring R is said to have no proper divisors of zero if for all a, b ∈ R, ab = z ⇒ a = z
                       or b = z.
                    c) If an element u ∈ R is such that u ≠ z and au = ua = a for all a ∈ R, we call u a
                       unity, or multiplicative identity, of R. Here R is called a ring with unity.

It follows from part (c) of Definition 14.2 that whenever we have a ring R with unity,
                  then R contains at least two elements. Furthermore, if a ring has a unity we shall learn in
                  the next section that it is unique.
                      The rings in Example 14.1 are all commutative rings whose unity is the integer 1. None of
                  these rings has any proper divisors of zero. Meanwhile, the ring M2(Z) is a noncommutative
                  ring whose unity is the matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. This ring does contain proper divisors of zero.
                      Also, whenever we want to verify that a particular structure (R, +, ·) is a ring, we can
                  start by showing that R is closed under both binary operations. Then we can continue and
                  verify conditions (a)-(e) of Definition 14.1. Before we try to establish the distributive laws,
                  however, we might want to first determine if the multiplication operation is commutative.
                  Should we find this operation to be commutative, then we need only establish one of the
                  distributive laws (for the other will follow automatically). Further, if we are able to verify
                  all of the preceding conditions, then we'll know that (R, +, ·) is not just a ring, but a
                  commutative ring.
                      Now let us study another example as we further investigate the ideas set down in
                  Definitions 14.1 and 14.2.

EXAMPLE 14.3      Consider the set Z together with the binary operations of ⊕ and ⊙, which are defined by

                                         x ⊕ y = x + y − 1,          x ⊙ y = x + y − xy.

                  Consequently, here we find, for instance, that 3 ⊕ 7 = 3 + 7 − 1 = 9 and 3 ⊙ 7 = 3 + 7 − 3 · 7 = −11.
                     Since ordinary addition, subtraction, and multiplication are closed binary operations for
                  Z, these new binary operations, namely ⊕ and ⊙, are also closed for Z. In fact, we
                  shall find that (Z, ⊕, ⊙) is a ring.

a) In order to verify that (Z, ⊕, ⊙) is a ring we must establish the six conditions given in
                                 Definition 14.1. We shall examine three of these conditions and leave the other three
                                 for the Section Exercises.

                                  1) First, since ordinary addition is a commutative binary operation for Z, we find that
                                     for all x, y ∈ Z,

                                          x ⊕ y = x + y − 1 = y + x − 1 = y ⊕ x.

                                     So the binary operation ⊕ is also commutative for Z.
                                  2) When we examine condition (c) we realize that we need to find an integer z such
                                     that a ⊕ z = z ⊕ a = a, for every a in Z. Therefore, we must solve the equation
                                     a + z − 1 = a, which leads us to z = 1. Hence the nonzero integer 1 is the zero
                                     element (or additive identity) for ⊕.
                                  3) What about additive inverses? At this point if we are given an (arbitrary) integer
                                     a, we want to know if there is an integer b such that a ⊕ b = b ⊕ a = z. From
                                     part (2) above and the definition of ⊕ this says that the integer b must satisfy
                                     a + b − 1 = 1, and it follows that b = 2 − a. So, for instance, the additive inverse
                                     of 7 is 2 − 7 = −5 and the additive inverse for −42 is 2 − (−42) = 44. After all,
                                     in the case of 7 we find that 7 ⊕ (−5) = 7 + (−5) − 1 = 7 − 5 − 1 = 1, where 1
                                     is the additive identity. [Note: Since we showed in part (1) that ⊕ is commutative,
                                     we also know that (−5) ⊕ 7 = 1.]

b) Furthermore, the ring (Z, ⊕, ⊙) also possesses the additional properties given in
                                  Definition 14.2. In particular this ring has a unity (that is, a multiplicative identity). To
                                  determine the unity, let a be any integer and consider the element u (≠ z = 1) where

                                          a ⊙ u = u ⊙ a = a.

                                  Since a ⊙ u = a + u − au, we solve a + u − au = a to find that u(1 − a) = 0. Since
                                  a is arbitrary, this must hold even when a ≠ 1. Consequently, the integer u = 0 is the
                                  unity for the ring (Z, ⊕, ⊙).
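
The computations in Example 14.3 can be spot-checked numerically. The sketch below is only evidence on a finite sample of integers, not a proof; the names `oplus`, `odot`, and `check_sample`, as well as the sample range, are illustrative assumptions.

```python
from itertools import product

def oplus(x, y):
    """x (+) y = x + y - 1, the 'addition' of Example 14.3."""
    return x + y - 1

def odot(x, y):
    """x (.) y = x + y - xy, the 'multiplication' of Example 14.3."""
    return x + y - x * y

Z_ELEM = 1   # additive identity found in part (a)(2)
UNITY = 0    # unity found in part (b)

def check_sample(sample):
    """True if the ring conditions hold for every triple drawn from sample."""
    for a, b, c in product(sample, repeat=3):
        ok = (
            oplus(a, b) == oplus(b, a)                                  # (a)
            and oplus(a, oplus(b, c)) == oplus(oplus(a, b), c)          # (b)
            and oplus(a, Z_ELEM) == a                                   # (c)
            and oplus(a, 2 - a) == Z_ELEM                               # (d): inverse is 2 - a
            and odot(a, odot(b, c)) == odot(odot(a, b), c)              # (e)
            and odot(a, oplus(b, c)) == oplus(odot(a, b), odot(a, c))   # (f), left
            and odot(oplus(b, c), a) == oplus(odot(b, a), odot(c, a))   # (f), right
            and odot(a, UNITY) == a == odot(UNITY, a)                   # unity
        )
        if not ok:
            return False
    return True

values_from_text = (oplus(3, 7), odot(3, 7))
sample_ok = check_sample(range(-5, 6))
print(values_from_text, sample_ok)
```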

After these examples of infinite rings, we turn now to rings with finitely many elements.

EXAMPLE   14.4        Let U = {1, 2} and R = P(U). Define + and · on the elements of R by

                                                 A + B = A Δ B = {x | x ∈ A or x ∈ B, but not both}
                                                 A · B = A ∩ B = the intersection of sets A, B ⊆ U.

                            We form Tables 14.1(a) and (b) for these operations.
                               From results in Chapter 3, one finds that R satisfies conditions (a), (b), (e), and (f) of
                            Definition 14.1 for these (closed) binary operations of “addition” and “multiplication.” The
                            table for “addition” shows that ∅ is the additive identity. For each x ∈ R, the additive inverse
                            of x is x itself. The multiplication table is symmetric about the diagonal from the upper left
                            to the lower right, so the operation described by the table is commutative. The table also
                            indicates that R has unity U. So R is a finite commutative ring with unity. The elements
                            {1}, {2} provide an example of proper divisors of zero.
Table 14.1

(a)
  + (Δ) |  ∅      {1}    {2}    U
  ------+---------------------------
  ∅     |  ∅      {1}    {2}    U
  {1}   |  {1}    ∅      U      {2}
  {2}   |  {2}    U      ∅      {1}
  U     |  U      {2}    {1}    ∅

(b)
  · (∩) |  ∅      {1}    {2}    U
  ------+---------------------------
  ∅     |  ∅      ∅      ∅      ∅
  {1}   |  ∅      {1}    ∅      {1}
  {2}   |  ∅      ∅      {2}    {2}
  U     |  ∅      {1}    {2}    U
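
The claims made about this power-set ring can be confirmed by enumerating P(U) directly. In the sketch below the subsets are represented as Python `frozenset` objects, with `^` for symmetric difference and `&` for intersection; all variable names are illustrative assumptions.

```python
from itertools import combinations, product

U = frozenset({1, 2})
# R = P(U): every subset of U, stored as a frozenset.
R = [frozenset(c) for r in range(len(U) + 1) for c in combinations(sorted(U), r)]

def add(A, B):
    return A ^ B   # symmetric difference, the ring "addition"

def mul(A, B):
    return A & B   # intersection, the ring "multiplication"

EMPTY = frozenset()
zero_is_empty = all(add(A, EMPTY) == A for A in R)
self_inverse = all(add(A, A) == EMPTY for A in R)
unity_is_U = all(mul(A, U) == A == mul(U, A) for A in R)
commutative = all(mul(A, B) == mul(B, A) for A, B in product(R, repeat=2))
distributive = all(
    mul(A, add(B, C)) == add(mul(A, B), mul(A, C))
    for A, B, C in product(R, repeat=3)
)
# Ordered pairs of nonempty subsets whose product is the zero element.
zero_divisors = [
    (sorted(A), sorted(B))
    for A, B in product(R, repeat=2)
    if A and B and mul(A, B) == EMPTY
]
print(zero_is_empty, self_inverse, unity_is_U, commutative, distributive)
print(zero_divisors)
```

Changing `U` to a larger finite set gives a ring with 2^|U| elements under the same two operations.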

EXAMPLE    14.5   For R = {a, b, c, d, e} we define + and · by Tables 14.2(a) and (b).

Table 14.2

(a)
  + |  a   b   c   d   e
  --+--------------------
  a |  a   b   c   d   e
  b |  b   c   d   e   a
  c |  c   d   e   a   b
  d |  d   e   a   b   c
  e |  e   a   b   c   d

(b)
  · |  a   b   c   d   e
  --+--------------------
  a |  a   a   a   a   a
  b |  a   b   c   d   e
  c |  a   c   e   b   d
  d |  a   d   b   e   c
  e |  a   e   d   c   b

Although we do not verify them here, the 125 equalities needed to establish each of
                  the associative laws and the distributive laws all hold, so (R, +, ·) turns out to be a finite
                  commutative ring with unity, and it has no proper divisors of zero. The element a is the zero
                  (that is, the additive identity) of R, whereas b is the unity. Here every nonzero element x has a
                  multiplicative inverse y, where xy = yx = b, the unity. Elements c and d are multiplicative
                  inverses of each other; b is its own inverse, as is e.
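
The 125 equalities mentioned above are tedious by hand but quick to check by machine. The sketch below encodes the rows of Tables 14.2(a) and (b) as strings and tests every triple; the encoding and all variable names are illustrative assumptions.

```python
from itertools import product

ELEMS = "abcde"
ADD_ROWS = ["abcde", "bcdea", "cdeab", "deabc", "eabcd"]   # rows of Table 14.2(a)
MUL_ROWS = ["aaaaa", "abcde", "acebd", "adbec", "aedcb"]   # rows of Table 14.2(b)

def table(rows):
    """Turn a list of row strings into a lookup dictionary keyed by (x, y)."""
    return {(x, y): rows[i][j]
            for i, x in enumerate(ELEMS) for j, y in enumerate(ELEMS)}

add, mul = table(ADD_ROWS), table(MUL_ROWS)

triples = list(product(ELEMS, repeat=3))   # 5 * 5 * 5 = 125 triples
assoc_add = all(add[a, add[b, c]] == add[add[a, b], c] for a, b, c in triples)
assoc_mul = all(mul[a, mul[b, c]] == mul[mul[a, b], c] for a, b, c in triples)
distrib = all(mul[a, add[b, c]] == add[mul[a, b], mul[a, c]] for a, b, c in triples)

nonzero = ELEMS[1:]                         # a is the zero element
no_zero_divisors = all(mul[x, y] != "a" for x in nonzero for y in nonzero)
# Multiplicative inverse of each nonzero element (b is the unity).
inverses = {x: y for x in nonzero for y in ELEMS if mul[x, y] == "b"}

print(len(triples), assoc_add, assoc_mul, distrib, no_zero_divisors)
print(inverses)
```

Because the multiplication table is symmetric, checking one distributive law is enough here, as noted earlier in the section.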

We now consider the concept of a multiplicative inverse for a ring element in general.

Definition 14.3   Let R be a ring with unity u. If a ∈ R and there exists b ∈ R such that ab = ba = u, then
                  b is called a multiplicative inverse of a and a is called a unit of R. (The element b is also a
                  unit of R.)

In Section 14.2 we shall see that if a ring element does have a multiplicative inverse,
                  then it has only one such inverse. In the meantime, we’ll examine two special kinds of ring
                  structures.

Definition 14.4   Let R be a commutative ring with unity. Then
                    a) R is called an integral domain if R has no proper divisors of zero.
                    b) R is called a field if every nonzero element of R is a unit.

The ring (Z, +, ·) is an integral domain but not a field, while Q, R, C, under ordinary
                                addition and multiplication, are both integral domains and fields. The ring in Example 14.5
                                is both an integral domain and a field.
                                      It follows from part (c) of Definition 14.2 that if R is an integral domain or a field, then
                                |R| ≥ 2.

EXAMPLE 14.6      For our last ring of this section we let R = {s, t, v, w, x, y}, where + and · are given by
                  Tables 14.3(a) and (b).

Table 14.3

(a)
  + |  s   t   v   w   x   y
  --+------------------------
  s |  s   t   v   w   x   y
  t |  t   v   w   x   y   s
  v |  v   w   x   y   s   t
  w |  w   x   y   s   t   v
  x |  x   y   s   t   v   w
  y |  y   s   t   v   w   x

(b)
  · |  s   t   v   w   x   y
  --+------------------------
  s |  s   s   s   s   s   s
  t |  s   t   v   w   x   y
  v |  s   v   x   s   v   x
  w |  s   w   s   w   s   w
  x |  s   x   v   s   x   v
  y |  s   y   x   w   v   t

From these tables we see that (R, +, ·) is a commutative ring with unity, but it is neither
                                an integral domain nor a field. The element t is the unity, and t and y are the units of R.
                                      We also note that vv = vy, and even though v is not the zero element of R, we cannot
                                cancel and say that v = y. So a general ring does not satisfy the cancellation law of
                                multiplication that we may sometimes take for granted. We shall look at this idea again in the
                                next section.
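
The entries of Tables 14.3(a) and (b) agree with arithmetic modulo 6 when s, t, v, w, x, y are relabeled as 0, 1, 2, 3, 4, 5. Under that relabeling, which is an assumption of this sketch rather than something stated in the example, the units, the proper divisors of zero, and the failure of cancellation can be computed directly.

```python
NAMES = "stvwxy"   # assumed relabeling: s=0, t=1, v=2, w=3, x=4, y=5

def add(p, q):
    return NAMES[(NAMES.index(p) + NAMES.index(q)) % 6]

def mul(p, q):
    return NAMES[(NAMES.index(p) * NAMES.index(q)) % 6]

ZERO, UNITY = "s", "t"

# Units: elements that have a multiplicative inverse.
units = [p for p in NAMES if any(mul(p, q) == UNITY for q in NAMES)]

# Proper divisors of zero: nonzero p with some nonzero q such that pq is zero.
zero_divisors = [
    p for p in NAMES
    if p != ZERO and any(q != ZERO and mul(p, q) == ZERO for q in NAMES)
]

# Cancellation fails: vv = vy although v is nonzero and v != y.
cancellation_fails = mul("v", "v") == mul("v", "y") and "v" != "y"

print(units, zero_divisors, cancellation_fails)
```

Every nonzero element of this ring turns out to be either a unit or a proper divisor of zero.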

EXERCISES 14.1

1. Find the additive inverse for each element in the rings of Examples 14.5 and 14.6.

2. Determine whether or not each of the following sets of numbers is a ring under ordinary addition and multiplication.
      a) R = the set of positive integers and zero
      b) R = {kn | n ∈ Z, k a fixed positive integer}
      c) R = {a + b√2 | a, b ∈ Z}
      d) R = {a + b√2 + c√3 | a ∈ Z, b, c ∈ Q}

3. Let (R, +, ·) be a ring with a, b, c, d elements of R. State the conditions (from the definition of a ring) that are needed to prove each of the following results.
      a) (a + b) + c = b + (c + a)
      b) d + a(b + c) = ab + (d + ac)
      c) c(d + b) + ab = (a + c)b + cd
      d) a(bc) + (ab)d = (ab)(d + c)

4. For the set R in Example 14.4, keep A · B = A ∩ B, but define A + B = A ∪ B. Is (R, ∪, ∩) a ring?

5. Consider the set Z together with the binary operations ⊕ and ⊙ given in Example 14.3. (a) Verify the associative laws for ⊕ and ⊙ and the distributive laws in order to complete the work started in part (a) of Example 14.3. [This now establishes that (Z, ⊕, ⊙) is a ring.] (b) Is this ring commutative? (c) In part (b) of Example 14.3 we showed that 0 is the unity for (Z, ⊕, ⊙). What are the units for this ring? (d) Is this ring an integral domain? a field?

6. Define the binary operations ⊕ and ⊙ on Z by x ⊕ y = x + y − 7, x ⊙ y = x + y − 3xy, for all x, y ∈ Z. Explain why (Z, ⊕, ⊙) is not a ring.

7. Let k, m be fixed integers. Find all values for k, m for which (Z, ⊕, ⊙) is a ring under the binary operations x ⊕ y = x + y − k, x ⊙ y = x + y − mxy, where x, y ∈ Z.

8. Tables 14.4(a) and (b) make (R, +, ·) into a ring, where R = {s, t, x, y}. (a) What is the zero for this ring? (b) What is the additive inverse of each element? (c) What is t(s + xy)? (d) Is R a commutative ring? (e) Does R have a unity? (f) Find a pair of zero divisors.

Table 14.4

(a)
  + |  s   t   x   y
  --+----------------
  s |  y   x   s   t
  t |  x   y   t   s
  x |  s   t   x   y
  y |  t   s   y   x

(b)
  · |  s   t   x   y
  --+----------------
  s |  y   y   x   x
  t |  y   y   x   x
  x |  x   x   x   x
  y |  x   x   x   x

9. Define addition and multiplication, denoted by ⊕ and ⊙, respectively, on the set Q as follows. For a, b ∈ Q, a ⊕ b = a + b + 7, a ⊙ b = a + b + (ab/7). (a) Prove that (Q, ⊕, ⊙) is a ring. (b) Is this ring commutative? (c) Does the ring have a unity? What about units? (d) Is this ring an integral domain? a field?

10. Let (Q, ⊕, ⊙) denote the field where ⊕ and ⊙ are defined by
             a ⊕ b = a + b − k,          a ⊙ b = a + b + (ab/m),
for fixed elements k, m (≠ 0) of Q. Determine the value for k and the value for m in each of the following.
      a) The zero element for the field is 3.
      b) The additive inverse of the element 6 is −9.
      c) The multiplicative inverse of 2 is 1/8.

11. Let R = {a + bi | a, b ∈ Z, i² = −1}, with addition and multiplication defined by (a + bi) + (c + di) = (a + c) + (b + d)i and (a + bi)(c + di) = (ac − bd) + (bc + ad)i, respectively. (a) Verify that R is an integral domain. (b) Determine all units in R.

12. a) Determine the multiplicative inverse of the matrix [entries illegible in source] in the ring M2(Z); that is, find a, b, c, d so that the product of this matrix with $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$, taken in either order, is $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.
      b) Show that [entries illegible in source] is a unit in the ring M2(Q) but not a unit in M2(Z).

13. If $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ ∈ M2(R), prove that $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is a unit of this ring if and only if ad − bc ≠ 0.

14. Give an example of a ring with eight elements. How about one with 16 elements? Generalize.

15. For R = {s, t, x, y}, define + and ·, making R into a ring, by Table 14.5(a) for + and by the partial table for · in Table 14.5(b).

Table 14.5

(a)
  + |  s   t   x   y
  --+----------------
  s |  s   t   x   y
  t |  t   s   y   x
  x |  x   y   s   t
  y |  y   x   t   s

(b)
  · |  s   t   x   y
  --+----------------
  s |  s   s   s   s
  t |  s   t   ?   ?
  x |  s   t   ?   y
  y |  s   ?   s   ?

      a) Using the associative and distributive laws, determine the entries for the missing spaces in the multiplication table.
      b) Is this ring commutative?
      c) Does it have a unity? How about units?
      d) Is the ring an integral domain or a field?

14.2
                Ring Properties and Substructures
                                                       In each ring of Section 14.1 we were concerned with the zero element of the ring and the
                                                       additive inverse of each ring element. It is time now to show, along with other properties,
                                                       that these elements are truly unique.

THEOREM 14.1                                           In any ring (R, +, ·),
                                                            a) the zero element z is unique, and
                                                            b) the additive inverse of each ring element is unique.

Proof:
                                                            a) If R has more than one additive identity, let z_1, z_2 denote two such elements. Then

                                                                    z_1 = z_1 + z_2 = z_2,

                                                               where the first equality holds since z_2 is an additive identity, and the second
                                                               since z_1 is an additive identity.

b) For a ∈ R, suppose there are two elements b, c ∈ R where a + b = b + a = z and
                               a + c = c + a = z. Then b = b + z = b + (a + c) = (b + a) + c = z + c = c. (The
                               reader should supply the condition that establishes each equality.)

As a result of the uniqueness in part (b), from this point on we shall denote the additive
                          inverse of a ∈ R by −a. Further, we may now speak of subtraction in the ring, where we
                          understand that a − b = a + (−b).
                             From Theorem 14.1(b) we also obtain the following for any ring R.

THEOREM 14.2              The Cancellation Laws of Addition. For all a, b, c ∈ R,

                            a) a + b = a + c ⇒ b = c, and
                            b) b + a = c + a ⇒ b = c.

Proof:
                             a) Since a ∈ R, it follows that −a ∈ R and we have

                                    a + b = a + c ⇒ (−a) + (a + b) = (−a) + (a + c)
                                                  ⇒ [(−a) + a] + b = [(−a) + a] + c
                                                  ⇒ z + b = z + c ⇒ b = c.

                            b) We leave this similar proof for the reader.

Note that when we examine the addition table for a finite ring we find that each element
                          of the ring occurs exactly one time in each row and column of the table. This is a direct
                          consequence of Theorem 14.2 — where part (a) handles the rows and part (b) the columns.
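
This row-and-column property is simple to test for any finite operation table. The sketch below uses the integers modulo 6 as an assumed example ring (it is not one of the tables in the text) and contrasts its addition table, which has the property, with its multiplication table, which does not; the function name is hypothetical.

```python
def each_element_once(elements, op):
    """True if every element occurs exactly once in each row and column of op's table."""
    R = list(elements)
    target = sorted(R)
    rows_ok = all(sorted(op(a, b) for b in R) == target for a in R)   # row of a
    cols_ok = all(sorted(op(a, b) for a in R) == target for b in R)   # column of b
    return rows_ok and cols_ok

# Assumed example: the ring of integers modulo 6.
addition_ok = each_element_once(range(6), lambda a, b: (a + b) % 6)
# Multiplication has no cancellation law in general; the row of 0 is all zeros.
multiplication_ok = each_element_once(range(6), lambda a, b: (a * b) % 6)

print(addition_ok, multiplication_ok)
```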

THEOREM 14.3              For any ring (R, +, ·) and any a ∈ R, we have az = za = z.
                          Proof: If a ∈ R, then az = a(z + z) because z + z = z. Hence z + az = az = az + az.
                          (Why?) Using the cancellation law of addition, we have z = az.
                              The proof that za = z is done similarly.

The reader may feel that the result of Theorem 14.3 is obvious. But we are not dealing
                          with just Z or Q or M2(Z). Our objective is to show that any ring satisfies such a result,
                          and to get the result we may only use the conditions in the definition of a ring and whatever
                          properties we've derived for arbitrary rings up to this point.

The uniqueness of additive inverses [from part (b) of Theorem 14.1] now implies the
                          following result.

THEOREM 14.4              Given a ring (R, +, ·), for all a, b ∈ R,

                            a)     −(−a) = a,
                            b)     a(−b) = (−a)b = −(ab), and
                            c)     (−a)(−b) = ab.

Proof:

a) By the convention stated after Theorem 14.1, −(−a) denotes the additive inverse of −a. Since (−a) + a = z, a is also an additive inverse for −a. Consequently, by the uniqueness of such inverses, −(−a) = a.
b) We shall prove that a(−b) = −(ab) and shall leave the other part for the reader. We know that −(ab) denotes the additive inverse of ab. However, ab + a(−b) = a[b + (−b)] = az = z, by Theorem 14.3, so by the uniqueness of additive inverses, a(−b) = −(ab).
c) Here we establish an idea we have used in algebra since our first encounter with signed numbers. “Minus times minus does indeed equal plus,” and the proof follows from the properties and definition of a ring. From part (b) we have (−a)(−b) = −[a(−b)] = −[−(ab)], and the result then follows from part (a).

For the operation of multiplication one also finds the following, which is comparable to
               Theorem 14.1.

THEOREM 14.5  For a ring (R, +, ·),

a) if R has a unity, then it is unique, and
b) if R has a unity, and x is a unit of R, then the multiplicative inverse of x is unique.

Proof: The proofs of these results are left to the reader.

As a result of this theorem, when (R, +, ·) is a ring with unity, we shall denote the unity
               by u. Furthermore, in such a ring the multiplicative inverse of each unit x will be denoted
               by x⁻¹. Also, one may now restate the definition of a field as a commutative ring F with
               unity, such that for all x ∈ F, x ≠ z ⇒ x⁻¹ ∈ F.
                   With this notion to assist us, we examine some further properties and relations between
               fields and integral domains.

THEOREM 14.6  Let (R, +, ·) be a commutative ring with unity. Then R is an integral domain if and only if, for all a, b, c ∈ R where a ≠ z, ab = ac ⇒ b = c. (Hence, a commutative ring with unity that satisfies the cancellation law of multiplication is an integral domain.)
Proof: If R is an integral domain and x, y ∈ R, then xy = z ⇒ x = z or y = z. Now if ab = ac, then ab − ac = a(b − c) = z, and because a ≠ z, it follows that b − c = z or b = c. Conversely, if R is commutative with unity and R satisfies multiplicative cancellation, then let a, b ∈ R with ab = z. If a = z, we are finished. If not, as az = z, we can write ab = az and conclude that b = z. So there are no proper divisors of zero and R is an integral domain.

Before going on, let us realize that the cancellation law of multiplication does not imply
               the existence of multiplicative inverses. The integral domain (Z, +, ·) satisfies multiplicative
               cancellation, but it contains only two elements, namely 1 and −1, that are units.
               Hence, an integral domain need not be a field. But what about a field? Is it necessarily an
               integral domain?

THEOREM 14.7  If (F, +, ·) is a field, then it is an integral domain.
Proof: Let a, b ∈ F with ab = z. If a = z, we are finished. If not, a has a multiplicative inverse a⁻¹ because F is a field. Then

ab = z ⇒ a⁻¹(ab) = a⁻¹z ⇒ (a⁻¹a)b = a⁻¹z ⇒ ub = z ⇒ b = z.

Hence F has no proper divisors of zero and is an integral domain.

In Chapter 5 we found that functions f: A → A could be one-to-one (or onto) without
                              being onto (or one-to-one). However, if A were finite, such a function f was one-to-one if
                              and only if it was onto. (See Theorem 5.11.) The same situation occurs with finite integral
                              domains. An integral domain need not be a field, but when it is finite we find that this
                              structure is a field.

THEOREM 14.8  A finite integral domain (D, +, ·) is a field.
Proof: Since D is finite, we may list the elements of D as {d₁, d₂, ..., dₙ}. For d ∈ D, where d ≠ z, we have dD = {dd₁, dd₂, ..., ddₙ} ⊆ D because D is closed under multiplication. Now |D| = n and dD ⊆ D, so if we could show that dD contains n elements, we would have dD = D. If |dD| < n, then ddᵢ = ddⱼ for some 1 ≤ i < j ≤ n. But since D is an integral domain and d ≠ z, we have dᵢ = dⱼ, when they are supposed to be distinct. So dD = D and for some 1 ≤ k ≤ n, ddₖ = u, the unity of D. Then ddₖ = u ⇒ d is a unit of D, and since d was chosen arbitrarily, it follows that (D, +, ·) is a field.

From the proof of Theorem 14.8 we also realize that when we are dealing with the non-
                              zero elements of a finite field, the multiplication table for these elements is such that each
                              element of the field occurs exactly once in each of the rows and columns.
                                  In the next section we shall look at finite fields that are useful in discrete mathematics.
                              Before closing this section, however, let us examine some special subsets of a ring.
                                 When we were dealing with finite state machines in Chapter 6, we saw instances where
                              subsets of the set of internal states gave rise to machines on their own (when the next state
                              and output functions of the original machines were suitably restricted). These were called
                              submachines. Since closed binary operations are special kinds of functions, we encounter
                              a similar idea in the following definition.

Definition 14.5  For a ring (R, +, ·), a nonempty subset S of R is called a subring of R if (S, +, ·), that is, S under the addition and multiplication of R restricted to S, is a ring.

EXAMPLE 14.7  For every ring R, the subsets {z} and R are always subrings of R.

EXAMPLE 14.8
a) The set of all even integers is a subring of (Z, +, ·). In fact, for each n ∈ Z⁺, nZ = {nx | x ∈ Z} is a subring of (Z, +, ·).
b) (Z, +, ·) is a subring of (Q, +, ·), which is a subring of (R, +, ·), which is a subring of (C, +, ·).

EXAMPLE 14.9  In Example 14.6, the subsets S = {s, w} and T = {s, v, x} are subrings of R.

The next result characterizes those subsets of a ring that are subrings.

THEOREM 14.9  Given a ring (R, +, ·), a nonempty subset S of R is a subring of R if and only if

1) for all a, b ∈ S, we have a + b, ab ∈ S (that is, S is closed under the binary operations of addition and multiplication defined on R), and
2) for all a ∈ S, we have −a ∈ S.

Proof: If (S, +, ·) is a subring of R, then in its own right it satisfies all the conditions of a ring. Hence it satisfies conditions 1 and 2 of the theorem. Conversely, let S be a nonempty subset of R that satisfies conditions 1 and 2. Conditions (a), (b), (e), and (f) of the definition of a ring are inherited by the elements of S, because they are also elements of R. Thus, all we need to verify here is that S has an additive identity. Now S ≠ ∅, so there is an element a ∈ S, and by condition 2, −a ∈ S. Then by condition 1, z = a + (−a) ∈ S.

EXAMPLE 14.10  Consider the ring (Z, ⊕, ⊙) that we examined in Example 14.3 and Exercise 5 of Section 14.1. Here we have x ⊕ y = x + y − 1 and x ⊙ y = x + y − xy. Now consider the subset S = {..., −5, −3, −1, 1, 3, 5, ...} of all odd integers. Since, for example, 3 and 5 are in S but the ordinary sum 3 + 5 = 8 ∉ S, this set S is not a subring of (Z, +, ·). However, 3 ⊕ 5 = 3 + 5 − 1 = 7 ∈ S. In fact, for all a, b ∈ S we have a ⊕ b = a + b − 1, where a + b is even and a + b − 1 is odd, so a ⊕ b ∈ S. Also, a ⊙ b = a + b − ab, where a + b is even and ab is odd, so a ⊙ b ∈ S. Finally, −a [the additive inverse of a in the ring (Z, ⊕, ⊙)] is equal to 2 − a, which is odd whenever a is odd. Consequently, if a ∈ S then −a ∈ S, and it follows from Theorem 14.9 that S is a subring of (Z, ⊕, ⊙).
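The closure argument of Example 14.10 can be spot-checked numerically. The following is a minimal Python sketch, not part of the text: the function names oplus, odot, and add_inverse are hypothetical, and a finite sample of odd integers stands in for the infinite set S, so the run illustrates the proof rather than replacing it.

```python
# Hypothetical helper names; the two operations are those of Example 14.10.
def oplus(x, y):
    """Ring addition: x (+) y = x + y - 1."""
    return x + y - 1

def odot(x, y):
    """Ring multiplication: x (.) y = x + y - xy."""
    return x + y - x * y

def add_inverse(a):
    """Additive inverse in this ring: the zero element is 1, so -a = 2 - a."""
    return 2 - a

def is_odd(k):
    return k % 2 != 0

odds = range(-25, 26, 2)  # a finite sample of S: -25, -23, ..., 25

# Condition 1 of Theorem 14.9: closure under both operations.
closed = all(
    is_odd(oplus(a, b)) and is_odd(odot(a, b))
    for a in odds for b in odds
)

# Condition 2 of Theorem 14.9: the additive inverse of an odd integer is odd.
inverses_ok = all(
    is_odd(add_inverse(a)) and oplus(a, add_inverse(a)) == 1
    for a in odds
)

print(closed, inverses_ok)
```

Both checks succeed on the sample, in agreement with the parity argument given in the example.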

Note that (Z⁺, +, ·) satisfies condition 1 in Theorem 14.9, but not condition 2, so it is
                  not a subring of (Z, +, ·).
                     The result in Theorem 14.9 can also be given as follows.

THEOREM 14.10  For any ring (R, +, ·), if ∅ ≠ S ⊆ R,

a) then (S, +, ·) is a subring of R if and only if for all a, b ∈ S, we have a − b ∈ S and ab ∈ S;
b) and if S is finite, then (S, +, ·) is a subring of R if and only if for all a, b ∈ S, we have a + b, ab ∈ S. (Once again, additional help comes from a finiteness condition.)

Proof: These proofs we leave for the reader.

The next example demonstrates how one might use the first part of the preceding theorem.

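EXAMPLE 14.11  Let us consider the ring R = M₂(Z) and the subset

\[ S = \left\{ \begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} \;\middle|\; x, y \in \mathbf{Z} \right\} \]

of R. When x = y = 0 it follows that \(\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \in S\), and S ≠ ∅. So now we examine any two elements of S, namely, two matrices of the form

\[ \begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} v & v+w \\ v+w & v \end{bmatrix}, \]

where x, y, v, w ∈ Z. We find that

\[ \begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} - \begin{bmatrix} v & v+w \\ v+w & v \end{bmatrix} = \begin{bmatrix} x-v & (x-v)+(y-w) \\ (x-v)+(y-w) & x-v \end{bmatrix}, \]

so S is closed under subtraction. Turning to multiplication we have

\[ \begin{aligned}
\begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} \begin{bmatrix} v & v+w \\ v+w & v \end{bmatrix}
&= \begin{bmatrix} xv + (x+y)(v+w) & x(v+w) + (x+y)v \\ (x+y)v + x(v+w) & (x+y)(v+w) + xv \end{bmatrix} \\
&= \begin{bmatrix} xv + xv + yv + xw + yw & xv + xw + xv + yv \\ xv + yv + xv + xw & xv + yv + xw + yw + xv \end{bmatrix} \\
&= \begin{bmatrix} xv + xv + yv + xw + yw & (xv + xv + yv + xw + yw) + (-yw) \\ (xv + xv + yv + xw + yw) + (-yw) & xv + xv + yv + xw + yw \end{bmatrix},
\end{aligned} \]

so S is also closed under multiplication.
Appealing now to part (a) of Theorem 14.10, one finds that S is a subring of R.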
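The two closure computations in Example 14.11 can also be checked by machine on a range of small integers. This is a minimal Python sketch under stated assumptions: the helper names make, in_S, sub, and mul are hypothetical, matrices are stored as nested tuples, and membership in S is tested by the observation that a matrix lies in S exactly when its two diagonal entries agree and its two off-diagonal entries agree (take x as the diagonal entry and y as the off-diagonal entry minus x).

```python
from itertools import product

def make(x, y):
    """The element of S determined by x and y: [[x, x+y], [x+y, x]]."""
    return ((x, x + y), (x + y, x))

def in_S(m):
    """Membership test: equal diagonal entries and equal off-diagonal entries."""
    return m[0][0] == m[1][1] and m[0][1] == m[1][0]

def sub(m, k):
    """Entrywise difference of two 2 x 2 matrices."""
    return tuple(tuple(m[i][j] - k[i][j] for j in range(2)) for i in range(2))

def mul(m, k):
    """Ordinary product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(m[i][t] * k[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )

sample = range(-3, 4)
closed = True
for x, y, v, w in product(sample, repeat=4):
    a, b = make(x, y), make(v, w)
    # Theorem 14.10(a): a - b and ab must both lie in S.
    closed = closed and in_S(sub(a, b)) and in_S(mul(a, b))
    # The product has the form derived in the example:
    # diagonal entry 2xv + yv + xw + yw, off-diagonal entry that value minus yw.
    closed = closed and mul(a, b) == make(2*x*v + y*v + x*w + y*w, -y*w)

print(closed)
```

The final comparison confirms the last display of the example: the product is again of the required shape, with the roles of x and y played by 2xv + yv + xw + yw and −yw.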

We shall now single out an important type of subring.

Definition 14.6  A nonempty subset I of a ring R is called an ideal of R if for all a, b ∈ I and all r ∈ R, we have (a) a − b ∈ I and (b) ar, ra ∈ I.

An ideal is a subring, but the converse does not necessarily hold: (Z, +, ·) is a subring of (Q, +, ·) but not an ideal because, for example, (1/2)9 ∉ Z for (1/2) ∈ Q, 9 ∈ Z. Meanwhile, all the subrings in Example 14.8(a) are ideals of (Z, +, ·).
Looking back to Example 14.10 we see that if a ∈ S, x ∈ Z, then a ⊙ x = a + x − ax (= x ⊙ a), and if x is even (because the case for x odd has already been covered within Example 14.10), then a + x is odd and ax is even, making a + x − ax odd. Consequently, for all a ∈ S and all x ∈ Z, a ⊙ x and x ⊙ a are in S, so S is an ideal of the ring (Z, ⊕, ⊙).

EXERCISES 14.2

1. Complete the proofs of Theorems 14.2, 14.4, 14.5, and 14.10.

2. If a, b, and c are any elements in a ring (R, +, ·), prove that (a) a(b − c) = ab − (ac) = ab − ac and (b) (b − c)a = ba − (ca) = ba − ca.

3. a) If R is a ring with unity and a, b are units of R, prove that ab is a unit of R and that (ab)⁻¹ = b⁻¹a⁻¹.
   b) For the ring M₂(Z), find A⁻¹, B⁻¹, (AB)⁻¹, (BA)⁻¹, and B⁻¹A⁻¹ if A and B are the 2 × 2 matrices whose second rows are (1  2) and (2  1), respectively [first-row entries illegible].

4. Prove that a unit in a ring R cannot be a proper divisor of zero.

5. If a is a unit in ring R, prove that −a is also a unit in R.

6. a) Verify that the subsets S = {s, w} and T = {s, v, x} are subrings of the ring R in Example 14.6. (The binary operations for the elements of S, T are those given in Table 14.3.)
   b) Are the subrings in part (a) ideals of R?

7. Let S and T be subrings of a ring R. Prove that S ∩ T is a subring of R.

8. Let R = M₂(Z) and let S be the subset of R where S = { [matrix entries illegible] | x, y ∈ Z }. Prove that S is a subring of R.

9. Let (R, +, ·) be a ring. If S, T₁, and T₂ are subrings of R, and S ⊆ T₁ ∪ T₂, prove that S ⊆ T₁ or S ⊆ T₂.

10. a) Let (R, +, ·) be a finite commutative ring with unity u. If r ∈ R and r is not the zero element of R, prove that r is either a unit or a proper divisor of zero.
    b) Does the result in part (a) remain valid when R is infinite?

11. a) For R = M₂(Z), prove that

\[ S = \left\{ \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} \;\middle|\; a \in \mathbf{Z} \right\} \]

    is a subring of R.
    b) What is the unity of R?
    c) Does S have a unity?
    d) Does S have any properties that R does not have?
    e) Is S an ideal of R?

12. Let S and T be the following subsets of the ring R = M₂(Z):
    S = { [matrix entries illegible] | a, b, c ∈ Z };
    T = { [matrix entries illegible] | a, b, c, d ∈ Z }.
    a) Verify that S is a subring of R. Is it an ideal?
    b) Verify that T is a subring of R. Is it an ideal?

13. Let (R, +, ·) be a commutative ring, and let z denote the zero element of R. For a fixed element a ∈ R, define N(a) = {r ∈ R | ra = z}. Prove that N(a) is an ideal of R.

14. Let R be a commutative ring with unity u, and let I be an ideal of R. (a) If u ∈ I, prove that I = R. (b) If I contains a unit of R, prove that I = R.

15. If R is a field, how many ideals does R have?

16. Let (R, +, ·) be the (finite) commutative ring with unity given by Tables 14.6(a) and (b).

Table 14.6

(a)   + | z  u  a  b          (b)   · | z  u  a  b
      z | z  u  a  b                z | z  z  z  z
      u | u  z  b  a                u | z  u  a  b
      a | a  b  z  u                a | z  a  b  u
      b | b  a  u  z                b | z  b  u  a

    a) Verify that R is a field.
    b) Find a subring of R that is not an ideal.
    c) Let x and y be unknowns. Solve the following system of linear equations in R: bx + y = u; x + by = z.

17. Let R be a commutative ring with unity u.
    a) For any (fixed) a ∈ R, prove that aR = {ar | r ∈ R} is an ideal of R.
    b) If the only ideals of R are {z} and R, prove that R is a field.

18. Let (S, +, ·) and (T, +′, ·′) be two rings. For R = S × T, define addition “⊕” and multiplication “⊙” by

        (s₁, t₁) ⊕ (s₂, t₂) = (s₁ + s₂, t₁ +′ t₂),
        (s₁, t₁) ⊙ (s₂, t₂) = (s₁ · s₂, t₁ ·′ t₂).

    a) Prove that under these closed binary operations, R is a ring.
    b) If both S and T are commutative, prove that R is commutative.
    c) If S has unity u_S and T has unity u_T, what is the unity of R?
    d) If S and T are fields, is R also a field?

19. Let (R, +, ·) be a ring with unity u, and |R| = 8. On R⁴ = R × R × R × R, define + and · as suggested by Exercise 18. In the ring R⁴, (a) how many elements have exactly two nonzero components? (b) how many elements have all nonzero components? (c) is there a unity? (d) how many units are there if R has four units?

20. Let (R, +, ·) be a ring, with a ∈ R. Define 0a = z, 1a = a, and (n + 1)a = na + a, for all n ∈ Z⁺. (Here we are multiplying elements of R by elements of Z, so we have yet another operation that is different from the multiplications in either of Z or R.) For n > 0, we define (−n)a = n(−a), so, for example, (−3)a = 3(−a) = 2(−a) + (−a) = [(−a) + (−a)] + (−a) = [−(a + a)] + (−a) = −[(a + a) + a] = −[2a + a] = −(3a).
    For all a, b ∈ R, and all m, n ∈ Z, prove that
    a) ma + na = (m + n)a
    b) m(na) = (mn)a
    c) n(a + b) = na + nb
    d) n(ab) = (na)b = a(nb)
    e) (ma)(nb) = (mn)(ab) = (na)(mb)

21. a) For ring (R, +, ·) and each a ∈ R, we define a¹ = a, and aⁿ⁺¹ = aⁿa, for all n ∈ Z⁺. Prove that for all m, n ∈ Z⁺, (aᵐ)(aⁿ) = aᵐ⁺ⁿ and (aᵐ)ⁿ = aᵐⁿ.
    b) Can you suggest how we might define a⁰ or a⁻ⁿ, n ∈ Z⁺, including any necessary conditions that R must satisfy for these definitions to make sense?

14.3
              The Integers Modulo n
                              Enough abstraction for a while! We shall now concentrate on the construction and use of
                              special finite rings and fields.

Definition 14.7  Let n ∈ Z⁺, n > 1. For a, b ∈ Z, we say that a is congruent to b modulo n, and we write a ≡ b (mod n), if n | (a − b), or, equivalently, a = b + kn for some k ∈ Z.

EXAMPLE 14.12
  i) We find that 17 ≡ 2 (mod 5), since 17 − 2 = 15 = 3(5), or 17 = 2 + 3(5).
 ii) As −7 + 49 = −7 − (−49) = 42 = 7(6) [or, −7 = −49 + 7(6)], we have −7 ≡ −49 (mod 6).
iii) Since 11 − (−5) = 16 = 2(8) [or, 11 = −5 + 2(8)], it follows that 11 ≡ −5 (mod 8).
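Definition 14.7 translates directly into a divisibility test. The following minimal Python sketch (the names is_congruent and witness are hypothetical) checks the three congruences of Example 14.12 and recovers the integer k with a = b + kn in each case.

```python
def is_congruent(a, b, n):
    """True when n divides a - b, that is, when a is congruent to b modulo n."""
    if n <= 1:
        raise ValueError("the modulus must be an integer greater than 1")
    return (a - b) % n == 0

def witness(a, b, n):
    """Return the integer k with a = b + k*n, or None if a and b are not congruent."""
    return (a - b) // n if is_congruent(a, b, n) else None

# The three cases of Example 14.12, as (a, b, n).
for a, b, n in [(17, 2, 5), (-7, -49, 6), (11, -5, 8)]:
    k = witness(a, b, n)
    print(f"{a} = {b} + {k}({n})")
```

The values of k found are 3, 7, and 2, the same multipliers that appear in the example.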

Before we examine our first theorem, let us make three observations about this new concept of congruence modulo n. Here, as above, we have a, b, n ∈ Z, with n > 1.

  i) Using the division algorithm, we can write a = q₁n + r₁ and b = q₂n + r₂, with 0 ≤ r₁ < n, 0 ≤ r₂ < n. So a − b = (q₁ − q₂)n + (r₁ − r₂). Then, if a ≡ b (mod n), it follows that n | (a − b), and, consequently, n | (r₁ − r₂). But with 0 ≤ |r₁ − r₂| < n, we now find that r₁ = r₂.
     Hence, if a ≡ b (mod n), then a, b have the same remainder upon division by n.
 ii) The converse of the result in (i) is also true. That is, if a = q₁n + r₁ and b = q₂n + r₂, with r₁ = r₂, then a − b = (q₁ − q₂)n and a ≡ b (mod n).
iii) Although a = b ⇒ a ≡ b (mod n), we cannot expect a ≡ b (mod n) ⇒ a = b. However, if a ≡ b (mod n) and a, b ∈ {0, 1, 2, ..., n − 1}, then a = b.

THEOREM 14.11                 Congruence modulo n is an equivalence relation on Z.
                              Proof: The proof is left for the reader.

Since an equivalence relation on a set induces a partition of that set, for n ≥ 2, congruence modulo n partitions Z into the n equivalence classes

    [0] = {..., −2n, −n, 0, n, 2n, ...} = {0 + nx | x ∈ Z}
    [1] = {..., −2n + 1, −n + 1, 1, n + 1, 2n + 1, ...} = {1 + nx | x ∈ Z}
    [2] = {..., −2n + 2, −n + 2, 2, n + 2, 2n + 2, ...} = {2 + nx | x ∈ Z}
     ⋮
    [n − 1] = {..., −n − 1, −1, n − 1, 2n − 1, 3n − 1, ...} = {(n − 1) + nx | x ∈ Z}.

For all t ∈ Z, by the division algorithm (of Section 4.3) we can write t = qn + r, where 0 ≤ r < n, so t ∈ [r], or [t] = [r]. We use the notation Zₙ to denote {[0], [1], [2], ..., [n − 1]}. (When there is no danger of ambiguity, we often replace [a] by a and write Zₙ = {0, 1, 2, ..., n − 1}.) Our objective now is to define closed binary operations of addition and multiplication on the set Zₙ of equivalence classes so that we obtain a ring.

For [a], [b] ∈ Zₙ, define + and · by

    [a] + [b] = [a + b]    and    [a] · [b] = [a][b] = [ab].

For example, if n = 7, then [2] + [6] = [2 + 6] = [8] = [1], and [2][6] = [12] = [5].
Before these definitions are so readily accepted, we must investigate whether or not these (closed binary) operations are well-defined in the sense that if [a] = [c], [b] = [d], then [a] + [b] = [c] + [d] and [a][b] = [c][d]. Since [a] = [c] can occur with a ≠ c, do the results of our addition and multiplication depend on which representatives are chosen from the equivalence classes? We shall prove that the results of the two operations are independent of the choice of class representatives and that the operations are very definitely well-defined.
First, we observe that [a] = [c] ⇒ a = c + sn, for some s ∈ Z, and [b] = [d] ⇒ b = d + tn, for some t ∈ Z. Hence

    a + b = (c + sn) + (d + tn) = c + d + (s + t)n,

so (a + b) ≡ (c + d) (mod n) and [a + b] = [c + d]. Also,

    ab = (c + sn)(d + tn) = cd + (sd + ct + stn)n

and ab ≡ cd (mod n), or [ab] = [cd].
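The independence from class representatives can be observed experimentally. Here is a minimal Python sketch, assuming the modulus n = 7 of the example above and a hypothetical helper cls that returns the remainder r with 0 ≤ r < n as the canonical name of a class. It swaps each representative for another member of the same class and confirms that the sum and product classes are unchanged.

```python
n = 7  # the modulus used in the example above

def cls(a):
    """Canonical representative of [a] in Z_n: the remainder of a on division by n."""
    return a % n

# The worked values: [2] + [6] = [1] and [2][6] = [5] in Z_7.
print(cls(2 + 6), cls(2 * 6))

# Replace a and b by other representatives a + s*n and b + t*n of the same classes
# and confirm that the resulting sum and product classes do not change.
well_defined = all(
    cls(a + b) == cls((a + s * n) + (b + t * n))
    and cls(a * b) == cls((a + s * n) * (b + t * n))
    for a in range(n) for b in range(n)
    for s in range(-3, 4) for t in range(-3, 4)
)
print(well_defined)
```

A finite search of this kind is only an illustration; the algebraic identity ab = cd + (sd + ct + stn)n is what settles the matter for all integers.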
                   This result now leads us to the following.

THEOREM 14.12  For n ∈ Z⁺, n > 1, under the closed binary operations defined above, Zₙ is a commutative ring with unity [1] (and additive identity [0]).
Proof: The proof is left to the reader. Verification of the ring properties follows from the definitions of addition and multiplication in Zₙ and from the corresponding properties for the ring (Z, +, ·).

Before stating any further results, let us examine two particular examples, Z₅ and Z₆. In
                Tables 14.7(a) and (b) and 14.8(a) and (b), we simplify [a] by writing a.

Table 14.7  (Z₅)

(a)   + | 0  1  2  3  4          (b)   · | 0  1  2  3  4
      0 | 0  1  2  3  4                0 | 0  0  0  0  0
      1 | 1  2  3  4  0                1 | 0  1  2  3  4
      2 | 2  3  4  0  1                2 | 0  2  4  1  3
      3 | 3  4  0  1  2                3 | 0  3  1  4  2
      4 | 4  0  1  2  3                4 | 0  4  3  2  1

In Z₅ every nonzero element has a multiplicative inverse, so Z₅ is a field. For Z₆, however, 1 and 5 are the only units and 2, 3, 4 are proper divisors of zero. Meanwhile, in Z₉, 3 · 3 = 3 · 6 = 0, so 3 and 6 are proper divisors of zero. Consequently, for Zₙ, n ≥ 2, to be a field, we need more than just an odd modulus.

Table 14.8  (Z₆)

(a)   + | 0  1  2  3  4  5          (b)   · | 0  1  2  3  4  5
      0 | 0  1  2  3  4  5                0 | 0  0  0  0  0  0
      1 | 1  2  3  4  5  0                1 | 0  1  2  3  4  5
      2 | 2  3  4  5  0  1                2 | 0  2  4  0  2  4
      3 | 3  4  5  0  1  2                3 | 0  3  0  3  0  3
      4 | 4  5  0  1  2  3                4 | 0  4  2  0  4  2
      5 | 5  0  1  2  3  4                5 | 0  5  4  3  2  1
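Tables such as 14.7 and 14.8 can be generated mechanically from the definitions [a] + [b] = [a + b] and [a][b] = [ab]. The following minimal Python sketch (the function names cayley_tables, show, and proper_zero_divisors are hypothetical) builds both tables for a given modulus and lists the proper divisors of zero by direct search.

```python
def cayley_tables(n):
    """Return the addition and multiplication tables of Z_n as lists of rows."""
    elems = range(n)
    add = [[(a + b) % n for b in elems] for a in elems]
    mul = [[(a * b) % n for b in elems] for a in elems]
    return add, mul

def show(table, symbol):
    """Print a table with a header row and a label at the start of each row."""
    n = len(table)
    print(f" {symbol} | " + " ".join(str(b) for b in range(n)))
    for a, row in enumerate(table):
        print(f" {a} | " + " ".join(str(v) for v in row))

def proper_zero_divisors(n):
    """Nonzero a for which a*b = 0 in Z_n for some nonzero b."""
    return [a for a in range(1, n) if any((a * b) % n == 0 for b in range(1, n))]

for n in (5, 6):
    add, mul = cayley_tables(n)
    show(add, "+")
    show(mul, "*")
    print("proper divisors of zero:", proper_zero_divisors(n))
```

For n = 5 the list of proper divisors of zero is empty, while for n = 6 it is 2, 3, 4, and for n = 9 the same function returns 3 and 6.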

THEOREM 14.13  Zₙ is a field if and only if n is a prime.
Proof: Let n be a prime, and suppose that 0 < a < n. Then gcd(a, n) = 1, so as we learned in Section 4.4 there are integers s, t with as + tn = 1. Thus as ≡ 1 (mod n), or [a][s] = [1], and [a] is a unit of Zₙ, which is consequently a field.
Conversely, if n is not a prime, then n = n₁n₂, where 1 < n₁, n₂ < n. So [n₁] ≠ [0] and [n₂] ≠ [0] but [n₁][n₂] = [n₁n₂] = [0], and Zₙ is not even an integral domain, so it cannot be a field.
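Theorem 14.13 lends itself to a brute-force check for small moduli. The following is a minimal Python sketch, with hypothetical function names is_prime and is_field; it tests the field property straight from the definition (every nonzero class has a multiplicative inverse) and compares the outcome with primality of n.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small moduli examined here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))

def is_field(n):
    """Z_n is a field when every nonzero class a has some b with ab = 1 in Z_n."""
    return all(
        any((a * b) % n == 1 for b in range(1, n))
        for a in range(1, n)
    )

# Compare the two properties for every modulus from 2 through 60.
agree = all(is_field(n) == is_prime(n) for n in range(2, 61))
print(agree)
print([n for n in range(2, 31) if is_field(n)])
```

The second line of output lists the moduli up to 30 for which Zₙ is a field; they are exactly the primes in that range.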

In Z₆, [5] is a unit and [3] is a zero divisor. We seek a way to recognize when [a] is a
                             unit in Zₙ, for n composite.

THEOREM 14.14  In Zₙ, [a] is a unit if and only if gcd(a, n) = 1.
Proof: If gcd(a, n) = 1, the result follows as in the proof of Theorem 14.13. For the converse, let [a] ∈ Zₙ and [a]⁻¹ = [s]. Then [as] = [a][s] = [1], so as ≡ 1 (mod n) and as = 1 + tn, for some t ∈ Z. But 1 = as + n(−t) ⇒ gcd(a, n) = 1.
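The criterion of Theorem 14.14 can be compared against a direct search for inverses. In this minimal Python sketch the names units_by_search and units_by_gcd are hypothetical; math.gcd is the standard-library greatest common divisor.

```python
from math import gcd

def units_by_search(n):
    """Classes a in Z_n (written 1..n-1) that have some s with as = 1 in Z_n."""
    return [a for a in range(1, n) if any((a * s) % n == 1 for s in range(1, n))]

def units_by_gcd(n):
    """Classes a in Z_n (written 1..n-1) with gcd(a, n) = 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

# The two descriptions of the units should coincide for every modulus tried.
same = all(units_by_search(n) == units_by_gcd(n) for n in range(2, 80))
print(same)
print(units_by_gcd(6), units_by_gcd(12))
```

For n = 6 both functions return 1 and 5, in agreement with Table 14.8(b).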

EXAMPLE 14.13  Find [25]⁻¹ in Z₇₂.
Since gcd(25, 72) = 1, the Euclidean algorithm leads us to

    72 = 2(25) + 22,    0 < 22 < 25
    25 = 1(22) + 3,     0 < 3 < 22
    22 = 7(3) + 1,      0 < 1 < 3.

As 1 is the last nonzero remainder, we have

    1 = 22 − 7(3) = 22 − 7[25 − 22] = (−7)(25) + (8)(22)
      = (−7)(25) + 8[72 − 2(25)] = 8(72) − 23(25).

But

    1 = 8(72) − 23(25) ⇒ 1 ≡ (−23)(25) ≡ (−23 + 72)(25) (mod 72),

so [1] = [49][25] and [25]⁻¹ = [49] in Z₇₂.
                                 In addition, from this result we are now able to solve the following linear congruences
                             for x:

1) If 25x ≡ 1 (mod 72), then x ≡ 49 (mod 72).
2) If 25x ≡ 3 (mod 72), then x ≡ 49 · 3 (mod 72) ≡ 3 (mod 72).
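The back-substitution carried out by hand in Example 14.13 is what the extended Euclidean algorithm automates. The following minimal Python sketch uses hypothetical function names (extended_gcd, inverse_mod, solve_linear_congruence); it is an iterative formulation chosen for illustration and is not taken from the text.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t = g, for positive a, b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r   # remainders of the Euclidean algorithm
        old_s, s = s, old_s - q * s   # running coefficient of a
        old_t, t = t, old_t - q * t   # running coefficient of b
    return old_r, old_s, old_t

def inverse_mod(a, n):
    """Return the representative in 0..n-1 of the inverse of [a] in Z_n."""
    g, s, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError(f"[{a}] is not a unit in Z_{n}")
    return s % n

def solve_linear_congruence(a, c, n):
    """Solve ax = c (mod n) for x when [a] is a unit: x = a^(-1) * c (mod n)."""
    return (inverse_mod(a, n) * c) % n

print(extended_gcd(25, 72))            # coefficients with 25*s + 72*t = 1
print(inverse_mod(25, 72))
print(solve_linear_congruence(25, 1, 72), solve_linear_congruence(25, 3, 72))
```

The coefficients returned for 25 and 72 are −23 and 8, matching 1 = 8(72) − 23(25), and the inverse and the two solutions agree with the values 49, 49, and 3 found above. In Python 3.8 and later, the built-in call pow(25, -1, 72) gives the same inverse.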

Now [25] is a unit in Z_{72}, but is there any way of knowing how many units this ring has? From Theorem 14.14, if 1 ≤ a < 72, then [a]^{-1} exists if and only if gcd(a, 72) = 1. Consequently, the number of units in Z_{72} is the number of integers a such that 1 ≤ a < 72 and gcd(a, 72) = 1. Using Euler’s phi function (Example 8.8), we find that this is

    φ(72) = φ(2^3 · 3^2) = (72)[1 − (1/2)][1 − (1/3)] = (72)(1/2)(2/3) = 24.

In general, for any n ∈ Z^+, n > 1, there are φ(n) units and n − 1 − φ(n) proper divisors of zero in Z_n.
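
For a modulus as small as 72 the count can also be checked by brute force. The sketch below (Python; the helper names units and proper_zero_divisors are hypothetical) relies on Theorem 14.14 for the units and on the fact that a nonzero [a] with gcd(a, n) = d > 1 satisfies [a][n/d] = [0].

```python
from math import gcd


def units(n):
    """Representatives a, 1 <= a < n, with gcd(a, n) = 1 (the units of Z_n)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def proper_zero_divisors(n):
    """Nonzero representatives a with gcd(a, n) > 1."""
    return [a for a in range(1, n) if gcd(a, n) > 1]


print(len(units(72)))                 # 24
print(len(proper_zero_divisors(72)))  # 47 = 72 - 1 - 24
```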

Before we continue with some examples where congruence plays a role, we want to look back at the binary operation mod that was introduced earlier in Examples 4.36 and 10.8. In those examples we considered x, y ∈ Z^+ and defined x mod y as the remainder obtained when we divide x by y. At this point we shall extend this concept to include the case where x ≤ 0. Hence, for x ∈ Z and y ∈ Z^+, x mod y is the remainder that results upon division of x by y.
But, now, how is mod related to the mod of Definition 14.7? Here we find that if a, b, n ∈ Z, with n > 1, then a ≡ b (mod n) if and only if a mod n = b mod n. (This follows from the observations we made prior to Theorem 14.11.)
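
As a small illustration (a sketch only): Python's % operator, for a positive modulus, returns the nonnegative remainder described here even when the first argument is negative, so the equivalence can be spot-checked directly. The function name congruent and the sample triples are our own.

```python
def congruent(a, b, n):
    """True when n divides a - b, that is, a ≡ b (mod n)."""
    return (a - b) % n == 0


print(-7 % 3)    # 2, since -7 = (-3)(3) + 2
print(-43 % 8)   # 5, since -43 = (-6)(8) + 5

# a ≡ b (mod n) exactly when a mod n = b mod n, on a few sample triples.
for a, b, n in [(-43, 5, 8), (17, -9, 13), (10, 3, 4)]:
    assert congruent(a, b, n) == (a % n == b % n)
```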

And now the time has arrived for some additional examples.

EXAMPLE 14.14
Randomly generated numbers arise in many applications. In particular, they are often used for the computer simulation of experiments that are too expensive, too dangerous, or just plain impossible to conduct in the real world.
The idea of using a computer to generate random numbers was first developed by John von Neumann (1903-1957) in 1946. However, although these numbers may appear to be random, they are not; hence the title pseudorandom numbers.
Proposed in 1949 by Derrick H. Lehmer (1905-1991), the most commonly used technique for generating such pseudorandom numbers employs the notion of congruence. For the linear congruential generator, one starts with four integers: the multiplier a, the increment c, the modulus m, and the seed x_0, where

    2 ≤ a < m,    0 ≤ c < m,    and    0 ≤ x_0 < m.

These nonnegative integers are used to generate a sequence of pseudorandom numbers, x_1, x_2, x_3, ..., recursively, by

    x_{n+1} = (a x_n + c) mod m.

So 0 ≤ x_{n+1} < m, for n ≥ 0. For example, if a = 3, c = 2, m = 11, and x_0 = 1, then

    x_1 = (a x_0 + c) mod m = [3(1) + 2] mod 11 = 5, so x_1 = 5.

Similarly, x_2 = (a x_1 + c) mod m = [3(5) + 2] mod 11 = 17 mod 11 = 6, so x_2 = 6.

Continuing in this manner, one finds that x_3 = 9, x_4 = 7, and x_5 = 1, the seed. Consequently, this linear congruential generator produces five distinct integers before repeating. The sequence of pseudorandom numbers thus obtained is 1, 5, 6, 9, 7, 1, 5, 6, ....
With a = 3, c = 5, m = 12, and x_0 = 6, we first learn that x_1 = [3(6) + 5] mod 12 = 11, so x_1 = 11. Next, x_2 = [3(11) + 5] mod 12 = 38 mod 12 = 2, so x_2 = 2. Further computation yields x_3 = 11. This time the linear congruential generator yields only three distinct integers before repeating. The sequence of pseudorandom numbers generated here is 6, 11, 2, 11, 2, 11, 2, ..., where the seed is not repeated.
In practice large values for a and m are used, especially for critical simulations. For a = 16,807 (= 7^5), c = 0, m = 2,147,483,647 (= 2^31 − 1, a prime), and x_0 = 1, one obtains a sequence of 2,147,483,646 (= m − 1) pseudorandom numbers before a repeated integer appears. (With c = 0 and m prime, the value 0 never occurs, so m − 1 is the largest number of distinct terms possible.)
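
The recursion is short enough to try out directly. The following is a minimal sketch, assuming Python; the generator name lcg is hypothetical. It yields the seed x_0 first and then x_1, x_2, ..., and it reproduces the two small sequences computed in this example.

```python
from itertools import islice


def lcg(a, c, m, seed):
    """Yield x_0, x_1, x_2, ... where x_{n+1} = (a*x_n + c) mod m."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m


print(list(islice(lcg(3, 2, 11, 1), 8)))   # [1, 5, 6, 9, 7, 1, 5, 6]
print(list(islice(lcg(3, 5, 12, 6), 7)))   # [6, 11, 2, 11, 2, 11, 2]
```

The small moduli here make the short cycles visible at once; they are meant only to mirror the hand computation, not to serve as a usable generator.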

EXAMPLE 14.15
a) Whether it’s youngsters using decoder rings or military leaders sending battle plans to troops, throughout history, various people have wanted to keep certain information unintelligible, should it fall into the wrong hands.
As early as the first century B.C., the Roman general Gaius Julius Caesar (100 B.C.-44 B.C.) used a cipher shift to make the contents of certain messages understandable only for those he intended the messages to reach. To describe this early form of cryptosystem, often termed the Caesar cipher, we shall make certain conventions to simplify the presentation. First, we shall write the original message, the plaintext, using only lowercase letters, with no punctuation or spaces. Then to encrypt the plaintext, each lowercase letter, from a to w, is shifted to the letter three places forward in the alphabet, and the last three letters (namely, x, y, and z) are shifted to the first three letters, respectively. We use the uppercase letters for the resulting ciphertext. Consequently, a is encrypted as D, b as E, c as F, ..., j as M, ..., m as P, ..., y as B, and z as C.
If Caesar wanted to inform a senator in Rome of a recent victory, he might have sent the message “I came, I saw, I conquered.” Encryption of this message takes place as follows:

    Plaintext     i c a m e i s a w i c o n q u e r e d
    Ciphertext    L F D P H L V D Z L F R Q T X H U H G

                                   Upon receiving the ciphertext, as long as this senator knows the size and direction
                                   of the shift, he can reverse the process. Decryption then results by replacing each
                                   uppercase letter, from D to Z, in the ciphertext by the lowercase letter three places
                                   back in the alphabet, and A by x, B by y, and C by z. After decrypting, one then
                                   inserts the appropriate spaces and punctuation in the plaintext. (Note that by removing
                                   spaces in the plaintext, the resulting absence of spaces in the ciphertext helps make
                                   the message more unintelligible. If one does not know the size and direction of the
                                   shift for decryption, the presence of spaces may suggest certain information about the
                                   structure of the original message.)
b) The idea of Caesar’s cipher can be generalized and modeled mathematically by using the concept of congruence. Start by assigning each of the 26 letters of the plaintext a nonnegative integer as shown:

    a   b   c   d   ...   k   l   m   n   ...   w   x   y   z
    0   1   2   3   ...  10  11  12  13   ...  22  23  24  25

The 26 letters for the ciphertext are assigned the same integers; that is, A is assigned 0, B is assigned 1, ..., Y is assigned 24, and Z is assigned 25.
Now select a nonnegative integer κ, where 0 ≤ κ ≤ 25. For instance, Caesar chose κ = 3. This integer κ is called the key and helps us define the encrypting function E: Z_{26} → Z_{26} as follows. Given a letter of the plaintext, let θ be the nonnegative integer to which it corresponds. Then E(θ) = (θ + κ) mod 26 and this result determines the corresponding ciphertext letter for the plaintext letter assigned the nonnegative integer θ. To decrypt we apply the inverse function D: Z_{26} → Z_{26} where we write D(θ) = (θ − κ) mod 26. Replacing each nonnegative integer with its corresponding plaintext letter, one captures the plaintext version of the original message.
If we do not know the key, a trial-and-error approach can be used. There are 26 possibilities, one for each of the 26 possible values of κ. A more efficient method of attack takes into account the most frequently occurring letters in the alphabet and the most frequently occurring letters in the ciphertext. In the English language, the letter e occurs most often, with t, a, o, and i the next four most frequently occurring letters.
Now if a parent receives the ciphertext Z L U K T V Y L T V U L F from a college student, and does not know the key, what can the parent do? Since the most frequently occurring letter in the ciphertext is L, the parent can correspond e with L under the encryption. This suggests that E: Z_{26} → Z_{26} be defined by E(θ) = (θ + 7) mod 26, since L is seven places after e in the alphabet. So here the key, κ, is 7 and the decryption function is D: Z_{26} → Z_{26} with D(θ) = (θ − 7) mod 26.
Decoding the ciphertext message received by the parent can be analyzed as follows:

    (1)    Z   L   U   K   T   V   Y   L   T   V   U   L   F
    (2)   25  11  20  10  19  21  24  11  19  21  20  11   5
    (3)   18   4  13   3  12  14  17   4  12  14  13   4  24
    (4)    s   e   n   d   m   o   r   e   m   o   n   e   y

Here (1) provides the given (encrypted) ciphertext. In (2) each ciphertext letter is re-
     placed by the nonnegative integer assigned to it. Upon applying the decryption function
     D, the results in (2) provide the assignments in (3). Replacing each nonnegative integer
     in (3) by its corresponding plaintext letter yields the original message

“Send more money.”
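
The shift cipher and the frequency-based guess at its key can be sketched in a few lines. This is an illustrative Python sketch, not part of the original presentation; the names shift_encrypt, shift_decrypt, and guess_key are hypothetical, and guess_key simply assumes that the most frequent ciphertext letter stands for e (an assumption that may well fail on short messages).

```python
from collections import Counter


def shift_encrypt(plaintext, key):
    """E(θ) = (θ + key) mod 26; lowercase plaintext in, uppercase ciphertext out."""
    return "".join(chr((ord(ch) - ord("a") + key) % 26 + ord("A")) for ch in plaintext)


def shift_decrypt(ciphertext, key):
    """D(θ) = (θ - key) mod 26; uppercase ciphertext in, lowercase plaintext out."""
    return "".join(chr((ord(ch) - ord("A") - key) % 26 + ord("a")) for ch in ciphertext)


def guess_key(ciphertext, common="e"):
    """Guess the key by matching the most frequent ciphertext letter with `common`."""
    most_frequent, _ = Counter(ciphertext).most_common(1)[0]
    return (ord(most_frequent) - ord("A") - (ord(common) - ord("a"))) % 26


print(shift_encrypt("icameisawiconquered", 3))   # LFDPHLVDZLFRQTXHUHG

cipher = "ZLUKTVYLTVULF"
key = guess_key(cipher)
print(key, shift_decrypt(cipher, key))           # 7 sendmoremoney
```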

c) The security of the shift cipher in part (b) can be slightly enhanced by means of the affine cipher. The letters of the plaintext and ciphertext are assigned nonnegative integers, as in part (b). Here, however, the encryption function E is given by E(θ) = (αθ + κ) mod 26, where 0 ≤ α, κ ≤ 25, and gcd(α, 26) = 1.
If θ_1, θ_2 ∈ Z_{26}, then E(θ_1) = E(θ_2) ⇒ (αθ_1 + κ) mod 26 = (αθ_2 + κ) mod 26 ⇒ αθ_1 mod 26 = αθ_2 mod 26 ⇒ θ_1 = θ_2, by Theorem 14.14. So E is one-to-one. Further, E is also onto and invertible, by Theorem 5.11, because Z_{26} is finite.

Let us consider a specific example. Suppose α = 11 and κ = 7. Then the encryption of the plaintext letter g proceeds as follows:

  i) g is assigned the nonnegative integer 6;
 ii) applying E, we have E(6) = (11 · 6 + 7) mod 26 = 73 mod 26 = 21; and
iii) the nonnegative integer 21 determines the ciphertext letter V.

[So using this affine cipher, where E(θ) = (11θ + 7) mod 26, the plaintext letter g is encrypted as the ciphertext letter V.]
                          Now suppose we have the following ciphertext for a message encrypted by an affine
                      cipher:

QYYFGCULBLKYZVOSTCOY                                            PURGCULYZYWKYOSTCOYL

With no knowledge of α or κ, one might have to examine as many as [φ(26)](26) = [26(1 − 1/2)(1 − 1/13)](26) = [26(1/2)(12/13)](26) = (12)(26) = 312 cases for the key α, κ. However, let’s say that by some means (perhaps by considering the frequencies of occurrence for the letters in the plaintext and ciphertext) we deduce two correspondences. Specifically, we know that e and Y correspond, as do t and R. In addition, the nonnegative integers 4 and 19 are the replacements for the plaintext letters e and t, respectively, while 24 and 17 are the respective replacements for Y and R, in the ciphertext, so the encryption function E is determined as follows:

1) The correspondence of e (4) and Y (24) tells us that E(4) = (4α + κ) mod 26 = 24.
2) The correspondence of t (19) and R (17) tells us that E(19) = (19α + κ) mod 26 = 17.

Consequently, 15α mod 26 = [(19α + κ) − (4α + κ)] mod 26 = [E(19) − E(4)] mod 26 = (17 − 24) mod 26 = −7 mod 26 = 19. Since 15 · 7 = 105 = 1 + 104 = 1 + 4(26), we have 15 · 7 ≡ 1 (mod 26), so 15^{-1} = 7 (in Z_{26}). Then 15α ≡ 19 (mod 26) ⇒ α ≡ 15^{-1} · 19 (mod 26), and 7 · 19 mod 26 = 133 mod 26 = 3, as 133 = 3 + 5(26). With α = 3 it now follows from (1) that κ = (24 − 4α) mod 26 = (24 − 12) mod 26 = 12. [Or, from (2), κ = (17 − 19α) mod 26 = (17 − 57) mod 26 = −40 mod 26 = 12.]
Consequently, E: Z_{26} → Z_{26} is defined by E(θ) = (3θ + 12) mod 26 and the decryption function D: Z_{26} → Z_{26} is given by D(θ) = (9θ + 22) mod 26, since E^{-1}(θ) = 3^{-1}(θ − 12) mod 26 = 9(θ − 12) mod 26 = (9θ − 108) mod 26 = (9θ + 22) mod 26. This function D is used in the following to obtain the results in row 3 from the nonnegative integers (that replace the ciphertext letters) in row 2.

    (1) Ciphertext    Q   Y   Y   F   G   C   U   L   B   L   K   Y   Z   V   O   S   T   C   O   Y
    (2)              16  24  24   5   6   2  20  11   1  11  10  24  25  21  14  18  19   2  14  24
    (3)              10   4   4  15  24  14  20  17   5  17   8   4  13   3  18   2  11  14  18   4
    (4) Plaintext     k   e   e   p   y   o   u   r   f   r   i   e   n   d   s   c   l   o   s   e

    (1) Ciphertext    P   U   R   G   C   U   L   Y   Z   Y   W   K   Y   O   S   T   C   O   Y   L
    (2)              15  20  17   6   2  20  11  24  25  24  22  10  24  14  18  19   2  14  24  11
    (3)               1  20  19  24  14  20  17   4  13   4  12   8   4  18   2  11  14  18   4  17
    (4) Plaintext     b   u   t   y   o   u   r   e   n   e   m   i   e   s   c   l   o   s   e   r

Here, for example, the ciphertext letter Q is replaced by the nonnegative integer 16. Applying the decryption function D to 16 we have D(16) = (9 · 16 + 22) mod 26 = 166 mod 26 = 10, and 10 is the nonnegative integer that corresponds to the plaintext letter k.
                          The decrypted message now reveals the sage advice given by Don Vito Corleone (of
                      Mario Puzo’s The Godfather) to his youngest son, Michael — namely, “Keep your friends
                      close but your enemies closer.”
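
The key recovery in part (c) amounts to solving two linear congruences, which can be sketched as follows. This is an illustrative Python sketch with hypothetical names (affine_key_from_pairs, affine_decrypt); it assumes Python 3.8 or later, where the built-in pow(x, -1, m) returns a modular inverse, and it assumes the difference of the two plaintext integers is a unit in Z_{26}, as 15 is here.

```python
def affine_key_from_pairs(p1, c1, p2, c2, m=26):
    """Solve c = (alpha*p + kappa) mod m from two (plaintext, ciphertext) integer pairs."""
    diff_inv = pow(p2 - p1, -1, m)        # needs gcd(p2 - p1, m) = 1
    alpha = (diff_inv * (c2 - c1)) % m
    kappa = (c1 - alpha * p1) % m
    return alpha, kappa


def affine_decrypt(ciphertext, alpha, kappa, m=26):
    """D(θ) = alpha^(-1) * (θ - kappa) mod m; uppercase in, lowercase out."""
    alpha_inv = pow(alpha, -1, m)
    return "".join(
        chr((alpha_inv * (ord(ch) - ord("A") - kappa)) % m + ord("a"))
        for ch in ciphertext
    )


# e (4) corresponds to Y (24), and t (19) corresponds to R (17).
alpha, kappa = affine_key_from_pairs(4, 24, 19, 17)
print(alpha, kappa)   # 3 12
print(affine_decrypt("QYYFGCULBLKYZVOSTCOYPURGCULYZYWKYOSTCOYL", alpha, kappa))
# keepyourfriendsclosebutyourenemiescloser
```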

The security of each of the cryptosystems in Example 14.15 depends on the key [κ = 3 in part (a), κ in part (b), and α, κ in part (c)]. For such private key cryptosystems, the two people wishing to use the system need to securely exchange the key. Should any unauthorized person discover the key, then that person could readily encrypt or decrypt messages.

Our next example deals with modular exponentiation.

EXAMPLE 14.16
In the study of cryptology* one often needs to perform modular exponentiation to compute a result such as b^e mod n, where b, e, and n are large integers. To demonstrate this, on a somewhat smaller scale, let us determine 5^{143} mod 222. We realize that it is rather inefficient to actually compute 5^{143} (a very large integer) and then find the remainder upon dividing the result (for 5^{143}) by 222. A more efficient approach starts with the binary representation for the exponent (here, 143). With

    143 = 1(128) + 0(64) + 0(32) + 0(16) + 1(8) + 1(4) + 1(2) + 1(1)
        = 1(2^7) + 0(2^6) + 0(2^5) + 0(2^4) + 1(2^3) + 1(2^2) + 1(2^1) + 1(2^0)
        = (10001111)_2,

we compute 5^{143} mod 222 by using the binary representation (of 143) in reverse order; that is, going from the right to the left. The pseudocode procedure in Fig. 14.1 provides the necessary steps for this computation. Here the input is an integer b, the positive integer n (the modulus), and the binary representation (a_m a_{m−1} ··· a_2 a_1 a_0)_2 for the exponent e, another positive integer. The output x equals b^e mod n.

procedure ModularExponentiation(b: integer;
                                n, e = (a_m a_{m-1} ... a_2 a_1 a_0)_2: positive integers)
begin
   x := 1
   power := b mod n
   for i := 0 to m do
      begin
         if a_i = 1 then x := (x * power) mod n
         power := (power * power) mod n
      end
end

Figure 14.1

For our example, b = 5, e = 143 = (10001111)_2 = (a_7 a_6 a_5 ··· a_2 a_1 a_0)_2 [so m = 7], and n = 222. The results in Table 14.9 show us the steps that are followed in the execution of the for loop. This is after the initial assignments are made: x is 1 and power is b mod n; that is, 5 mod 222 = 5.
Following the execution of this procedure, the last entry in the column for x tells us that 5^{143} mod 222 is 89.

*For more on cryptology (and related topics), the reader should find the references by T. H. Barr [3], P. Garrett
                  [6], and W. Trappe and L. C. Washington [13] of interest.

Table 14.9

    i | a_i | x                         | power
    0 |  1  | 1 * 5 = 5                 | 5^2 (= 25) mod 222 = 25
    1 |  1  | 5 * 25 mod 222 = 125      | 25^2 (= 625) mod 222 = 181
    2 |  1  | 125 * 181 mod 222 = 203   | 181^2 (= 32761) mod 222 = 127
    3 |  1  | 203 * 127 mod 222 = 29    | 127^2 (= 16129) mod 222 = 145
    4 |  0  | 29                        | 145^2 (= 21025) mod 222 = 157
    5 |  0  | 29                        | 157^2 (= 24649) mod 222 = 7
    6 |  0  | 29                        | 7^2 (= 49) mod 222 = 49
    7 |  1  | 29 * 49 mod 222 = 89      | 49^2 (= 2401) mod 222 = 181
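
The procedure of Fig. 14.1 translates almost line for line into a runnable form. The sketch below uses Python (our choice); the function name modular_exponentiation is hypothetical. Instead of taking the binary digits a_i as input, it reads them off the exponent from right to left with e & 1 and e >>= 1. Python's built-in three-argument pow computes the same value and serves as a check.

```python
def modular_exponentiation(b, e, n):
    """Compute b**e mod n for positive integers e and n by repeated squaring."""
    x = 1
    power = b % n
    while e > 0:
        if e & 1:                      # the current binary digit a_i is 1
            x = (x * power) % n
        power = (power * power) % n    # power now holds b^(2^(i+1)) mod n
        e >>= 1                        # move on to the next binary digit
    return x


print(modular_exponentiation(5, 143, 222))   # 89
print(pow(5, 143, 222))                      # 89
```

The loop body runs once per binary digit of e, so eight passes suffice for e = 143, in agreement with the eight rows of Table 14.9.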

The next    example   provides       an application   of modular    congruence      in information
                             retrieval.

EXAMPLE 14.17
When searching a table of records stored in a computer, each record is assigned a memory location or address in the computer’s memory. The record itself is often made up of fields (this has nothing to do with ring structures). For instance, a college registrar keeps a record on each student, with the record containing information on the student’s social security number, name, and major, for a total of three fields.
In searching for a particular student’s record, we can use his or her social security number as the key to the record because it uniquely identifies that record. As a result, we develop a function from the set of keys to the set of addresses in the table.
If the college is small enough, we may find that the first four digits of the social security number are enough for identification. We develop a hashing (or scattering) function h from the set of keys (still social security numbers) to the set of addresses, determined now by the first four digits of the key. For example, h(081-37-6495) identifies the record at the address associated with 0813. In this way we can store the table using at most 10,000 addresses. All is well as long as h is one-to-one. Should a second student have social security number 081-39-0207, then h would no longer uniquely identify a student’s record. When this happens, a collision is said to occur. Since increasing the size of the stored table often results in more unused storage, we must balance the cost of this storage against the cost of handling such collisions. Techniques for resolving collisions have been devised. They depend on the data structures (such as vectors or linear linked lists) that are used to store the records.
Different kinds of hashing functions that have been developed include the following.
a) The division method: Here we restrict the number of addresses we want to use to a fixed integer n. For any key k (a positive integer), we define h(k) = r, where r = k mod n; that is, r ≡ k (mod n) and 0 ≤ r < n.
b) Often implemented is the folding method, where the key is split into parts and the parts are added together to give h(key). For example, h(081-37-6495) = 081 + 37 + 6495 = 6613 utilizes folding, and if we want only three-digit addresses, suppressing the first digit 6, we can have h(081-37-6495) = 613.

The importance of choosing a pertinent hashing function cannot be emphasized enough as we try to improve efficiency in terms of greater speed and less unused storage.

Using the modular concept, we can develop a hashing function h, using the same keys as above, where

    h(x_1 x_2 x_3 - x_4 x_5 - x_6 x_7 x_8 x_9) = y_1 y_2 y_3,

with

    y_1 = (x_1 + x_2 + x_3) mod 5
    y_2 = (x_4 + x_5) mod 3
    y_3 = (x_6 + x_7 + x_8 + x_9) mod 7.

Here, for example, h(081-37-6495) = 413.
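
The hashing functions of this example can be tried on the sample key. The following Python sketch is illustrative only: the function names (h_first_four, h_division, h_folding, h_modular) are hypothetical, and keys are assumed to arrive as strings of the form ddd-dd-dddd.

```python
def digits_of(ssn):
    """Return the nine digits x_1, ..., x_9 of a key such as '081-37-6495'."""
    return [int(ch) for ch in ssn if ch.isdigit()]


def h_first_four(ssn):
    """Address given by the first four digits of the key."""
    return "".join(str(d) for d in digits_of(ssn)[:4])


def h_division(k, n):
    """Division method: h(k) = k mod n."""
    return k % n


def h_folding(ssn, keep_digits=3):
    """Folding method: add the three parts, then keep the last keep_digits digits."""
    total = sum(int(part) for part in ssn.split("-"))
    return total % 10 ** keep_digits


def h_modular(ssn):
    """y_1 y_2 y_3 with the moduli 5, 3, 7 used in this example."""
    x = digits_of(ssn)
    y1 = sum(x[0:3]) % 5
    y2 = sum(x[3:5]) % 3
    y3 = sum(x[5:9]) % 7
    return f"{y1}{y2}{y3}"


print(h_first_four("081-37-6495"), h_first_four("081-39-0207"))  # 0813 0813, a collision
print(h_folding("081-37-6495"))                                  # 613
print(h_modular("081-37-6495"))                                  # 413
```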

Our last example for this section provides one more encounter with the Catalan numbers
                (of Sections 1.5 and 10.5).

EXAMPLE 14.18
In how many ways can we select three elements a, b, c from {0, 1, 2, 3}, if repetitions are allowed and we want a + b + c ≡ 0 (mod 4)? The selections are listed in column 1 of Table 14.10. (Here each selection sums to 0, 4, or 8, and order is not relevant. For instance, a = 0, b = 1, c = 3 is considered the same selection as a = 1, b = 0, c = 3.) We see that there are five such selections and we recall that 5 = (1/4) C(6, 3), the third Catalan number. Furthermore, by adding 1 to each entry of the selection 0, 0, 0 (in row 1 and column 1) we obtain the selection 1, 1, 1 (in row 1 and column 2). Likewise, the selection 2, 3, 1 (in row 2 and column 3) arises by adding 2 to each entry of the selection 0, 1, 3 (in row 2 and column 1) and reducing each sum modulo 4. Similar computations provide the other 13 selections in columns 2, 3, 4.

Table 14.10

    Sum Is 0 (mod 4) | Sum Is 3 (mod 4) | Sum Is 2 (mod 4) | Sum Is 1 (mod 4)
    0, 0, 0          | 1, 1, 1          | 2, 2, 2          | 3, 3, 3
    0, 1, 3          | 1, 2, 0          | 2, 3, 1          | 3, 0, 2
    0, 2, 2          | 1, 3, 3          | 2, 0, 0          | 3, 1, 1
    1, 1, 2          | 2, 2, 3          | 3, 3, 0          | 0, 0, 1
    2, 3, 3          | 3, 0, 0          | 0, 1, 1          | 1, 2, 2

To generalize this result, we count the number of selections x_1, x_2, ..., x_n from {0, 1, 2, 3, ..., n}, where repetitions are allowed and x_1 + x_2 + ··· + x_n ≡ 0 (mod n + 1). From Section 1.4 we know there are C((n + 1) + n − 1, n) = C(2n, n) ways to select n objects from n + 1 distinct objects, with repetitions allowed. Let Sel_n denote the set of these C(2n, n) selections. (The 20 selections in Table 14.10 illustrate Sel_3.) Define the relation R on Sel_n by s_1 R s_2, if the sum of the entries in selection s_1 is the same, modulo n + 1, as the sum of the entries in selection s_2. Then R is an equivalence relation, so Sel_n can be partitioned into n + 1 equivalence classes (one for each of the selection sums 0, 1, 2, ..., n, taken modulo n + 1). [Note: We get all n + 1 possible selection sums, for if 0 ≤ k_1 ≤ n, 0 ≤ k_2 ≤ n, and n k_1 ≡ n k_2 (mod n + 1), then k_1 ≡ k_2 (mod n + 1). This is due to Theorem 14.14 since gcd(n, n + 1) = 1. With k_1, k_2 ∈ {0, 1, ..., n} it then follows that k_1 = k_2.]
For 0 ≤ s ≤ n, let Sel_n^s denote the selections that sum to s, modulo n + 1. When 1 ≤ s ≤ n, write s = nk (for k = n^{-1}s). Define f: Sel_n^0 → Sel_n^s as follows. For {x_1, x_2, ..., x_n} ∈ Sel_n^0, f({x_1, x_2, ..., x_n}) = {x_1 + k, x_2 + k, ..., x_n + k}, where x_i + k is reduced modulo n + 1. Now consider {y_1, y_2, ..., y_n} ∈ Sel_n^s and define g: Sel_n^s → Sel_n^0 by g({y_1, y_2, ..., y_n}) = {y_1 + (n + 1 − k), y_2 + (n + 1 − k), ..., y_n + (n + 1 − k)}. One finds that g = f^{-1} so |Sel_n^0| = |Sel_n^1| = ··· = |Sel_n^n|. Consequently, each equivalence class has the same size, namely, (1/(n + 1)) C(2n, n), the nth Catalan number.
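
For small n the conclusion can be confirmed by listing every selection. The sketch below is a brute-force check in Python, not a proof; the function name selection_class_sizes is hypothetical, and math.comb assumes Python 3.8 or later.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import comb


def selection_class_sizes(n):
    """Sizes of the classes Sel_n^0, ..., Sel_n^n: selections of n elements from
    {0, 1, ..., n}, repetitions allowed, grouped by their sum modulo n + 1."""
    sizes = Counter(
        sum(selection) % (n + 1)
        for selection in combinations_with_replacement(range(n + 1), n)
    )
    return [sizes[s] for s in range(n + 1)]


print(selection_class_sizes(3))   # [5, 5, 5, 5], the four columns of Table 14.10

# Every class has the size of the nth Catalan number, at least for these small n.
for n in range(1, 7):
    catalan = comb(2 * n, n) // (n + 1)
    assert selection_class_sizes(n) == [catalan] * (n + 1)
```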

EXERCISES 14.3

1. a) Determine whether each of the following pairs of integers is congruent modulo 8.
      i) 62, 118     ii) −43, −237     iii) −90, 230
   b) Determine whether each of the following pairs of integers is congruent modulo 9.
      i) 76, 243     ii) −137, 700     iii) −56, −1199
2. For each of the following determine the value(s) of the integer n > 1 for which the given congruence is true.
   a) 28 ≡ 6 (mod n)       b) 68 ≡ 37 (mod n)
   c) 301 ≡ 233 (mod n)    d) 49 ≡ 2 (mod n)
3. List four elements in each of the following equivalence classes.
   a) [1] in Z_7     b) [2] in Z_{11}     c) [10] in Z_{17}
4. Prove that if a, b, c, n ∈ Z with a, n > 0, and b ≡ c (mod n), then ab ≡ ac (mod an).
5. Let a, b, m, n ∈ Z with m, n > 0. Prove that if a ≡ b (mod n) and m | n, then a ≡ b (mod m).
6. Let m, n ∈ Z^+ with gcd(m, n) = 1 and let a, b ∈ Z. Prove that a ≡ b (mod m) and a ≡ b (mod n) if and only if a ≡ b (mod mn).
7. Provide a counterexample to show that the result in the preceding exercise is false if gcd(m, n) > 1.
8. Prove that for all integers n exactly one of n, 2n − 1, and 2n + 1 is divisible by 3.
9. If n ∈ Z^+ and n ≥ 2 prove that

       Σ_{i=1}^{n−1} i ≡ 0 (mod n),      n odd
       Σ_{i=1}^{n−1} i ≡ n/2 (mod n),    n even.

10. Complete the proofs of Theorems 14.11 and 14.12.
11. Define relation R on Z^+ by a R b, if τ(a) = τ(b), where τ(a) = the number of positive (integer) divisors of a. For example, 2 R 3 and 4 R 25 but 5 R̸ 9.
   a) Verify that R is an equivalence relation on Z^+.
12. Find the multiplicative inverse of each element in Z_{11}, Z_{13}, and Z_{17}.
13. Find [a]^{-1} in Z_{1009} for (a) a = 17, (b) a = 100, and (c) a = 777.
14. a) Find all subrings of Z_{12}, Z_{18}, and Z_{24}.
   b) Construct the Hasse diagram for each of these collections of subrings, where the partial order arises from set inclusion. Compare these diagrams with those for the set of positive divisors of n (n = 12, 18, 24), where the partial order now comes from the divisibility relation.
   c) Find the formula for the number of subrings in Z_n, n > 1.
15. How many units and how many (proper) zero divisors are there in (a) Z_{17}? (b) Z_{117}? (c) Z_{1117}?
16. Prove that in any list of n consecutive integers, one of the integers is divisible by n.
17. If three distinct integers are randomly selected from the set {1, 2, 3, ..., 1000}, what is the probability that their sum is divisible by 3?
18. a) For c, d, n, m ∈ Z, with n > 1 and m > 0, prove that if c ≡ d (mod n), then mc ≡ md (mod n) and c^m ≡ d^m (mod n).
   b) If x_n x_{n−1} ··· x_1 x_0 = x_n · 10^n + ··· + x_1 · 10 + x_0 denotes an (n + 1)-digit integer, then prove that

       x_n x_{n−1} ··· x_1 x_0 ≡ x_n + x_{n−1} + ··· + x_1 + x_0 (mod 9).

19. a) Prove that for all n ∈ N, 10^n ≡ (−1)^n (mod 11).
   b) Consider the result for mod 9 in part (b) of Exercise 18. State and prove a comparable result for mod 11.
20. For p a prime determine all elements a ∈ Z_p where a^2 = a.
21. For a, b, n ∈ Z^+ and n > 1, prove that a ≡ b (mod n) ⇒ gcd(a, n) = gcd(b, n).
22. a) Show that for all [a] ∈ Z_7, if [a] ≠ [0], then [a]^6 = [1].
   b) Let n ∈ Z^+ with gcd(n, 7) = 1. Prove that 7 | (n^6 − 1).
23. Use the Caesar cipher to encrypt the plaintext: “All Gaul is
      b) For the equivalence classes [a] and [b] induced by &%,                    divided into three parts.”
      define operations of addition and multiplication by [a] +                    24, The ciphertext FT Q1MKIQIQDQ was encrypted us-
      [b] = [a + b} and [a}|b] = [ab]. Are these operations well-                  ing the encryption function E: Zo5 — Zo where E(@) =
      defined [that is, deesaRe, bRd> (a+b R(c+a),                                 (6 + x) mod 26. Considering the frequencies of occurrence for
      (ab) R (cd)}?                                                                the letters in the ciphertext, determine (a) the key « for this
                                                                                14.4 Ring Homomorphisms and lsomorphisms                 697

cipher shift; (b) the decryption function D; and (c) the original         35. For the hashing function at the end of Example 14.17, find
(plaintext) message.                                                      (a) h(123-04-2275); (b) a social security number # such that
25. Determine the total number of affine ciphers for an alphabet          h(n) = 413, thus causing a collision with the number 081-37-
of (a) 24 letters; (b) 25 letters; (c) 27 letters; and (d) 30 letters.    6495 of the example.
26. The ciphertext                                                        36. Write a computer program (or develop an algorithm) that
                                                                          implements the hashing function of Exercise 35.
           RWIWQTOOMYHKUXGOEMYP
                                                                          37. The parking lot for a local restaurant has 41 parking spaces,
was encrypted with an affine cipher. Given that the plaintext
                                                                          numbered consecutively from 0 to 40. Upon driving into this
letters e, f are encrypted as the ciphertext letters W, X, respec-
                                                                          lot, a patron is assigned a parking space by the parking atten-
tively, determine (a) the encryption function E; (b) the decryp-
                                                                          dant who uses the hashing function A(k) = k mod 41, where
tion function D; and (c) the original (plaintext) message.
                                                                          k is the integer obtained from the last three digits on the pa-
27. (a) How many distinct terms does the linear congruential              tron’s license plate. Further, to avoid a collision (where an oc-
generator with a = 5, c = 3, m         = 19, and x9 = 10, produce?
                                                                          cupied space might be assigned), when such a situation arises,
(b) What is the sequence of pseudorandom members generated?               the patron is directed to park in the next (consecutive) available
28. Given the modulus m and the two seeds xy, x), with 0 <                space — where 0 is assumed to follow 40.
Xq, X; < m,asequence of pseudorandom numbers can be gener-
                                                                              a) Suppose that eight automobiles arrive as the restaurant
ated recursively from x, = (%,-; + X,-2) mod m, n > 2. This
                                                                              opens. If the last three digits in the license plates for these
generator is called the Fibonacci generator.
                                                                              eight patrons (in their order of arrival) are
    Find the first ten pseudorandom numbers generated when
m = 37 and x9 = 1, x, = 28.                                                             206, 807, 137, 444, 617, 330, 465, 905,
29. Let x4; = (ax, +c) mod m, where 2<a<m,0<c <                               respectively, which spaces are assigned to the drivers of
m,O0<XxX) <m,0<x,,) <m,andn > 0. Prove that                                   these eight automobiles by the parking attendant?
  X, = (a"xy9 + cl(a" — 1I)/(a — 1)]) mod m, 0 <x, <m.                        b) Following the arrival of the eight patrons in part (a), and
30. Consider    the   linear   congruential   generator   with   a = 7,
                                                                              before any of the eight could leave, a ninth patron arrives
c=4,and m = 9. If x4 = 1, determine the seed xp.                              with a license plate where the last three digits are OOx. If
                                                                              this patron is assigned to space 5, what is (are) the possible
31. Prove that the sum of the cubes of three consecutive integers
                                                                              value(s) of x?
is divisible by 9.
                                                                          38. Solve the following linear congruences for x.
32. Determine the last digit in 3°.
                                                                              a) 3x =7 (mod 31)               b) 5x = 8   (mod 37)
33. For m,n, r € Z*, let p(m, n, r) count the number of par-
titions of m into at most # (positive) summands each no larger                c) 6x = 97 (mod 125)
than r. Evaluate     an pik(n + l),ny,n)ne Ze.
34, Given a ring (R, +, +), an element r € R is called idempo-
tent whenr? = r.Ifn € Z* withn > 2, prove that ifk € Z, and
k is idempotent, then n — k + 1 is idempotent.

14.4
Ring Homomorphisms and Isomorphisms
                                   In this final section we shall examine functions (between rings) that obey special properties
                                   which depend on the closed binary operations in the rings.

EXAMPLE 14.19
Consider the rings $(\mathbf{Z}, +, \cdot)$ and $(\mathbf{Z}_6, +, \cdot)$, where addition and multiplication in $\mathbf{Z}_6$ are as defined in Section 14.3.
   Define $f: \mathbf{Z} \to \mathbf{Z}_6$ by $f(x) = [x]$. For example, $f(1) = [1] = [7] = f(7)$ and $f(2) = f(8) = f(2 + 6k) = [2]$, for all $k \in \mathbf{Z}$. (So $f$ is onto though not one-to-one.)
   For $2, 3 \in \mathbf{Z}$, $f(2) = [2]$, $f(3) = [3]$ and we have $f(2 + 3) = f(5) = [5] = [2] + [3] = f(2) + f(3)$, and $f(2 \cdot 3) = f(6) = [0] = [2][3] = f(2) \cdot f(3)$.

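In fact, for all $x, y \in \mathbf{Z}$,

\[ f(x + y) = [x + y] = [x] + [y] = f(x) + f(y), \]

where the sum $x + y$ on the left is addition in $\mathbf{Z}$ and the sum $f(x) + f(y)$ on the right is addition in $\mathbf{Z}_6$, and

\[ f(x \cdot y) = [xy] = [x][y] = f(x) \cdot f(y), \]

where the product $x \cdot y$ on the left is multiplication in $\mathbf{Z}$ and the product $f(x) \cdot f(y)$ on the right is multiplication in $\mathbf{Z}_6$.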
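The two identities of Example 14.19 can be spot-checked numerically. The following is only a minimal sketch: it assumes that the class $[x]$ in $\mathbf{Z}_6$ is represented by its least nonnegative residue, and the range `SAMPLE` is a hypothetical choice made for illustration. A finite check of this kind is not a proof.

```python
# Represent [x] in Z_6 by the remainder x mod 6 (an assumption of this sketch).
N = 6


def f(x: int) -> int:
    """Map an integer x to the representative of [x] in Z_6."""
    return x % N


SAMPLE = range(-20, 21)  # hypothetical test range, not taken from the text

# f(x + y) = f(x) + f(y), where the right-hand sum is computed in Z_6.
additive = all(f(x + y) == (f(x) + f(y)) % N for x in SAMPLE for y in SAMPLE)

# f(x * y) = f(x) * f(y), where the right-hand product is computed in Z_6.
multiplicative = all(f(x * y) == (f(x) * f(y)) % N for x in SAMPLE for y in SAMPLE)

print(additive, multiplicative)
```

Python's `%` operator returns a nonnegative remainder for a positive modulus, so negative integers are also sent to a representative in {0, 1, ..., 5}.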

This example suggests the following definition.

Definition 14.8
Let $(R, +, \cdot)$ and $(S, \oplus, \odot)$ be rings. A function $f: R \to S$ is called a ring homomorphism if for all $a, b \in R$,

    a) $f(a + b) = f(a) \oplus f(b)$, and
    b) $f(a \cdot b) = f(a) \odot f(b)$.

When the function $f$ is onto we say that $S$ is a homomorphic image of $R$.

This function is said to preserve the ring operations for the following reasons: Consider $f(a + b) = f(a) \oplus f(b)$. Adding $a$, $b$ in $R$ first and then finding the image (under $f$) in $S$ of this sum, we get the same result as when we first determine the images (under $f$) in $S$ of $a$, $b$, and then add these images in $S$. (Hence we have the function operation and the additive operations commuting with each other.) Similar remarks can be made about the multiplicative operations in the rings.
   For the rings $\mathbf{Z}_4$ and $\mathbf{Z}_8$, define the function $f: \mathbf{Z}_4 \to \mathbf{Z}_8$ by $f([a]) = [a]^2$ $(= [a^2])$. Then for all $[a], [b] \in \mathbf{Z}_4$, we have

\[ f([a][b]) = f([ab]) = [ab]^2 = ([a][b])^2 = [a]^2 [b]^2 = f([a]) f([b]), \]

where the product $[a][b]$ on the left is multiplication in $\mathbf{Z}_4$ and the product $f([a]) f([b])$ on the right is multiplication in $\mathbf{Z}_8$.

Consequently, this function $f$ preserves the multiplicative operations in the rings. However, for $[1], [2] \in \mathbf{Z}_4$, we find that $f([1] + [2]) = f([3]) = [9] = [1]$, while $f([1]) + f([2]) = [1]^2 + [2]^2 = [1] + [4] = [5]$ $(\ne [1]$ in $\mathbf{Z}_8)$. So $f$ does not preserve the additive operations in the rings; hence, $f$ is not a ring homomorphism.
   The function $g: \mathbf{Z}_4 \to \mathbf{Z}_8$, defined by $g([a]) = 2[a]$, preserves the additive operations, but not the multiplicative operations, in the rings.
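Because $\mathbf{Z}_4$ has only four elements, the claims about the squaring function $f: \mathbf{Z}_4 \to \mathbf{Z}_8$ can be checked exhaustively. The sketch below assumes each class is represented by its least nonnegative residue; the names `well_defined`, `preserves_mul`, and `add_failures` are introduced here for illustration and do not come from the text.

```python
def f(a: int) -> int:
    """Send the class [a] in Z_4 to the class [a^2] in Z_8."""
    return (a * a) % 8


Z4 = range(4)

# f does not depend on the representative chosen for [a]:
# (a + 4k)^2 and a^2 always agree modulo 8.
well_defined = all(f(a) == ((a + 4 * k) ** 2) % 8 for a in Z4 for k in range(-3, 4))

# Multiplication is preserved: f([a][b]) = f([a]) f([b]).
preserves_mul = all(f((a * b) % 4) == (f(a) * f(b)) % 8 for a in Z4 for b in Z4)

# Addition is not preserved: collect every pair where f([a] + [b]) != f([a]) + f([b]).
add_failures = [(a, b) for a in Z4 for b in Z4
                if f((a + b) % 4) != (f(a) + f(b)) % 8]

print(well_defined, preserves_mul, (1, 2) in add_failures)
```

The pair (1, 2) used in the text appears among the failures, since $f([3]) = [1]$ while $f([1]) + f([2]) = [5]$.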

Definition 14.9
Let $f: (R, +, \cdot) \to (S, \oplus, \odot)$ be a ring homomorphism. If $f$ is one-to-one and onto, then $f$ is called a ring isomorphism and we say that $R$ and $S$ are isomorphic rings.

We can think of isomorphic rings arising when the “same” ring is dealt with in two dif-
                              ferent languages. The function f then provides a dictionary for unambiguously translating
                              from one language into the other.
   The terms "homomorphism" and "isomorphism" come from the Greek, where morphe refers to shape or structure, homo means similar, and iso means identical or same. Hence homomorphic rings (that is, rings where one is a homomorphic image of the other) may be thought of as similar in structure, while isomorphic rings are (abstractly) replicas of the same structure.
   In Definition 11.13 we defined the concept of graph isomorphism. There we called the undirected graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ isomorphic when we could find a function $f: V_1 \to V_2$ such that

    a) $f$ is one-to-one and onto, and
    b) $\{a, b\} \in E_1$ if and only if $\{f(a), f(b)\} \in E_2$.

In light of our statements about ring isomorphisms, another way to think about condition (b) here is in terms of the function $f$ preserving the structures of the undirected graphs $G_1$ and $G_2$. When $|V_1| = |V_2|$, it is not difficult to find a function $f: V_1 \to V_2$ that is one-to-one and onto. However, for a given set $V$ of vertices, what determines the structure of an undirected graph $G = (V, E)$ is its set of edges (where the vertex adjacencies are defined). Therefore a one-to-one correspondence $f: V_1 \to V_2$ is a graph isomorphism when it preserves the structures of $G_1$ and $G_2$ by preserving these vertex adjacencies.
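Conditions (a) and (b) for a graph isomorphism translate directly into a short test. The sketch below is illustrative only: it assumes loop-free undirected graphs whose edges are stored as `frozenset` pairs, and the two small path graphs at the end are hypothetical examples rather than graphs from the text.

```python
from itertools import combinations


def is_graph_isomorphism(f, V1, E1, V2, E2):
    """Check conditions (a) and (b) for a map f, given as a dict from V1 to V2."""
    images = list(f.values())
    # (a) f is defined on all of V1, one-to-one, and onto V2.
    bijective = (set(f) == set(V1)
                 and len(set(images)) == len(images)
                 and set(images) == set(V2))
    if not bijective:
        return False
    # (b) {a, b} is an edge of G1 exactly when {f(a), f(b)} is an edge of G2.
    return all((frozenset((a, b)) in E1) == (frozenset((f[a], f[b])) in E2)
               for a, b in combinations(V1, 2))


# Hypothetical example: two paths on three vertices.
V1, E1 = ["a", "b", "c"], {frozenset(("a", "b")), frozenset(("b", "c"))}
V2, E2 = [1, 2, 3], {frozenset((1, 2)), frozenset((2, 3))}

print(is_graph_isomorphism({"a": 1, "b": 2, "c": 3}, V1, E1, V2, E2))
print(is_graph_isomorphism({"a": 2, "b": 1, "c": 3}, V1, E1, V2, E2))
```

Both maps are one-to-one and onto, but only the first preserves the vertex adjacencies; the second sends the edge {b, c} to the non-edge {1, 3}.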

EXAMPLE 14.20
For the ring $R$ in Example 14.5 and the ring $\mathbf{Z}_5$, the function $f: R \to \mathbf{Z}_5$ given by

\[ f(a) = [0], \quad f(b) = [1], \quad f(c) = [2], \quad f(d) = [3], \quad f(e) = [4] \]

provides us with a ring isomorphism.
   For example, $f(c + d) = f(a) = [0] = [2] + [3] = f(c) + f(d)$, while $f(be) = f(e) = [4] = [1][4] = f(b) f(e)$. (In the absence of other methods and theorems, there are 25 such equalities that must be verified for the preservation of each of the binary operations.)

Inasmuch as there are $5! = 120$ one-to-one functions from $R$ onto $\mathbf{Z}_5$, is there any assistance we can call upon in attempting to determine when one of these functions is an
                  isomorphism? Suggested by Example 14.20, the following theorem provides ways of at
                  least starting to determine when functions between rings can be homomorphisms and iso-
                  morphisms. [Parts (c) and (d) of this theorem rely on the results of Exercises 20 and 21 in
                  Section 14.2.]

THEOREM 14.15
If $f: (R, +, \cdot) \to (S, \oplus, \odot)$ is a ring homomorphism, then

    a) $f(z_R) = z_S$, where $z_R$, $z_S$ are the zero elements of $R$, $S$, respectively;
    b) $f(-a) = -f(a)$, for all $a \in R$;
    c) $f(na) = n f(a)$, for all $a \in R$, $n \in \mathbf{Z}$;
    d) $f(a^n) = [f(a)]^n$, for all $a \in R$, $n \in \mathbf{Z}^+$; and
    e) if $A$ is a subring of $R$, it follows that $f(A)$ is a subring of $S$.
Proof:

    a) $z_S \oplus f(z_R) = f(z_R) = f(z_R + z_R) = f(z_R) \oplus f(z_R)$. (Why?) So by the cancellation law of addition in $S$, we have $f(z_R) = z_S$.
    b) $z_S = f(z_R) = f(a + (-a)) = f(a) \oplus f(-a)$. Since additive inverses in $S$ are unique and $f(-a)$ is an additive inverse of $f(a)$, it follows that $f(-a) = -f(a)$.
    c) If $n = 0$, then $f(na) = f(z_R) = z_S = n f(a)$. The result is also true for $n = 1$, so we assume the truth for $n = k$ $(\ge 1)$. Proceeding by mathematical induction, we examine the case where $n = k + 1$. By the results of Exercise 20 of Section 14.2, we get $f((k + 1)a) = f(ka + a) = f(ka) \oplus f(a) = k f(a) \oplus f(a)$ (Why?) $= (k + 1)(f(a))$ (Why?). (Note: There are three different kinds of addition here.)
       When $n > 0$, $f(-na) = -n f(a)$. This follows from our prior proof by induction, part (b) of this proof, and part (b) of Theorem 14.1, because $f(-na) + f(na) = f(n(-a)) + f(na) = n f(-a) + n f(a) = n[f(-a) + f(a)] = n[-f(a) + f(a)] = n z_S = z_S$. Hence the result follows for all $n \in \mathbf{Z}$.
    d) We leave this result for the reader to prove.
    e) Since $A \ne \emptyset$, $f(A) \ne \emptyset$. If $x, y \in f(A)$, then $x = f(a)$, $y = f(b)$ for some $a, b \in A$. Then $x \oplus y = f(a) \oplus f(b) = f(a + b)$, and $x \odot y = f(a) \odot f(b) = f(ab)$, with $a + b, ab \in A$ (Why?), so $x \oplus y, x \odot y \in f(A)$. Also, if $x \in f(A)$ then $x = f(a)$ for some $a \in A$. So we have $f(-a) = -f(a) = -x$, and because $-a \in A$ (Why?), we have $-x \in f(A)$. Therefore $f(A)$ is a subring of $S$.

When the homomorphism is onto, we obtain the following theorem.

THEOREM 14.16
If $f: (R, +, \cdot) \to (S, \oplus, \odot)$ is a ring homomorphism from $R$ onto $S$, where $|S| > 1$, then

    a) if $R$ has unity $u_R$, then $f(u_R)$ is the unity of $S$;
    b) if $R$ has unity $u_R$ and $a$ is a unit in $R$, then $f(a)$ is a unit in $S$ and $f(a^{-1}) = [f(a)]^{-1}$;
    c) if $R$ is commutative, then $S$ is commutative; and
    d) if $I$ is an ideal of $R$, then $f(I)$ is an ideal of $S$.

Proof: We shall prove part (d) and leave the other parts to the reader. Since $I$ is a subring of $R$, it follows that $f(I)$ is a subring of $S$ by part (e) of Theorem 14.15. To verify that $f(I)$ is an ideal, let $x \in f(I)$ and $s \in S$. Then $x = f(a)$ and $s = f(r)$, for some $a \in I$, $r \in R$. So $s \odot x = f(r) \odot f(a) = f(ra)$, with $ra \in I$, and we have $s \odot x \in f(I)$. Similarly, $x \odot s \in f(I)$, so $f(I)$ is an ideal of $S$.

These theorems reinforce the way in which homomorphisms and isomorphisms preserve
                             structure. But can we find any use for these functions, aside from using them to prove more
                             theorems? To help answer this, we start by considering the following example.

EXAMPLE 14.21
Extending the idea developed in Exercise 18 of Section 14.2, let $R$ be the ring $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$. Then $|R| = |\mathbf{Z}_2| \cdot |\mathbf{Z}_3| \cdot |\mathbf{Z}_5| = 30$, and the operations of addition and multiplication are defined in $R$ as follows:
   For all $(a_1, a_2, a_3), (b_1, b_2, b_3) \in R$ where $a_1, b_1 \in \mathbf{Z}_2$, $a_2, b_2 \in \mathbf{Z}_3$, and $a_3, b_3 \in \mathbf{Z}_5$,

\[ (a_1, a_2, a_3) + (b_1, b_2, b_3) = (a_1 + b_1, a_2 + b_2, a_3 + b_3) \]

and

\[ (a_1, a_2, a_3) \cdot (b_1, b_2, b_3) = (a_1 \cdot b_1, a_2 \cdot b_2, a_3 \cdot b_3). \]

In each equation the operation on the left is carried out in $R$, while on the right the operation in the first component is carried out in $\mathbf{Z}_2$, in the second component in $\mathbf{Z}_3$, and in the third component in $\mathbf{Z}_5$.

Define the function $f: \mathbf{Z}_{30} \to R$ by $f(x) = (x_1, x_2, x_3)$, where

\[ x_1 = x \bmod 2, \qquad x_2 = x \bmod 3, \qquad x_3 = x \bmod 5. \]

In other words, $x_1$, $x_2$, and $x_3$ are the remainders that result when $x$ is divided by 2, 3, and 5, respectively.
    The results in Table 14.11 show that $f$ is a function that is one-to-one and onto.

Table 14.11

x (in Z_30)   f(x) (in R)   |   x (in Z_30)   f(x) (in R)   |   x (in Z_30)   f(x) (in R)
     0         (0, 0, 0)    |       10         (0, 1, 0)    |       20         (0, 2, 0)
     1         (1, 1, 1)    |       11         (1, 2, 1)    |       21         (1, 0, 1)
     2         (0, 2, 2)    |       12         (0, 0, 2)    |       22         (0, 1, 2)
     3         (1, 0, 3)    |       13         (1, 1, 3)    |       23         (1, 2, 3)
     4         (0, 1, 4)    |       14         (0, 2, 4)    |       24         (0, 0, 4)
     5         (1, 2, 0)    |       15         (1, 0, 0)    |       25         (1, 1, 0)
     6         (0, 0, 1)    |       16         (0, 1, 1)    |       26         (0, 2, 1)
     7         (1, 1, 2)    |       17         (1, 2, 2)    |       27         (1, 0, 2)
     8         (0, 2, 3)    |       18         (0, 0, 3)    |       28         (0, 1, 3)
     9         (1, 0, 4)    |       19         (1, 1, 4)    |       29         (1, 2, 4)

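To verify that $f$ is an isomorphism, let $x, y \in \mathbf{Z}_{30}$. Then

\[ \begin{aligned} f(x + y) &= ((x + y) \bmod 2,\ (x + y) \bmod 3,\ (x + y) \bmod 5) \\ &= (x \bmod 2,\ x \bmod 3,\ x \bmod 5) + (y \bmod 2,\ y \bmod 3,\ y \bmod 5) \\ &= f(x) + f(y), \end{aligned} \]

and

\[ \begin{aligned} f(xy) &= (xy \bmod 2,\ xy \bmod 3,\ xy \bmod 5) \\ &= (x \bmod 2,\ x \bmod 3,\ x \bmod 5) \cdot (y \bmod 2,\ y \bmod 3,\ y \bmod 5) \\ &= f(x) f(y), \end{aligned} \]

so $f$ is an isomorphism.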
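Since $\mathbf{Z}_{30}$ is small, both the table and the two calculations above can be confirmed by brute force. The following is a minimal sketch under the assumption that elements of $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$ are stored as Python tuples of remainders; the helper names `add` and `mul` are introduced here for illustration.

```python
from itertools import product

MODULI = (2, 3, 5)


def f(x):
    """f(x) = (x mod 2, x mod 3, x mod 5)."""
    return tuple(x % m for m in MODULI)


def add(u, v):
    """Componentwise addition in Z_2 x Z_3 x Z_5."""
    return tuple((a + b) % m for a, b, m in zip(u, v, MODULI))


def mul(u, v):
    """Componentwise multiplication in Z_2 x Z_3 x Z_5."""
    return tuple((a * b) % m for a, b, m in zip(u, v, MODULI))


Z30 = range(30)

# The 30 images are exactly the 30 elements of R, so f is one-to-one and onto.
bijective = {f(x) for x in Z30} == set(product(range(2), range(3), range(5)))

# f preserves both ring operations.
preserves_add = all(f((x + y) % 30) == add(f(x), f(y)) for x in Z30 for y in Z30)
preserves_mul = all(f((x * y) % 30) == mul(f(x), f(y)) for x in Z30 for y in Z30)

print(bijective, preserves_add, preserves_mul)
```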

In examining Table 14.11 we find, for example, that

    1) $f(0) = (0, 0, 0)$, where 0 is the zero element of $\mathbf{Z}_{30}$ and $(0, 0, 0)$ is the zero element of $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$.
    2) $f(2 + 4) = f(6) = (0, 0, 1) = (0, 2, 2) + (0, 1, 4) = f(2) + f(4)$.
    3) The element 21 is the additive inverse of 9 in $\mathbf{Z}_{30}$, whereas $f(21) = (1, 0, 1)$ is the additive inverse of $(1, 0, 4) = f(9)$ in $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$.
    4) $\{0, 5, 10, 15, 20, 25\}$ is a subring of $\mathbf{Z}_{30}$ with $\{(0, 0, 0)\ (= f(0)),\ (1, 2, 0)\ (= f(5)),\ (0, 1, 0)\ (= f(10)),\ (1, 0, 0)\ (= f(15)),\ (0, 2, 0)\ (= f(20)),\ (1, 1, 0)\ (= f(25))\}$ the corresponding subring in $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$.

But what else can we do with this isomorphism between $\mathbf{Z}_{30}$ and $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$? Suppose, for example, that we need to calculate $28 \cdot 17$ in $\mathbf{Z}_{30}$. We can transfer the problem to $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$ and compute $f(28) \cdot f(17) = (0, 1, 3) \cdot (1, 2, 2)$, where the moduli 2, 3, and 5 are smaller than 30 and easier to work with. Since $(0, 1, 3) \cdot (1, 2, 2) = (0 \cdot 1, 1 \cdot 2, 3 \cdot 2) = (0, 2, 1)$ and $f^{-1}(0, 2, 1) = 26$, it follows that $28 \cdot 17$ (in $\mathbf{Z}_{30}$) is 26.
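The calculation of $28 \cdot 17$ can be mirrored in a few lines. This is only a sketch: the dictionary `f_inverse` plays the role of reading Table 14.11 from right to left, and the function name `multiply_via_components` is hypothetical.

```python
MODULI = (2, 3, 5)


def f(x):
    """f(x) = (x mod 2, x mod 3, x mod 5), as in Table 14.11."""
    return tuple(x % m for m in MODULI)


# Reverse lookup table: each triple in Z_2 x Z_3 x Z_5 maps back to its preimage in Z_30.
f_inverse = {f(x): x for x in range(30)}


def multiply_via_components(x, y):
    """Multiply x and y in Z_30 by multiplying their images componentwise."""
    image = tuple((a * b) % m for a, b, m in zip(f(x), f(y), MODULI))
    return f_inverse[image]


print(multiply_via_components(28, 17))  # 26, in agreement with (28 * 17) % 30
```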

In Example 14.21 we see that if we are given an element $(x_1, x_2, x_3)$ in $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$, then we can use Table 14.11 to find the unique element $x$ in $\mathbf{Z}_{30}$ so that $f(x) = (x_1, x_2, x_3)$. But what would we do if we did not have such a table, especially if we found ourselves working with larger rings, such as $\mathbf{Z}_{32736}$ and $\mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$, and the isomorphism $g: \mathbf{Z}_{32736} \to \mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$ where $g(x) = (x \bmod 31, x \bmod 32, x \bmod 33)$ for $x \in \mathbf{Z}_{32736}$? The following result provides a technique for determining the unique preimage for a given element of the codomain for such an isomorphism $g$.

THEOREM 14.17
The Chinese Remainder Theorem. Let $m_1, m_2, \ldots, m_k \in \mathbf{Z}^+ - \{1\}$ with $k \ge 2$, and with $\gcd(m_i, m_j) = 1$ for all $1 \le i < j \le k$. Then the system of $k$ congruences

\[ \begin{aligned} x &\equiv a_1 \pmod{m_1} \\ x &\equiv a_2 \pmod{m_2} \\ &\ \ \vdots \\ x &\equiv a_k \pmod{m_k} \end{aligned} \]

has a simultaneous solution. Further, any two such solutions of the system are congruent modulo $m_1 m_2 \cdots m_k$.
Proof: We start by showing how to construct a simultaneous solution of the system of $k$ congruences.
   Let $m = m_1 m_2 \cdots m_k$ and, for $1 \le j \le k$, let $M_j = m/m_j$. [So, for example, $M_1 = m_2 m_3 m_4 \cdots m_k$ and $M_2 = m_1 m_3 m_4 \cdots m_k$.] We find that for all $1 \le j \le k$, $\gcd(m_j, M_j) = 1$. If not, then for some (fixed) $j$, with $1 \le j \le k$, there exists a prime $p$ such that $p \mid m_j$ and $p \mid M_j$. But from Lemma 4.3 it follows that if $p \mid M_j$ then $p \mid m_i$ for some $1 \le i \le k$, where $i \ne j$. Consequently, we find that $p \mid m_i$ and $p \mid m_j$ for $i \ne j$, and this contradicts $\gcd(m_i, m_j) = 1$.
   For each $1 \le j \le k$, $\gcd(m_j, M_j) = 1$. Consequently, from Theorem 14.14 we know that $M_j$ is a unit in $\mathbf{Z}_{m_j}$. So there exists $x_j \in \mathbf{Z}_{m_j}$ such that $M_j x_j \equiv 1 \pmod{m_j}$. Now consider the sum

\[ x = a_1 M_1 x_1 + a_2 M_2 x_2 + \cdots + a_k M_k x_k. \]

We claim that $x$ is a simultaneous solution of the system of $k$ congruences. Note that for $1 \le j \le k$ and $1 \le i \le k$, if $i \ne j$ then $M_i \equiv 0 \pmod{m_j}$ because $m_j \mid M_i$. Hence $a_i M_i x_i \equiv 0 \pmod{m_j}$. Since $M_j x_j \equiv 1 \pmod{m_j}$ we find that

\[ x \equiv a_j M_j x_j \equiv a_j \pmod{m_j}, \]

for each $1 \le j \le k$.

   Now suppose that $x$, $y$ are both simultaneous solutions of the system of $k$ congruences. Then $x \equiv y \pmod{m_j}$ for all $1 \le j \le k$. Consider the prime factorization of $m = m_1 m_2 \cdots m_k$. Let $p$ be a prime such that $p^t \mid m$ but $p^{t+1} \nmid m$, for some $t \in \mathbf{Z}^+$. Since $\gcd(m_i, m_j) = 1$ for all $1 \le i < j \le k$, it follows that $p^t \mid m_j$ for one (and only one) modulus $m_j$. Consequently, we see that $p^t \mid (x - y)$, and so it follows from the Fundamental Theorem of Arithmetic that $m \mid (x - y)$, or $x \equiv y \pmod{m}$.
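The construction in the first half of the proof is itself an algorithm. The following is a minimal sketch of it, not a definitive implementation: the function name `crt` is an assumption of this sketch, and the inverse $x_j$ of $M_j$ modulo $m_j$ is obtained from Python's built-in `pow(M_j, -1, m_j)` (available from Python 3.8) instead of a hand-run Euclidean algorithm.

```python
from math import gcd, prod


def crt(residues, moduli):
    """Return the unique x with 0 <= x < m_1*...*m_k and x = a_j (mod m_j) for every j.

    The moduli must each exceed 1 and be pairwise relatively prime.
    """
    k = len(moduli)
    if len(residues) != k:
        raise ValueError("need one residue per modulus")
    if any(gcd(moduli[i], moduli[j]) != 1 for i in range(k) for j in range(i + 1, k)):
        raise ValueError("moduli must be pairwise relatively prime")

    m = prod(moduli)
    x = 0
    for a_j, m_j in zip(residues, moduli):
        M_j = m // m_j             # product of all the other moduli
        x_j = pow(M_j, -1, m_j)    # M_j * x_j = 1 (mod m_j); exists since gcd(m_j, M_j) = 1
        x += a_j * M_j * x_j       # contributes a_j modulo m_j and 0 modulo every other modulus
    return x % m


# A small illustration with the hypothetical system
# x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7).
print(crt([2, 3, 2], [3, 5, 7]))  # 23
```

Reducing the sum modulo $m$ at the end selects the one solution in $\{0, 1, \ldots, m - 1\}$, which the second half of the proof shows to be unique.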

Now let us see how one can apply the Chinese Remainder Theorem.

EXAMPLE 14.22
In Marjorie's fourth-grade arithmetic class, three students (namely, Megan, Avery, and Elizabeth) enjoy doing long-division problems (without a calculator). So Marjorie selects a positive integer $n$ and asks for the remainder upon division by three different divisors. Upon dividing $n$ by 31 Megan learns that the remainder is 14. Avery divides $n$ by 32 and finds the remainder is 16. Meanwhile, Elizabeth obtains the remainder of 18 when she divides $n$ by 33. What is the smallest value of $n$ that Marjorie could have selected?
   Here we seek a simultaneous solution for the three congruences

\[ x \equiv 14 \pmod{31}, \qquad x \equiv 16 \pmod{32}, \qquad x \equiv 18 \pmod{33}. \]

So $a_1 = 14$, $a_2 = 16$, $a_3 = 18$, $m_1 = 31$, $m_2 = 32$, $m_3 = 33$, and $m = m_1 m_2 m_3 = 32736$. Further, $M_1 = m/m_1 = 1056$, $M_2 = m/m_2 = 1023$, and $M_3 = m/m_3 = 992$. Using the Euclidean algorithm (when necessary), as in Example 14.13, we learn that

\[ \begin{aligned} {}[x_1] &= [M_1]^{-1} = [1056]^{-1} = [34(31) + 2]^{-1} = [2]^{-1} = [16] \text{ in } \mathbf{Z}_{m_1} = \mathbf{Z}_{31}, \\ [x_2] &= [M_2]^{-1} = [1023]^{-1} = [31(32) + 31]^{-1} = [31]^{-1} = [31] \text{ in } \mathbf{Z}_{m_2} = \mathbf{Z}_{32}, \text{ and} \\ [x_3] &= [M_3]^{-1} = [992]^{-1} = [30(33) + 2]^{-1} = [2]^{-1} = [17] \text{ in } \mathbf{Z}_{m_3} = \mathbf{Z}_{33}. \end{aligned} \]

Hence,

\[ \begin{aligned} x &\equiv (14)(1056)(16) + (16)(1023)(31) + (18)(992)(17) \pmod{32736} \\ &\equiv 1047504 \pmod{32736} \\ &\equiv 31(32736) + 32688 \pmod{32736} \\ &\equiv 32688 \pmod{32736}. \end{aligned} \]

So the (smallest) positive integer $n$ that Marjorie could have selected is 32688.
   (As a check we find that $32688 = 1054(31) + 14 = 1021(32) + 16 = 990(33) + 18$, so $x$ satisfies the given system of three congruences and is the smallest positive integer that does so.)
   Now if we look back at the isomorphism $g: \mathbf{Z}_{32736} \to \mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$ (that we mentioned prior to stating the Chinese Remainder Theorem) we see that for the codomain element $(14, 16, 18)$ in $\mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$, the element 32688 in the domain $\mathbf{Z}_{32736}$ is the (unique) preimage. That is, $g(32688) = (14, 16, 18)$ and for any other integer $y$, if $g(y) = (14, 16, 18)$, then $y \equiv 32688 \pmod{32736}$, so 32688 is the only solution in $\{0, 1, 2, 3, \ldots, 32735\}$.
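Each intermediate value in Example 14.22 can be reproduced step by step. The sketch below follows the example's own quantities ($M_j$, $x_j$, the sum, and its reduction); the variable names are chosen here for illustration, and the inverses are computed with Python's built-in `pow(M, -1, m)` (Python 3.8 or later) in place of the Euclidean algorithm.

```python
a = (14, 16, 18)        # the remainders found by Megan, Avery, and Elizabeth
moduli = (31, 32, 33)   # their divisors

m = 31 * 32 * 33                       # 32736
M = [m // m_j for m_j in moduli]       # [1056, 1023, 992]
x_inv = [pow(M_j, -1, m_j) for M_j, m_j in zip(M, moduli)]  # [16, 31, 17]

total = sum(a_j * M_j * x_j for a_j, M_j, x_j in zip(a, M, x_inv))  # 1047504
n = total % m                                                        # 32688

print(M, x_inv, total, n)
print([n % m_j for m_j in moduli])  # [14, 16, 18]
```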

The isomorphisms $f$ (of Example 14.21) and $g$ (of Example 14.22) are special cases of a more general result that we shall now state. If $n = n_1 n_2 \cdots n_k$, where $n_i > 1$ for all $1 \le i \le k$ and $\gcd(n_i, n_j) = 1$ for all $1 \le i < j \le k$, then the rings $\mathbf{Z}_n$ and $\mathbf{Z}_{n_1} \times \mathbf{Z}_{n_2} \times \cdots \times \mathbf{Z}_{n_k}$ are isomorphic. In particular, we know from the Fundamental Theorem of Arithmetic that for each $n \in \mathbf{Z}^+ - \{1\}$, we can factor $n$ as $p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$, where $p_1, p_2, \ldots, p_t$ are $t$ distinct primes, $t \ge 1$, and $e_1, e_2, \ldots, e_t \in \mathbf{Z}^+$. It then follows that the rings $\mathbf{Z}_n$ and $\mathbf{Z}_{m_1} \times \mathbf{Z}_{m_2} \times \cdots \times \mathbf{Z}_{m_t}$ are isomorphic for $m_1 = p_1^{e_1}$, $m_2 = p_2^{e_2}$, $\ldots$, $m_t = p_t^{e_t}$.
    As a result of this isomorphism, arithmetic involving large integers (that exceed the word size of a given computer) can be performed using the smaller different moduli. Further, the computation for these smaller moduli can be carried out in parallel, thus reducing computation time. [For more on the Chinese Remainder Theorem in conjunction with applications of residue arithmetic in computers, we direct the interested reader to pages 146-149 of the text by K. H. Rosen [12], pages 344-359 of the text by J. P. Tremblay and R. Manohar [14], as well as the text by D. E. Knuth [8].]
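The idea of residue arithmetic can be illustrated with a small sketch. The moduli 99, 98, 97, and 95 below are an assumption made for this illustration (they are pairwise relatively prime, with product 89,403,930) and are not taken from the text; the function names are likewise hypothetical. The components are processed one after another here, although each could be handled independently of the others.

```python
from math import prod

MODULI = (99, 98, 97, 95)   # assumed pairwise relatively prime moduli
M_TOTAL = prod(MODULI)      # results are recovered modulo this product


def to_residues(x):
    """Image of x under x -> (x mod 99, x mod 98, x mod 97, x mod 95)."""
    return tuple(x % m_j for m_j in MODULI)


def from_residues(residues):
    """Recover x (mod M_TOTAL) using the Chinese Remainder Theorem construction."""
    total = 0
    for r, m_j in zip(residues, MODULI):
        M_j = M_TOTAL // m_j
        total += r * M_j * pow(M_j, -1, m_j)   # pow(..., -1, m_j) needs Python 3.8+
    return total % M_TOTAL


def multiply(x, y):
    """Multiply x and y using only arithmetic modulo the small moduli."""
    componentwise = tuple((a * b) % m_j
                          for a, b, m_j in zip(to_residues(x), to_residues(y), MODULI))
    return from_residues(componentwise)


x, y = 1234, 5678
print(multiply(x, y) == x * y)  # True, because x * y is smaller than M_TOTAL
```

The recovered value agrees with the ordinary product only when that product is less than the product of the moduli; otherwise the result is the product reduced modulo `M_TOTAL`.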

EXERCISES 14.4

1. If R is the ring of Example 14.6, construct an isomorphism $f: R \to \mathbf{Z}_6$.

2. Complete the proofs of Theorems 14.15 and 14.16.

3. If R, S, and T are rings and $f: R \to S$, $g: S \to T$ are ring homomorphisms, prove that the composite function $g \circ f: R \to T$ is a ring homomorphism.

4. If $S = \left\{ \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix} \,\middle|\, a \in \mathbf{R} \right\}$, then S is a ring under matrix addition and multiplication. Prove that $\mathbf{R}$ is isomorphic to S.

5. a) Let $(R, +, \cdot)$ and $(S, \oplus, \odot)$ be rings with zero elements $z_R$ and $z_S$, respectively. If $f: R \to S$ is a ring homomorphism, let $K = \{a \in R \mid f(a) = z_S\}$. Prove that K is an ideal of R. (K is called the kernel of the homomorphism f.)
   b) Find the kernel of the homomorphism in Example 14.19.
   c) Let f, $(R, +, \cdot)$, and $(S, \oplus, \odot)$ be as in part (a). Prove that f is one-to-one if and only if the kernel of f is $\{z_R\}$.

6. Use the information in Table 14.11 to compute each of the following in $\mathbf{Z}_{30}$.
   a) (13)(23) + 18
   b) (11)(21) - 20
   c) (13 + 19)(27)
   d) (13)(29) + (24)(8)

7. a) Construct a table (as in Example 14.21) for the isomorphism $f: \mathbf{Z}_{20} \to \mathbf{Z}_4 \times \mathbf{Z}_5$.
   b) Use the table from part (a) to compute the following in $\mathbf{Z}_{20}$.
      i) (17)(19) + (12)(14)
      ii) (18)(11) - (9)(15)

8. Let $n, r, s \in \mathbf{Z}^+$ with $n, r, s \ge 2$, $n = rs$, and $\gcd(r, s) = 1$. If $f: \mathbf{Z}_n \to \mathbf{Z}_r \times \mathbf{Z}_s$ is a ring isomorphism with $f(a) = (1, 0)$ and $f(b) = (0, 1)$, prove that if $(m, t) \in \mathbf{Z}_r \times \mathbf{Z}_s$, then $f^{-1}(m, t) = ma + tb \pmod{n}$.

9. a) How many units are there in the ring $\mathbf{Z}_8$?
   b) How many units are there in the ring $\mathbf{Z}_2 \times \mathbf{Z}_2 \times \mathbf{Z}_2$?
   c) Are $\mathbf{Z}_8$ and $\mathbf{Z}_2 \times \mathbf{Z}_2 \times \mathbf{Z}_2$ isomorphic rings?

10. a) How many units are there in $\mathbf{Z}_{15}$? How many in $\mathbf{Z}_3 \times \mathbf{Z}_5$?
    b) Are $\mathbf{Z}_{15}$ and $\mathbf{Z}_3 \times \mathbf{Z}_5$ isomorphic?

11. Are $\mathbf{Z}_4$ and the ring in Example 14.4 isomorphic?

12. If $f: R \to S$ is a ring homomorphism and J is an ideal of S, prove that $f^{-1}(J) = \{a \in R \mid f(a) \in J\}$ is an ideal of R.

13. Find a simultaneous solution for the system of two congruences:

        $x \equiv 5 \pmod{8}$
        $x \equiv 73 \pmod{81}$.

14. A band of 17 pirates captures a treasure chest full of (identical) gold coins. When the coins are divided up into equal numbers, three coins remain. One pirate accuses the distributor of miscounting and kills him in a duel. As a result, the second time the coins are distributed, in equal numbers, among the 16 surviving pirates, there are 10 coins remaining. An argument erupts and leads to gun play, resulting in the demise of another pirate. Now when the coins are divided up, in 15 equal piles, there are no remaining coins. What is the smallest number of coins that could have been in the chest?

15. Find a simultaneous solution for the system of four congruences:

        $x \equiv 1 \pmod{2}$
        $x \equiv 2 \pmod{3}$
        $x \equiv 3 \pmod{5}$
        $x \equiv 5 \pmod{7}$.

†In some textbooks this result is referred to as the Chinese Remainder Theorem.

14.5
Summary and Historical Review
               Emphasizing structure induced by two closed binary operations, this chapter has introduced
               us to the mathematical system called a ring. Throughout the development of mathematics,
               the ring of integers has played a key role. In the branch of mathematics called number
               theory, we examine the basic properties of $(\mathbf{Z}, +, \cdot)$, as well as the finite rings $(\mathbf{Z}_n, +, \cdot)$.
               The matrix rings provide familiar examples of noncommutative rings.

Pierre de Fermat (1601-1665)                         Sophie Germain (1776-1831)

This chapter contains the development of an abstract theory. On the basis of the definition
               of a ring, we established principles of elementary algebra that we have been using since
               our early encounters with arithmetic, signed numbers, and the manipulation of unknowns.
               The reader may have found some of the proofs tedious, as we justified all the steps in the
               derivations. Faced with the challenge of trying to prove a result in abstract mathematics,
               one should follow the advice given by the Roman rhetorician Marcus Fabius Quintilianus
               (first century A.D.), when he said, “One should not aim at being possible to understand (or
               follow), but at being impossible to be misunderstood.”
                   A famous problem in number theory, known as Fermat’s Last Theorem, claims that
               the equation $x^n + y^n = z^n$, $n \in \mathbf{Z}^+$, $n > 1$, has no solutions in $\mathbf{Z}^+$ when $n > 2$. In 1637
               the French mathematician Pierre de Fermat (1601-1665) wrote that he had proved this
               result but that the proof was too long to be included in the margin of his manuscript.
               Many renowned mathematicians of the eighteenth and nineteenth centuries tried to prove
               this result— among them Leonhard Euler (1707-1783), Peter Gustav Lejeune Dirichlet
               (1805-1859), Carl Friedrich Gauss (1777-1855), Sophie Germain (1776-1831), Adrien-
               Marie Legendre (1752-1833), Niels Henrik Abel (1802-1829), Gabriel Lamé (1795-1870),
               and Leopold Kronecker (1823-1891). Although unsuccessful, attempts to resolve Fermat’s
               claim did result in new mathematical ideas and theories. The twentieth century also produced
               scholars who expended tremendous efforts on this problem. One such scholar was born in
               Cambridge, England, in 1953. There, at the age of 10, he went to the public library in his
               town and looked into a book on mathematics. As he read about Fermat’s Last Theorem,
               it seemed so simple, and he wanted to prove it. In the 1970s Andrew Wiles went to
               Cambridge University, and after he finished his degree, he became a research student there,
                       working in number theory, in an area called Iwasawa theory. At this time Fermat's
                       Last Theorem was not in fashion. When Wiles completed his doctorate, he moved to the
                       United States, to a position at Princeton University. In the 1980s his enthusiasm for his
                       childhood dream was rekindled and he spent close to seven years working alone— locked
                      up in his attic office. He finally confided in his colleague Nick Katz — in January 1993. Then
                      in June 1993 Professor Wiles returned to Cambridge to deliver a series of three lectures
                      at a number-theory conference. The last lecture ended in grand applause, accompanied by
                      flashing cameras and reporters’ questions. It appeared that he had solved Fermat’s Last
                      Theorem. Unfortunately, when his 200-page write-up was peer-reviewed, by experts such
                      as Nick Katz, problems started to arise, and a hole in the proof caused everything to collapse
                      like a house of cards. The fall of 1993 found Wiles back at Princeton, now crestfallen,
                      angry, and humiliated. But then, after renewed effort, on September 19, 1994, he took one
                      last look at his proposed proof. The next morning he wrote up a new proof, as everything
                      fell into place. This time no one could find any flaws. The May 1995 issue of the journal
                      Annals of Mathematics contains the original Cambridge paper by Andrew Wiles and the
                      correction by Wiles and his friend and former student Richard Taylor. At last Fermat’s Last
                      Theorem was laid to rest. (Although Wiles gets much of the praise, other mathematicians
                       deserve accolades as well, among them Kenneth Ribet, Barry Mazur, Goro Shimura,
                       Yutaka Taniyama, Gerhard Frey, Matthias Flach, and Richard Taylor.) For more on the
                       history and development of the proof of this famous theorem, the reader is directed to the
                       very readable account given by A. D. Aczel [1].

Andrew John Wiles (1953- )
                                                                  AP/Wide World Photos

In trying to prove Fermat’s Last Theorem, the German mathematician Ernst Kummer
                       (1810-1893) developed the foundations for the concept of the ideal. This concept was later
                       formulated, named, and utilized by his countryman Richard Dedekind (1831-1916) in his
                       research on what are now called Dedekind domains. Use of the term “ring,” however, seems
                       to be attributable to the German mathematician David Hilbert (1862-1943).
                           Ring homomorphisms and their interplay with ideals were extensively developed by
                       the German mathematician Emmy Noether (1882-1935). This great genius received little
                       remuneration, financial or otherwise, from the governing bodies of her native land because
of the sexual bias that was prevalent in the universities at that time. Emmy Noether’s talents
were nonetheless recognized by her colleagues, and she was eulogized in the New York
Times on May 3, 1935, by Albert Einstein (1879-1955), who acknowledged the influence
and importance of her work for the development of relativity theory. In addition to enduring
sexual bias, as a Jew she was forced to flee her homeland in 1933, when the Nazis came to
power. She spent the last two years of her life guiding young mathematicians in the United
States. For more on the life of this fascinating person, examine the biography by A. Dick
[4] and the article by C. Kimberling [7].
   The special rings called fields arise in the rational, real, and complex number systems.
But we also saw some interesting finite fields. These structures will be examined again
in Chapter 17 in connection with combinatorial designs. The field theory developed by
the French genius Evariste Galois (1811-1832) answered questions about the solutions
of polynomial equations of degree > 4. These questions had baffled mathematicians for
centuries, and his ideas, now known as Galois theory, still comprise one of the most elegant
mathematical theories ever developed. More on Galois theory appears in the text by
O. Zariski and P. Samuel [16].

Emmy Noether (1882-1935)

For supplemental reading on ring theory at the introductory level, the interested reader
should examine Chapters 12-18 of J. A. Gallian [5], Chapter 6 of V. H. Larney [9], and
Chapters 6, 7, and 12 of N. H. McCoy and T. R. Berger [10]. A somewhat more advanced
coverage can be found in Chapter 4 of the text by E. A. Walker [15].
    The development of modular congruence, along with many related ideas, we owe pri-
marily to Carl Friedrich Gauss. Problems involving systems of congruences date back to the
late first century where they appear in the work of the Greek mathematician Nicomachus
of Gerasa. Systems of two congruences can also be found in the writings of the seventh-century
mathematician Brahmagupta (born in 598 in northwestern India). However, it was
not until 1247 that we find the publication of a general method for solving systems of linear
congruences. In his Shushu jiuzhang (Mathematical Treatise in Nine Sections), the method
now called the Chinese Remainder Theorem is presented by the Chinese mathematician
Qin Jiushao (c. 1202-1261). Born in the province of Sichuan during the time of Genghis
Khan, this remarkable mathematical talent was also an accomplished architect, musician,
and poet, as well as being quite the sportsman in archery, fencing, and horsemanship.
More on the solution of congruences and the Chinese Remainder Theorem can be found in
the texts by I. Niven, H. S. Zuckerman, and H. L. Montgomery [11] and K. H. Rosen [12].
    As mentioned earlier (in the footnote in Example 14.16), more on the history, development,
and applications of cryptology can be found in the texts by T. H. Barr [3], P. Garrett
[6], and W. Trappe and L. C. Washington [13].
                                     Finally, the topic of hashing, or scattering, can be further investigated in Chapter 2 of
                                 J. P. Tremblay and R. Manohar [14]. Chapter 4 of A. V. Aho, J. E. Hopcroft, and J. D.
                                 Ullman [2] includes a discussion on the efficiency of hashing functions and a probabilistic
                                 investigation of the collision problem that arises for these functions.

REFERENCES
 1. Aczel, Amir D. Fermat's Last Theorem: Unlocking the Secret of an Ancient Mathematical
    Problem. New York: Four Walls Eight Windows, 1996.
 2. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
    Reading, Mass.: Addison-Wesley, 1983.
 3. Barr, Thomas H. Invitation to Cryptology. Upper Saddle River, N.J.: Prentice-Hall, 2002.
 4. Dick, Auguste. Emmy Noether (1882-1935), trans. Heidi Blocher. Boston: Birkhäuser Boston,
    1981.
 5. Gallian, Joseph A. Contemporary Abstract Algebra, 5th ed. Boston: Houghton Mifflin, 2002.
 6. Garrett, Paul. Making, Breaking Codes: An Introduction to Cryptology. Upper Saddle River,
    N.J.: Prentice-Hall, 2001.
 7. Kimberling, Clark. "Emmy Noether, Greatest Woman Mathematician." Mathematics Teacher
    (March 1982): pp. 246-249.
 8. Knuth, Donald Ervin. The Art of Computer Programming, 3rd ed., Volume 2, Semi-Numerical
    Algorithms. Reading, Mass.: Addison-Wesley, 1997.
 9. Larney, Violet Hachmeister. Abstract Algebra: A First Course. Boston: Prindle, Weber &
    Schmidt, 1975.
10. McCoy, Neal H., and Berger, Thomas R. Algebra: Groups, Rings and Other Topics. Boston:
    Allyn and Bacon, 1977.
11. Niven, Ivan, Zuckerman, Herbert S., and Montgomery, Hugh L. An Introduction to the Theory
    of Numbers, 5th ed. New York: Wiley, 1991.
12. Rosen, Kenneth H. Elementary Number Theory, 4th ed. Reading, Mass.: Addison-Wesley, 1999.
13. Trappe, Wade, and Washington, Lawrence C. Introduction to Cryptography with Coding Theory.
    Upper Saddle River, N.J.: Prentice-Hall, 2002.
14. Tremblay, Jean-Paul, and Manohar, R. Discrete Mathematical Structures with Applications
    to Computer Science. New York: McGraw-Hill, 1975.
15. Walker, Elbert A. Introduction to Abstract Algebra. New York: Random House/Birkhäuser,
    1987.
16. Zariski, Oscar, and Samuel, Pierre. Commutative Algebra, Vol. 1. Princeton, N.J.: Van Nostrand,
    1958.

SUPPLEMENTARY EXERCISES
1. Determine whether each of the following statements is true or false. For each false statement give a counterexample.
   a) If $(R, +, \cdot)$ is a ring, and $\emptyset \ne S \subseteq R$ with S closed under + and $\cdot$, then S is a subring of R.
   b) If $(R, +, \cdot)$ is a ring with unity, and S is a subring of R, then S has a unity.
   c) If $(R, +, \cdot)$ is a ring with unity $u_R$, and S is a subring of R with unity $u_S$, then $u_R = u_S$.
   d) Every field is an integral domain.
   e) Every subring of a field is a field.
   f) A field can have only two subrings.
   g) Every finite field has a prime number of elements.
   h) The field $(\mathbf{Q}, +, \cdot)$ has an infinite number of subrings.

2. Prove that a ring R is commutative if and only if $(a + b)^2 = a^2 + 2ab + b^2$, for all $a, b \in R$.

3. A ring R is called Boolean if $a^2 = a$ for all $a \in R$. If R is Boolean, prove that (a) $a + a = 2a = z$, for all $a \in R$; and (b) R is commutative.

4. With $\mathbf{C}$ the field of complex numbers and S the ring of $2 \times 2$ real matrices of the form $\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$, define $f: \mathbf{C} \to S$ by $f(a + bi) = \begin{bmatrix} a & b \\ -b & a \end{bmatrix}$, for $a + bi \in \mathbf{C}$. Prove that f is a ring isomorphism.

5. If $(R, +, \cdot)$ is a ring, prove that $C = \{r \in R \mid ar = ra, \text{ for all } a \in R\}$ is a subring of R. (The subring C is called the center of R.)

6. Given a finite field F, let $M_2(F)$ denote the set of all $2 \times 2$ matrices with entries from F. As in Example 14.2, $(M_2(F), +, \cdot)$ becomes a noncommutative ring with unity.
   a) Determine the number of elements in $M_2(F)$ if F is
      i) $\mathbf{Z}_2$   ii) $\mathbf{Z}_3$   iii) $\mathbf{Z}_p$, p a prime
   b) As in Exercise 13 of Section 14.1, $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_2(\mathbf{Z}_p)$ is a unit if and only if $ad - bc \ne z$. This occurs if the first row of A does not contain all zeros (that is, z's) and the second row is not a multiple (by an element of $\mathbf{Z}_p$) of the first. Use this observation to determine the number of units in
      i) $M_2(\mathbf{Z}_2)$   ii) $M_2(\mathbf{Z}_3)$   iii) $M_2(\mathbf{Z}_p)$, p a prime

7. Given an integral domain $(D, +, \cdot)$ with zero element z, let $a, b \in D$ with $ab \ne z$. (a) If $a^3 = b^3$ and $a^5 = b^5$, prove that $a = b$. (b) Let $m, n \in \mathbf{Z}^+$ with $\gcd(m, n) = 1$. If $a^m = b^m$ and $a^n = b^n$, prove that $a = b$.

8. Let $A = \mathbf{R}^+$. Define $\oplus$ and $\odot$ on A by $a \oplus b = ab$, the ordinary product of a, b; and $a \odot b = a^{\log_2 b}$.
   a) Verify that $(A, \oplus, \odot)$ is a commutative ring with unity.
   b) Is this ring an integral domain or field?

9. Let R be a ring with ideals A and B. Define $A + B = \{a + b \mid a \in A, b \in B\}$. Prove that $A + B$ is an ideal of R. (For any ring R, the ideals of R form a poset under set inclusion. If A and B are ideals of R, with $\mathrm{glb}\{A, B\} = A \cap B$ and $\mathrm{lub}\{A, B\} = A + B$, the poset is a lattice.)

10. a) If p is a prime, prove that p divides $\binom{p}{k}$, for all $0 < k < p$.
    b) If $a, b \in \mathbf{Z}$, prove that $(a + b)^p \equiv a^p + b^p \pmod{p}$.

11. Given n positive integers $x_1, x_2, \ldots, x_n$, not necessarily distinct, prove that either $n \mid (x_1 + x_2 + \cdots + x_i)$, for some $1 \le i \le n$, or there exist $1 \le i < j \le n$ such that $n \mid (x_{i+1} + \cdots + x_{j-1} + x_j)$.

12. Consider the ring $(\mathbf{Z}^3, \oplus, \odot)$ where addition and multiplication are defined by $(a, b, c) \oplus (d, e, f) = (a + d, b + e, c + f)$ and $(a, b, c) \odot (d, e, f) = (ad, be, cf)$. (Here, for example, $a + d$ and $ad$ are computed by using the standard binary operations of addition and multiplication in $\mathbf{Z}$.) Let S be the subset of $\mathbf{Z}^3$ where $S = \{(a, b, c) \mid a = b + c\}$. Prove that S is not a subring of $(\mathbf{Z}^3, \oplus, \odot)$.

13. a) In how many ways can one select two positive integers m, n, not necessarily distinct, so that $1 \le m \le 100$, $1 \le n \le 100$ and the last digit of $7^m + 3^n$ is 8?
    b) Answer part (a) for the case where $1 \le m \le 125$, $1 \le n \le 125$.
    c) If one randomly selects m, n [as in part (a)], what is the probability that 2 is now the last digit of $7^m + 3^n$?

14. Let $n \in \mathbf{Z}^+$ with $n > 1$.
    a) If $n = 2k$ where k is an odd integer, prove that

           $k^2 \equiv k \pmod{n}$.

    b) If $n = 4k$ for some $k \in \mathbf{Z}^+$, prove that

           $(2k)^2 \equiv 0 \pmod{n}$.

    c) Prove that

           $\sum_{i=1}^{n-1} i^3 \equiv \begin{cases} n/2 \pmod{n}, & \text{for } n \text{ even with } n/2 \text{ odd}, \\ 0 \pmod{n}, & \text{otherwise.} \end{cases}$

15. Suppose that $a, b, c \in \mathbf{Z}$ and $5 \mid (a^2 + b^2 + c^2)$. Prove that $5 \mid a$ or $5 \mid b$ or $5 \mid c$.

16. Write a computer program (or develop an algorithm) that reverses the order of the digits in a given positive integer. For example, the input 1374 should result in the output 4731.

17. Suppose that $a, b, k \in \mathbf{Z}^+$ with $a - b = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, for $p_1, p_2, \ldots, p_k$ prime and $e_1, e_2, \ldots, e_k \in \mathbf{Z}^+$. For how many values of n (> 1) is $a \equiv b \pmod{n}$ true?

18. As the co-chairs of the Homecoming Parade Committee, Jerina and Noor must organize the freshmen for a pregame presentation. When they arrange these students in rows of 8, there are three students remaining. When rows of 11 are tried, four students remain. Finally, rows of 15 leave five students remaining. So the co-chairs use the rows of 15 and place the remaining five students at the center (in positions 6-10) of the first row. What is the smallest number of freshmen Jerina and Noor are trying to organize?

            15
Boolean Algebra
and Switching
    Functions

Again we encounter an algebraic system in which the structure depends primarily on two
closed binary operations. Unlike the situation for rings, in dealing with Boolean algebras
we shall stress applications more than the abstract nature of the system. Nonetheless, we
shall carefully examine the structure of a Boolean algebra, and in our study we shall find
results that are quite different from those for rings. Among other things, a finite Boolean
algebra must have $2^n$ elements, for some $n \in \mathbf{Z}^+$. Yet we know of at least one ring for each
$m \in \mathbf{Z}^+$, $m > 1$, namely, the ring $(\mathbf{Z}_m, +, \cdot)$.
                   In 1854 the English mathematician George Boole published his monumental work
                An Investigation of the Laws of Thought. Within this work Boole created a system of
                mathematical logic that he developed in terms of what is now called a Boolean algebra.
                   In 1938 Claude Elwood Shannon developed the algebra of switching functions and
                showed how its structure was related to the ideas established by Boole. As a result of this
                work, an example of abstract mathematics in the nineteenth century became an applied
                mathematical discipline in the twentieth century.

15.1
Switching Functions: Disjunctive
and Conjunctive Normal Forms
                An electric switch can be turned on (allowing the flow of current) or off (preventing the flow
                of current). Similarly, in a transistor, current is either passing (conducting) or not passing
                (nonconducting). These are two examples of two-state devices. (In Section 2.2 we saw how
                the electric switch was related to the two-valued logic.)
                    In order to investigate such two-state devices, we abstract these notions of “true” and
                “false,” “on” and “off,” as follows.
                   Let B = {0, 1}. We define addition, multiplication, and complements for the elements
                of B by

                   a) $0 + 0 = 0$;   $0 + 1 = 1 + 0 = 1 + 1 = 1$.
                   b) $0 \cdot 0 = 0 = 1 \cdot 0 = 0 \cdot 1$;   $1 \cdot 1 = 1$.
                   c) $\overline{0} = 1$;   $\overline{1} = 0$.


A variable x is called a Boolean variable if x takes on only values in B. Consequently,
$x + x = x$ and $x^2 = x \cdot x = xx = x$ for every Boolean variable x.
    If x, y are Boolean variables, then

    1) $x + y = 0$ if and only if $x = y = 0$, and
    2) $xy = 1$ if and only if $x = y = 1$.

If $n \in \mathbf{Z}^+$, $B^n = \{(b_1, b_2, \ldots, b_n) \mid b_i \in \{0, 1\}, 1 \le i \le n\}$. A function $f: B^n \to B$ is
called a Boolean, or switching, function of n variables. The n variables are emphasized
by writing $f(x_1, x_2, \ldots, x_n)$, where each $x_i$, for $1 \le i \le n$, is a Boolean variable.

EXAMPLE 15.1
Let $f: B^3 \to B$, where $f(x, y, z) = xy + z$.† (We write $xy$ for $x \cdot y$.) This Boolean function
is determined by evaluating f for each of the eight possible assignments to the variables x,
y, z, as Table 15.1 demonstrates.

Table 15.1

    x   y   z  |  xy  |  f(x, y, z) = xy + z
    -----------+------+---------------------
    0   0   0  |  0   |  0
    0   0   1  |  0   |  1
    0   1   0  |  0   |  0
    0   1   1  |  0   |  1
    1   0   0  |  0   |  0
    1   0   1  |  0   |  1
    1   1   0  |  1   |  1
    1   1   1  |  1   |  1
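
The evaluation recorded in Table 15.1 can be reproduced mechanically. The following Python sketch is an illustration only; the helper name `truth_table` is a hypothetical choice, and the Boolean sum and product on B = {0, 1} are modeled with the bitwise operators `|` and `&`, which agree with the definitions of + and · given earlier in this section.

```python
from itertools import product

def f(x, y, z):
    """The switching function f(x, y, z) = xy + z of Example 15.1."""
    return (x & y) | z      # product is AND, sum is OR on {0, 1}

def truth_table(func, n):
    """List (assignment, value) pairs for all 2^n assignments, in table order."""
    return [(bits, func(*bits)) for bits in product((0, 1), repeat=n)]

table = truth_table(f, 3)
for (x, y, z), value in table:
    print(x, y, z, x & y, value)   # columns: x, y, z, xy, f(x, y, z)
```

Because `product((0, 1), repeat=3)` runs through the assignments from (0, 0, 0) to (1, 1, 1) in the same order as the rows of Table 15.1, the printed rows can be compared with the table line by line.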

Definition 15.1
For $n \in \mathbf{Z}^+$, $n \ge 2$, let $f, g: B^n \to B$ be two Boolean functions of the n Boolean variables
$x_1, x_2, \ldots, x_n$. We say that f and g are equal and write $f = g$ if the columns for
f, g [in their respective (function) tables] are exactly the same. [The tables show that
$f(b_1, b_2, \ldots, b_n) = g(b_1, b_2, \ldots, b_n)$ for each of the $2^n$ possible assignments of either 0
or 1 to each of the n Boolean variables $x_1, x_2, \ldots, x_n$.]

Definition 15.2
If $f: B^n \to B$, then the complement of f, denoted $\overline{f}$, is the Boolean function defined on
$B^n$ by

    $\overline{f}(b_1, b_2, \ldots, b_n) = \overline{f(b_1, b_2, \ldots, b_n)}$.

If $g: B^n \to B$, we define $f + g$, $f \cdot g: B^n \to B$, the sum and product of f, g, respectively, by

    $(f + g)(b_1, b_2, \ldots, b_n) = f(b_1, b_2, \ldots, b_n) + g(b_1, b_2, \ldots, b_n)$

and

    $(f \cdot g)(b_1, b_2, \ldots, b_n) = f(b_1, b_2, \ldots, b_n) \cdot g(b_1, b_2, \ldots, b_n)$.
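
Definitions 15.1 and 15.2 translate directly into operations on functions. The sketch below is a hedged illustration, not part of the text: the names `complement`, `add`, `mul`, and `equal` are hypothetical, and the second function g(x, y, z) = (x + z)(y + z) is an assumed example chosen because it should agree with f(x, y, z) = xy + z on every assignment.

```python
from itertools import product

def complement(f):
    """The complement of f: complement the value of f at every point of B^n."""
    return lambda *bits: 1 - f(*bits)

def add(f, g):
    """The sum f + g, computed pointwise with the Boolean sum (OR)."""
    return lambda *bits: f(*bits) | g(*bits)

def mul(f, g):
    """The product f . g, computed pointwise with the Boolean product (AND)."""
    return lambda *bits: f(*bits) & g(*bits)

def equal(f, g, n):
    """f = g when the two function-table columns agree on all 2^n assignments."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n))

f = lambda x, y, z: (x & y) | z          # xy + z, as in Example 15.1
g = lambda x, y, z: (x | z) & (y | z)    # (x + z)(y + z), an assumed example

one = lambda *bits: 1                    # constant function 1
zero = lambda *bits: 0                   # constant function 0

print(equal(f, g, 3))
print(equal(add(f, complement(f)), one, 3), equal(mul(f, complement(f)), zero, 3))
```

Comparing whole columns, as `equal` does, is exactly the test in Definition 15.1; it is exhaustive, so it is practical only for small n.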

†When dealing with Boolean variables, multiplication is performed before addition. Hence $xy + z$ represents
$(xy) + z$, not $x(y + z)$.

Ten laws, important consequences of these definitions, are summarized in
Table 15.2.

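Table 15.2

  Boolean functions                                   Boolean variables                                   Name
  1) $\overline{\overline{f}} = f$                    $\overline{\overline{x}} = x$                       Law of the Double Complement
  2) $\overline{f + g} = \overline{f}\,\overline{g}$  $\overline{x + y} = \overline{x}\,\overline{y}$     DeMorgan's Laws
     $\overline{fg} = \overline{f} + \overline{g}$    $\overline{xy} = \overline{x} + \overline{y}$
  3) $f + g = g + f$                                  $x + y = y + x$                                     Commutative Laws
     $fg = gf$                                        $xy = yx$
  4) $f + (g + h) = (f + g) + h$                      $x + (y + z) = (x + y) + z$                         Associative Laws
     $f(gh) = (fg)h$                                  $x(yz) = (xy)z$
  5) $f + gh = (f + g)(f + h)$                        $x + yz = (x + y)(x + z)$                           Distributive Laws
     $f(g + h) = fg + fh$                             $x(y + z) = xy + xz$
  6) $f + f = f$                                      $x + x = x$                                         Idempotent Laws
     $ff = f$                                         $xx = x$
  7) $f + \mathbf{0} = f$                             $x + 0 = x$                                         Identity Laws
     $f \cdot \mathbf{1} = f$                         $x \cdot 1 = x$
  8) $f + \overline{f} = \mathbf{1}$                  $x + \overline{x} = 1$                              Inverse Laws
     $f\overline{f} = \mathbf{0}$                     $x\overline{x} = 0$
  9) $f + \mathbf{1} = \mathbf{1}$                    $x + 1 = 1$                                         Dominance Laws
     $f \cdot \mathbf{0} = \mathbf{0}$                $x \cdot 0 = 0$
 10) $f + fg = f$                                     $x + xy = x$                                        Absorption Laws
     $f(f + g) = f$                                   $x(x + y) = x$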

As with the laws of logic (in Chapter 2) and the laws of set theory (in Chapter 3), the
properties shown in Table 15.2 are satisfied by all Boolean functions $f, g, h: B^n \to B$ and
by all Boolean variables x, y, z. (We write $fg$ for $f \cdot g$.)
   The symbol $\mathbf{0}$ denotes the constant Boolean function whose value is always 0, and $\mathbf{1}$ is
the function whose only value is 1. (Note: $\mathbf{0}, \mathbf{1} \notin B$.)
   Once again the idea of duality appears in properties 2 through 10. If s stands for a theorem about
the equality of Boolean functions, then $s^d$, the dual of s, is obtained by replacing in s all
occurrences of $+$ ($\cdot$) by $\cdot$ ($+$) and all occurrences of $\mathbf{0}$ ($\mathbf{1}$) by $\mathbf{1}$ ($\mathbf{0}$). By the principle of
duality (which we shall examine in Section 15.4) the statement $s^d$ is also a theorem. The
same is true for a theorem dealing with the equality of Boolean variables, except here it is
the Boolean values 0 and 1 that are replaced, not the constant functions $\mathbf{0}$ and $\mathbf{1}$.

The principle of duality is handy for establishing property 5 of Table 15.2 for Boolean
               functions and Boolean variables.

EXAMPLE 15.2   The Distributive Law of $+$ over $\cdot$. The last two columns of Table 15.3 show that
               $f + gh = (f + g)(f + h)$. We also see that $x + yz = (x + y)(x + z)$ is a special case of this
               property for the situation where $f, g, h: B^3 \to B$, with $f(x, y, z) = x$, $g(x, y, z) = y$, and
               $h(x, y, z) = z$. Hence no additional tables are needed to establish this property for Boolean
               variables.

Table 15.3

  f   g   h  |  gh  |  f + g  |  f + h  |  f + gh  |  (f + g)(f + h)
  0   0   0  |  0   |    0    |    0    |    0     |        0
  0   0   1  |  0   |    0    |    1    |    0     |        0
  0   1   0  |  0   |    1    |    0    |    0     |        0
  0   1   1  |  1   |    1    |    1    |    1     |        1
  1   0   0  |  0   |    1    |    1    |    1     |        1
  1   0   1  |  0   |    1    |    1    |    1     |        1
  1   1   0  |  0   |    1    |    1    |    1     |        1
  1   1   1  |  1   |    1    |    1    |    1     |        1

By the principle of duality, we obtain f(g +h) = fg + fh.
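
Definition 15.1 reduces equality of Boolean functions to a comparison of table columns, so a law such as property 5 can be checked mechanically. The following Python sketch illustrates that idea under stated assumptions: the helper names `column` and `equal` are illustrative and do not come from the text, and the Boolean sum and product are modeled by the bitwise operators `|` and `&` on the values 0 and 1.

```python
from itertools import product

def column(func, n):
    """Return the table column of func: its value on each of the 2**n rows."""
    return [func(*bits) for bits in product((0, 1), repeat=n)]

def equal(f, g, n):
    """Definition 15.1: f = g when their table columns are identical."""
    return column(f, n) == column(g, n)

# Boolean sum is modeled by | and Boolean product by & on the values 0 and 1.
lhs = lambda f, g, h: f | (g & h)              # f + gh
rhs = lambda f, g, h: (f | g) & (f | h)        # (f + g)(f + h)
dual_lhs = lambda f, g, h: f & (g | h)         # f(g + h)
dual_rhs = lambda f, g, h: (f & g) | (f & h)   # fg + fh

print(column(lhs, 3))   # the column for f + gh, rows 000 through 111
print(equal(lhs, rhs, 3), equal(dual_lhs, dual_rhs, 3))
```

The rows are generated in the same order as in Table 15.3, so the printed column can be compared with the table directly.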

EXAMPLE 15.3
a) To establish the first absorption property for Boolean variables, instead of relying on
   table construction we argue as follows:

                                      Reasons
       x + xy = x·1 + xy              Identity Law
              = x(1 + y)              Distributive Law of · over +
              = x·1                   Dominance Law (and Commutative Law of +)
              = x                     Identity Law

                                      This result indicates that some of our laws can be derived from others. The question
                                   then is which properties we must establish with tables so that we can derive the other
                                   properties as we did here. We shall consider this later in Section 15.4 when we study
                                   the structure of a Boolean algebra.
                                      In the meantime, let us demonstrate how the results of Table        15.2 can be used to
                                   simplify another Boolean expression.
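b) Simplify the expression $wx + \overline{\overline{x}z} + (y + \overline{z})$, where w, x, y, and z are Boolean
   variables.

                                                                                         Reasons
       $wx + \overline{\overline{x}z} + (y + \overline{z})$
              $= wx + (\overline{\overline{x}} + \overline{z}) + (y + \overline{z})$     DeMorgan's Law
              $= wx + (x + \overline{z}) + (y + \overline{z})$                           Law of the Double Complement
              $= [(wx + x) + \overline{z}] + (y + \overline{z})$                         Associative Law of +
              $= (x + \overline{z}) + (y + \overline{z})$                                Absorption Law (and the
                                                                                         Commutative Laws of + and ·)
              $= x + (\overline{z} + \overline{z}) + y$                                  Commutative and
                                                                                         Associative Laws of +
              $= x + \overline{z} + y$                                                   Idempotent Law of +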

Up to this point we have repeated for Boolean functions what we did in Chapter 2 for
                             statements. When given a Boolean function (in algebraic terms), we construct its table
                             of values. Now we consider the reverse process: Given a table of values, we shall find a
                             Boolean function (described in algebraic terms) for which it is the correct table.

EXAMPLE 15.4      Given three Boolean variables x, y, z, find formulas for functions $f, g, h: B^3 \to B$ for the
                  columns specified in Table 15.4.
                     For the column under f we want a result that has the value 1 only in the case where
                  $x = y = 0$ and $z = 1$. The function $f(x, y, z) = \overline{x}\,\overline{y}z$ is one such function. In the same
                  way, $g(x, y, z) = x\overline{y}\,\overline{z}$ yields the value 1 for $x = 1$, $y = z = 0$, and is 0 in all other cases.
                  As each of f and g has the value 1 in only one case and these cases are distinct from
                  each other, their sum $f + g$ has the value 1 in exactly these two cases. So $h(x, y, z) =
                  f(x, y, z) + g(x, y, z) = \overline{x}\,\overline{y}z + x\overline{y}\,\overline{z}$ has the column of values given under h.

Table 15.4

  x   y   z  |  f   g   h
  0   0   0  |  0   0   0
  0   0   1  |  1   0   1
  0   1   0  |  0   0   0
  0   1   1  |  0   0   0
  1   0   0  |  0   1   1
  1   0   1  |  0   0   0
  1   1   0  |  0   0   0
  1   1   1  |  0   0   0

This example leads us to the following definition.

Definition 15.3   For all $n \in \mathbf{Z}^+$, if f is a Boolean function on the n variables $x_1, x_2, \ldots, x_n$, we call

                    a) each term $x_i$ or its complement $\overline{x_i}$, for $1 \le i \le n$, a literal;
                    b) a term of the form $y_1 y_2 \cdots y_n$, where each $y_i = x_i$ or $\overline{x_i}$, for $1 \le i \le n$, a fundamental
                       conjunction; and
                    c) a representation of f as a sum of fundamental conjunctions a disjunctive normal
                       form (d.n.f.) of f.

Although no formal proof is given here, the following examples suggest that each
$f: B^n \to B$, $f \ne \mathbf{0}$, has a unique (up to the order of fundamental conjunctions)
representation as a d.n.f.

EXAMPLE 15.5      Find the d.n.f. for $f: B^3 \to B$, where $f(x, y, z) = xy + \overline{x}z$.
                      From Table 15.5, we see that the column for f contains four 1's. They indicate the
                  four fundamental conjunctions needed in the d.n.f. of f, so
                  $f(x, y, z) = \overline{x}\,\overline{y}z + \overline{x}yz + xy\overline{z} + xyz$.
                      Another way to solve this problem is to take each product term appearing in f (namely,
                  $xy$ and $\overline{x}z$) and somehow involve whichever variables are missing. Using the properties
                  of these variables, we have $xy + \overline{x}z = xy(z + \overline{z}) + \overline{x}(y + \overline{y})z$ (Why?)
                  $= xyz + xy\overline{z} + \overline{x}yz + \overline{x}\,\overline{y}z$.

Table 15.5

  x   y   z  |  xy   $\overline{x}z$   f
  0   0   0  |  0         0            0
  0   0   1  |  0         1            1
  0   1   0  |  0         0            0
  0   1   1  |  0         1            1
  1   0   0  |  0         0            0
  1   0   1  |  0         0            0
  1   1   0  |  1         0            1
  1   1   1  |  1         0            1
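
The table-based method of Example 15.5 can be sketched in Python: scan the rows, and for every row where the function has value 1, write down the matching fundamental conjunction. This is only an illustration; the function name `dnf` and the use of `~` to mark a complemented variable are assumptions made here, and f is taken to be $xy + \overline{x}z$, as tabulated in Table 15.5.

```python
from itertools import product

def dnf(func, names):
    """List the fundamental conjunctions for the rows where func has value 1."""
    terms = []
    for bits in product((0, 1), repeat=len(names)):
        if func(*bits):
            # A variable appears as itself for a 1 and complemented (~) for a 0.
            terms.append("".join(v if b else "~" + v for v, b in zip(names, bits)))
    return terms

f = lambda x, y, z: (x & y) | ((1 - x) & z)   # xy + (complement of x) z
print(" + ".join(dnf(f, "xyz")))
```

The four terms printed correspond to the four 1's in the column for f.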

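EXAMPLE 15.6      Find the d.n.f. for $g(w, x, y, z) = wx\overline{y} + wy\overline{z} + xy$.
                     We examine each term, as follows:

                     a) $wx\overline{y} = wx\overline{y}(z + \overline{z}) = wx\overline{y}z + wx\overline{y}\,\overline{z}$
                     b) $wy\overline{z} = w(x + \overline{x})y\overline{z} = wxy\overline{z} + w\overline{x}y\overline{z}$
                     c) $xy = (w + \overline{w})xy(z + \overline{z}) = wxyz + wxy\overline{z} + \overline{w}xyz + \overline{w}xy\overline{z}$

                  It follows from the idempotent property of + that the d.n.f. of g is

                     $g(w, x, y, z) = wx\overline{y}z + wx\overline{y}\,\overline{z} + wxy\overline{z} + w\overline{x}y\overline{z} + wxyz + \overline{w}xyz + \overline{w}xy\overline{z}$.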

Consider the first three columns in Table 15.6. If we agree to list the Boolean variables
in alphabetical order, we see that the values for x, y, z in any row determine a binary label.
These binary labels for 0, 1, 2, ..., 7 arise for rows 1, 2, ..., 8, respectively, as shown in
columns 4 and 5 of Table 15.6. [We note, for instance, that the first row has row number 1
but binary label 000 (= 0). Likewise, the seventh row, where $x = 1$, $y = 1$, $z = 0$, has
row number 7 but binary label 110 (= 6).] As a result, the d.n.f. of a nonzero Boolean
function can be expressed more compactly. For instance, the function f in Example 15.5
can be given by $f = \sum m(1, 3, 6, 7)$, where m indicates the minterms (that is, fundamental
conjunctions, each here on three literals) at rows 2, 4, 7, 8, with the respective binary labels
1, 3, 6, 7. The word minterm is used here to emphasize that the fundamental conjunction
has the value 1 a minimal number of times (namely, one time) without being identically
0. For example, m(1) denotes the minterm for the row with binary label 001 (= 1) where

Table 15.6

  x   y   z     Binary Label     Row Number
  0   0   0     000 (= 0)            1
  0   0   1     001 (= 1)            2
  0   1   0     010 (= 2)            3
  0   1   1     011 (= 3)            4
  1   0   0     100 (= 4)            5
  1   0   1     101 (= 5)            6
  1   1   0     110 (= 6)            7
  1   1   1     111 (= 7)            8

$x = y = 0$ and $z = 1$; this corresponds with the fundamental conjunction $\overline{x}\,\overline{y}z$, which has
the value 1 for exactly one assignment (where $x = y = 0$ and $z = 1$).
   Lacking a table, we can still represent the d.n.f. of the function g of Example 15.6, for
instance, as a sum of minterms. For each fundamental conjunction $c_1c_2c_3c_4$, where $c_1 = w$
or $\overline{w}$, ..., $c_4 = z$ or $\overline{z}$, we replace each $c_i$, $1 \le i \le 4$, by 0 if $c_i$ is a complemented variable,
and by 1 otherwise. In this way the binary label associated with that fundamental conjunction
is obtained. As a sum of minterms, we find that $g = \sum m(6, 7, 10, 12, 13, 14, 15)$.
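
The labeling rule just described (0 for a complemented variable, 1 otherwise) is easy to mechanize. In the Python sketch below, the function name `minterm_label` and the convention of writing a complemented literal with a leading `~` are assumptions made for illustration; the seven conjunctions listed are those whose labels make up $\sum m(6, 7, 10, 12, 13, 14, 15)$ for the function g of Example 15.6.

```python
def minterm_label(literals):
    """Map a fundamental conjunction to its binary label.

    literals holds one string per variable, in alphabetical order;
    a leading "~" marks a complemented variable, which contributes bit 0.
    """
    bits = "".join("0" if lit.startswith("~") else "1" for lit in literals)
    return int(bits, 2)

# The seven fundamental conjunctions in the d.n.f. of g (Example 15.6).
g_dnf = [
    ["w", "x", "~y", "z"], ["w", "x", "~y", "~z"],
    ["w", "x", "y", "~z"], ["w", "~x", "y", "~z"],
    ["w", "x", "y", "z"], ["~w", "x", "y", "z"], ["~w", "x", "y", "~z"],
]
labels = sorted(minterm_label(term) for term in g_dnf)
print(labels)  # [6, 7, 10, 12, 13, 14, 15]
```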

Dual to the disjunctive normal form is the conjunctive normal form, which we discuss
                        before closing this section.

EXAMPLE 15.7    Let $f: B^3 \to B$ be given by Table 15.7. A term of the form $c_1 + c_2 + c_3$, where $c_1 = x$
                or $\overline{x}$, $c_2 = y$ or $\overline{y}$, and $c_3 = z$ or $\overline{z}$, is called a fundamental disjunction. The fundamental
                disjunction $x + y + z$ has value 1 in all cases except where the value for each of x, y, z is
                0. Similarly, $x + \overline{y} + z$ has value 1 except when $x = z = 0$ and $y = 1$. Since each of these
                fundamental disjunctions has the value 0 in only one case, and these cases do not occur
                simultaneously, the product $(x + y + z)(x + \overline{y} + z)$ has the value 0 in precisely the two
                cases just given. Continuing in this manner, we may represent the function f as

                    $f = (x + y + z)(x + \overline{y} + z)(\overline{x} + \overline{y} + z)$

                and we call this the conjunctive normal form (c.n.f.) for f.
                    Since the fundamental disjunction $x + y + z$ has the value 1 a maximum number of
                times (without being identically 1), it is called a maxterm, especially when we use a binary
                row label to represent it. Using the binary labels to index the rows of the table, we may
                write $f = \prod M(0, 2, 6)$, a product of maxterms.
                    Such a representation exists for each $f \ne \mathbf{1}$, and it is unique up to the order of the
                fundamental disjunctions (or maxterms).

Table 15.7

  x   y   z  |  f
  0   0   0  |  0
  0   0   1  |  1
  0   1   0  |  0
  0   1   1  |  1
  1   0   0  |  1
  1   0   1  |  1
  1   1   0  |  0
  1   1   1  |  1
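
Dually to the d.n.f., the c.n.f. can be read off the rows where the column has value 0. The Python sketch below illustrates this for the column of f in Table 15.7; the function names `maxterms` and `cnf` and the `~` notation for a complement are assumptions introduced here, not notation from the text.

```python
from itertools import product

def maxterms(column):
    """Return the binary labels of the rows where the column has value 0."""
    return [label for label, value in enumerate(column) if value == 0]

def cnf(column, names):
    """Build the fundamental disjunction for each 0 row of the column."""
    rows = list(product((0, 1), repeat=len(names)))
    factors = []
    for label in maxterms(column):
        # A variable is uncomplemented when its bit is 0, complemented (~) when 1,
        # so that the disjunction is 0 exactly on this row.
        lits = [v if b == 0 else "~" + v for v, b in zip(names, rows[label])]
        factors.append("(" + " + ".join(lits) + ")")
    return "".join(factors)

f_column = [0, 1, 0, 1, 1, 1, 0, 1]   # column for f in Table 15.7, rows 000 to 111
print(maxterms(f_column))             # [0, 2, 6]
print(cnf(f_column, "xyz"))           # (x + y + z)(x + ~y + z)(~x + ~y + z)
```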

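EXAMPLE 15.8    Let $g: B^4 \to B$, where $g(w, x, y, z) = (w + x + y)(x + \overline{y} + z)(w + \overline{y})$. To obtain the
                c.n.f. for g, we rewrite each disjunction in the product as follows:

                  a) $w + x + y = w + x + y + 0 = w + x + y + z\overline{z}$
                       $= (w + x + y + z)(w + x + y + \overline{z})$
                  b) $x + \overline{y} + z = w\overline{w} + x + \overline{y} + z = (w + x + \overline{y} + z)(\overline{w} + x + \overline{y} + z)$
                  c) $w + \overline{y} = w + x\overline{x} + \overline{y} = (w + x + \overline{y})(w + \overline{x} + \overline{y})$
                       $= (w + x + \overline{y} + z\overline{z})(w + \overline{x} + \overline{y} + z\overline{z})$
                       $= (w + x + \overline{y} + z)(w + x + \overline{y} + \overline{z})(w + \overline{x} + \overline{y} + z)(w + \overline{x} + \overline{y} + \overline{z})$

                Consequently, using the idempotent law of ·, we have
                $g(w, x, y, z) = (w + x + y + z)(w + x + y + \overline{z})(w + x + \overline{y} + z)(\overline{w} + x + \overline{y} + z)
                (w + x + \overline{y} + \overline{z})(w + \overline{x} + \overline{y} + z)(w + \overline{x} + \overline{y} + \overline{z})$.
                   To obtain g as a product of maxterms, we associate with each fundamental disjunction
                $d_1 + d_2 + d_3 + d_4$ the binary number $b_1b_2b_3b_4$, where $b_1 = 0$ if $d_1 = w$; $b_1 = 1$ if $d_1 = \overline{w}$;
                ...; $b_4 = 0$ if $d_4 = z$; $b_4 = 1$ if $d_4 = \overline{z}$. As a result, $g = \prod M(0, 1, 2, 3, 6, 7, 10)$.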

Our last example in this section reviews what we have learned about the ways to represent
a nonconstant Boolean function f (that is, $f \ne \mathbf{0}$ and $f \ne \mathbf{1}$).

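EXAMPLE 15.9    If $h(w, x, y, z) = wx + \overline{w}y + \overline{x}yz$, then we may rewrite each summand in h as follows:

                    i) $wx = wx(y + \overline{y})(z + \overline{z}) = wxyz + wxy\overline{z} + wx\overline{y}z + wx\overline{y}\,\overline{z}$
                   ii) $\overline{w}y = \overline{w}(x + \overline{x})y(z + \overline{z}) = \overline{w}xyz + \overline{w}xy\overline{z} + \overline{w}\,\overline{x}yz + \overline{w}\,\overline{x}y\overline{z}$
                  iii) $\overline{x}yz = (w + \overline{w})\overline{x}yz = w\overline{x}yz + \overline{w}\,\overline{x}yz$

                Using the idempotent law of +, we find that the d.n.f. for h is

                    $wxyz + wxy\overline{z} + wx\overline{y}z + wx\overline{y}\,\overline{z} + \overline{w}xyz + \overline{w}xy\overline{z} + \overline{w}\,\overline{x}yz + \overline{w}\,\overline{x}y\overline{z} + w\overline{x}yz$.

                Considering each fundamental conjunction in the d.n.f. for h, we obtain the following
                binary labels and minterm numbers:

                    $wxyz$:                        1111 (= 15)    $wx\overline{y}\,\overline{z}$:  1100 (= 12)    $\overline{w}\,\overline{x}yz$:             0011 (= 3)
                    $wxy\overline{z}$:             1110 (= 14)    $\overline{w}xyz$:               0111 (= 7)     $\overline{w}\,\overline{x}y\overline{z}$:  0010 (= 2)
                    $wx\overline{y}z$:             1101 (= 13)    $\overline{w}xy\overline{z}$:    0110 (= 6)     $w\overline{x}yz$:                          1011 (= 11)

                So we may write $h = \sum m(2, 3, 6, 7, 11, 12, 13, 14, 15)$. And from this representation
                using minterms we have $h = \prod M(0, 1, 4, 5, 8, 9, 10)$, a product of maxterms.
                   Finally, we take the binary label for each maxterm and determine its corresponding
                fundamental disjunction:

                    0 = 0000:  $w + x + y + z$                         8 = 1000:  $\overline{w} + x + y + z$
                    1 = 0001:  $w + x + y + \overline{z}$              9 = 1001:  $\overline{w} + x + y + \overline{z}$
                    4 = 0100:  $w + \overline{x} + y + z$             10 = 1010:  $\overline{w} + x + \overline{y} + z$
                    5 = 0101:  $w + \overline{x} + y + \overline{z}$

                This tells us that the c.n.f. for h is

                    $(w + x + y + z)(w + x + y + \overline{z})(w + \overline{x} + y + z)(w + \overline{x} + y + \overline{z}) \cdot$
                    $(\overline{w} + x + y + z)(\overline{w} + x + y + \overline{z})(\overline{w} + x + \overline{y} + z)$.

                Hence,

                    $wxyz + wxy\overline{z} + wx\overline{y}z + wx\overline{y}\,\overline{z} + \overline{w}xyz + \overline{w}xy\overline{z} + \overline{w}\,\overline{x}yz + \overline{w}\,\overline{x}y\overline{z} + w\overline{x}yz$
                    $= \sum m(2, 3, 6, 7, 11, 12, 13, 14, 15) = \prod M(0, 1, 4, 5, 8, 9, 10)$
                    $= (w + x + y + z)(w + x + y + \overline{z})(w + \overline{x} + y + z)(w + \overline{x} + y + \overline{z}) \cdot$
                    $\quad (\overline{w} + x + y + z)(\overline{w} + x + y + \overline{z})(\overline{w} + x + \overline{y} + z)$.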
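
The passage from minterms to maxterms in Example 15.9 amounts to taking the labels of the rows with value 0 instead of those with value 1. The Python sketch below checks the two label lists by brute force. It assumes $h = wx + \overline{w}y + \overline{x}yz$, the reading of h that is consistent with the minterm labels listed in the example; complements are computed as `1 - v`.

```python
from itertools import product

def h(w, x, y, z):
    """h = wx + (~w)y + (~x)yz, with + modeled by | and the product by &."""
    return (w & x) | ((1 - w) & y) | ((1 - x) & y & z)

# Rows are generated in binary-label order 0000, 0001, ..., 1111.
column = [h(*bits) for bits in product((0, 1), repeat=4)]
minterm_labels = [i for i, v in enumerate(column) if v == 1]
maxterm_labels = [i for i, v in enumerate(column) if v == 0]
print(minterm_labels)  # [2, 3, 6, 7, 11, 12, 13, 14, 15]
print(maxterm_labels)  # [0, 1, 4, 5, 8, 9, 10]
```

Every label from 0 to 15 lands in exactly one of the two lists, which is why either list determines the other.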

ments of values for w and y that will result in the value 1 for
                        EXERCISES 15.1                                   the expression.

1. Find the value of each of the following Boolean expressions                a) x+xy+w                      b) xy +w
if the values of the Boolean variables w, x, y, and z are 1, 1, 0,              c) xy +xw                      d) xy+w
and 0, respectively.                                                      3. a) How many rows are needed to construct the (function)
      a) xy+xy           b) w+xy                ec) wx + ¥t+ yz                 table for a Boolean function ofn variables?
      d) (wx +yZ)+wy+(wt+y)@+y)                                                 b) How many different Boolean functions of n variables
2. Let w, x, and y be Boolean variables where the value of x                   are there?
is 1. For each of the following Boolean expressions, determine,           4, a) Find the fundamental conjunction made up from the
if possible, the value of the expression. If you cannot determine               variables w, x, y, Z, or their complements, where the value
the value of the expression, then find the number of assign-                    of the conjunction is 1 precisely when

i}   w=x=0,y=z=1.                                           11. Simplify the following Boolean expressions,
        i)     w=O0,x =1,y=1,z7=0.                                         a)xy+(x+y)zZ+y
       iii)    w=O,x =y=z=1.
       iv)     w=x=y=z=0.                                                 byx+y+@+y+z)
    b) Answer part (a) this time for fundamental disjunctions,
                                                                           c) yz twx+z2+{wz(xry + wz)]
    instead of fundamental conjunctions, where the value of           12. Find the values of the Boolean variables w, x, y, z that sat-
    each fundamental disjunction is 0 precisely for the stated        isfy the following system of simultaneous (Boolean) equations.
    values of w, x, y, Z.                                                   x+xy       =0      xy =Xz        Xy+XZ+7w
                                                                                                                    = zw
5. Suppose that f: B’ > B is defined by                              13. a) For
                                                                               f, g, 4: BY >        B, prove that fg + fh+gh       =
                    SX, YZ) = (+ y) + 2).                                 fg + fh and that fg + fe+ fet fg=1.
    a) Determine the d.n.f. and c.n-f. for f.                             b) State the dual of each result in part (a).

b) Write f as a sum of minterms and as a product of max-          14. Let f, g: B" ~ B. Define the relation “<” on F,,, the set of
    terms (utilizing binary labels).                                  all Boolean functions of » variables, by f < g if the value of ¢g
                                                                      is 1 at least whenever the value of f is 1.
6. Let g: B41 — B be defined by
                                                                           a) Prove that this relation is a partial order on F,.
               8(wW,X, y, 2) = (wz + XyZ)\(x + xyz).
                                                                          b) Prove that fg < fandf<ft+g.
    a) Find the d.n.f. and c.n.f. for g.
                                                                           c)   Forn   = 2, draw the Hasse diagram for the 16 functions
    b) Write g as a sum of minterms and as a product of max-              in F,. Where are the minterms and maxterms located in the
    terms (utilizing binary labels).                                      diagram? Compare this diagram with that for the power set
  7. Let Fg denote the set of all Boolean functions f: B° > B.            of {a, b, c, d} partially ordered under the subset relation.
(a) What is | ¥,|? (b) How many fundamental conjunctions (dis-        15, Define the closed binary operation @ (Exclusive Or) on F,,,
junctions) are there in F;,? (c) How many minterms (maxterms)
                                                                      the set of all Boolean functions on n variables, by f 6g =
are there in F,?                                                      fg+f92, where f, g: B” > B.
8. Let f: B* — B. Find the disjunctive normal form for f if              a) Determine f@ f, fOf, f Ol, and f GO.
    a) f-'(1) = {0101 (that is, w =0,x =1, y =0,z=1),                     b) Prove or disprove each of the following.
    0110, 1000, 1011}.
                                                                                  i) feg=-0>f=8
    b) f~'@)      = {0000, 0001, 0010, 0100, 1000, 1001, 0110}.
                                                                                 ii) fO(g@h)=(f Og) eh
  9. Let B” — B. If the d.n.f. of f has m fundamental conjunc-                  iii) fOg= fox
tions and its c.n.f. has k fundamental disjunctions, how are m,                 iv) fBgh=(f @g\(f Sh)
n, and & related?                                                                 v) f(g @h) = fe@ fh
10. Ifx, y, and z are Boolean    variables and x + y +z   = xyz,                vi)    (f@g)=fes=feg
prove that x, y, z all have the same value.                                     vii)   f@g=f@hsaegah

15.2 Gating Networks: Minimal Sums of Products: Karnaugh Maps
                                The switching functions of Section 15.1 present an interesting mathematical theory. Their
                                importance lies in their implementation by means of logic gates (devices in a digital com-
                                puter that perform specified tasks in the processing of data). The electrical and mechanical
                                components of such gates depend on the state of the art; we shall not concern ourselves
                                here with questions relating to hardware.
                                    Figure 15.1 contains the logic gates for negation (complement), conjunction, and
                                disjunction in parts (a), (b), and (c), respectively. Since the Boolean operations of + and · are
                                associative, we may have more than two inputs for an AND gate or an OR gate.
                                    Figure 15.2 shows the logic, or gating, network for the expression $(w + \overline{x})(y + xz)$.
                                Symbols on a line to the left of a gate (or inverter) are inputs. When they are on line

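[Figure 15.1: (a) Inverter, with input x and output $\overline{x}$; (b) AND gate, with inputs x, y and output xy; (c) OR gate, with inputs x, y and output x + y]

[Figure 15.2: gating network with intermediate outputs xz, y + xz, and $w + \overline{x}$, and final output $(w + \overline{x})(y + xz)$]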

segments to the right of a gate, they are outputs. We have split the input line for x, so that
                              x may serve as input for both an AND gate and an inverter.
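
A gating network of this kind can be modeled by composing one small function per gate. The Python sketch below is an illustration under stated assumptions: the function names are invented here, and the network is taken to compute $(w + \overline{x})(y + xz)$, which matches the description of x being split between an AND gate and an inverter.

```python
from itertools import product

def inverter(a):
    """Complement of a single bit."""
    return 1 - a

def and_gate(*inputs):
    """AND gate; associativity allows more than two inputs."""
    return int(all(inputs))

def or_gate(*inputs):
    """OR gate; associativity allows more than two inputs."""
    return int(any(inputs))

def network(w, x, y, z):
    """Evaluate the gates in order; each output feeds only later gates."""
    not_x = inverter(x)         # x is split: one branch goes to the inverter
    xz = and_gate(x, z)         # the other branch goes to an AND gate
    left = or_gate(w, not_x)    # w + (complement of x)
    right = or_gate(y, xz)      # y + xz
    return and_gate(left, right)

table = {bits: network(*bits) for bits in product((0, 1), repeat=4)}
print(table[(0, 1, 1, 0)], table[(1, 1, 0, 1)])
```

Because each gate is an ordinary function of its present inputs, the sketch has no feedback and no dependence on earlier inputs.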

The exercises will provide practice in drawing the logic network for a Boolean expression
                              and in going from the network to the expression. Meanwhile certain features of these
                              networks need to be emphasized.
                                      1) An input line may be split to provide that input to more than one gate.
                                      2) Input and output lines come together only at gates.
                                      3) There is no doubling back; that is, the output from a gate g cannot be used as an input
                                         for the same gate g or for any gate (directly or indirectly) leading into g.
                                      4) We assume that the output of a gating network is an instantaneous function of the
                                         present inputs. There is no time dependence and we attach no importance to prior
                                           inputs, as we do with finite state machines.

With these ideas in mind, let us analyze the computer addition of binary numbers.

EXAMPLE 15.10         When we add two bits (binary digits), the result consists of a sum s and a carry c. In three of
                      four cases the carry is 0, so we shall concentrate on the computation of 1 + 1. Examining
                      parts (b) and (c) of Table 15.8, we consider the sum s and the carry c as Boolean functions
                      of the variables x and y. Then $c = xy$ and $s = \overline{x}y + x\overline{y} = x \oplus y = (x + y)\overline{(xy)}$. (Recall
                      that $\oplus$ denotes exclusive OR.)

Table 15.8

  x   y   Binary Sum          x   y   Sum          x   y   Carry
  0   0   0 + 0 = 0           0   0    0           0   0     0
  0   1   0 + 1 = 1           0   1    1           0   1     0
  1   0   1 + 0 = 1           1   0    1           1   0     0
  1   1   1 + 1 = 10          1   1    0           1   1     1

        (a)                        (b)                  (c)
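
The sum and carry columns of Table 15.8 can be checked with a short Python sketch. The function name `half_adder` is chosen here for illustration; the sum is computed as $(x + y)\overline{(xy)}$ and then compared with exclusive OR and with ordinary integer addition.

```python
def half_adder(x, y):
    """Return (s, c) for bits x, y: s = (x + y) * complement(xy), c = xy."""
    c = x & y
    s = (x | y) & (1 - c)
    return s, c

for x in (0, 1):
    for y in (0, 1):
        s, c = half_adder(x, y)
        assert s == x ^ y            # s agrees with exclusive OR
        assert 2 * c + s == x + y    # carry and sum together give the binary sum
        print(x, y, s, c)
```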

Figure 15.3 is a gating network with two outputs. It is referred to as a multiple output
network. This device, called a half-adder, implements the results in parts (b) and (c) of
Table 15.8.

Figure 15.3: The half-adder. The inputs $x$ and $y$ produce $x + y$ and $xy$; the outputs are $s = (x + y)\overline{(xy)}$ and $c = xy$.

Using two half-adders and an OR gate, we construct the full-adder shown in
Fig. 15.4(a). If $x = x_n x_{n-1} \cdots x_2 x_1 x_0$ and $y = y_n y_{n-1} \cdots y_2 y_1 y_0$, consider the process of
adding the bits $x_i$ and $y_i$ in finding the sum $x + y$. Here $c_{i-1}$ is the carry from the addition
of $x_{i-1}$ and $y_{i-1}$ (and a possible carry $c_{i-2}$). The input $c_{i-1}$, together with the inputs $x_i$ and
$y_i$, produce the sum $s_i$ and the carry $c_i$ as shown in the figure. Finally, in Fig. 15.4(b) two
full-adders and a half-adder are combined to produce the sum of the two binary numbers
$x_2x_1x_0$ and $y_2y_1y_0$, whose sum is $c_2s_2s_1s_0$.

Figure 15.4: (a) The full-adder, built from two half-adders (H.A.) and an OR gate, with intermediate signals $x_i \oplus y_i$, $x_i y_i$, and $c_{i-1}(x_i \oplus y_i)$ and carry output $c_i = x_i y_i + c_{i-1}(x_i \oplus y_i)$. (b) A half-adder and two full-adders (F.A.) combined to add $x_2x_1x_0$ and $y_2y_1y_0$.
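The half-adder and full-adder constructions can be checked by simulating the gates directly. The following is a minimal sketch in Python, not part of the original text; the function names `half_adder`, `full_adder`, and `add3` are illustrative choices, and bits are modeled as the integers 0 and 1.

```python
def half_adder(x, y):
    """Return (s, c) for bits x, y, following Fig. 15.3."""
    c = x & y                      # AND gate: c = xy
    s = (x | y) & (1 - c)          # (x + y) AND complement(xy) = x XOR y
    return s, c

def full_adder(x, y, c_in):
    """Two half-adders and an OR gate, as described for Fig. 15.4(a)."""
    s1, c1 = half_adder(x, y)      # s1 = x XOR y, c1 = xy
    s, c2 = half_adder(s1, c_in)   # c2 = c_in (x XOR y)
    return s, c1 | c2              # carry out = xy + c_in (x XOR y)

def add3(x_bits, y_bits):
    """Add x2 x1 x0 and y2 y1 y0 (most significant bit first).

    Returns the tuple (c2, s2, s1, s0), as in Fig. 15.4(b).
    """
    x2, x1, x0 = x_bits
    y2, y1, y0 = y_bits
    s0, c0 = half_adder(x0, y0)
    s1, c1 = full_adder(x1, y1, c0)
    s2, c2 = full_adder(x2, y2, c1)
    return (c2, s2, s1, s0)

print(add3((1, 0, 1), (0, 1, 1)))  # 101 + 011
```

Here `add3((1, 0, 1), (0, 1, 1))` adds 5 and 3 and returns `(1, 0, 0, 0)`, the binary form of 8.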

The next example introduces the main theme of this section—the minimal-sum-of-
                products representation of a Boolean function.

EXAMPLE 15.11
Find a gating network for the Boolean function

    $f(w, x, y, z) = \sum m(4, 5, 7, 8, 9, 11)$.

Consider the order of the variables as $w, x, y, z$. We can determine the d.n.f. of $f$
by writing each minterm number in binary notation and then finding its corresponding
fundamental conjunction. For example, (a) 5 = 0101, indicating the fundamental
conjunction $\bar{w}x\bar{y}z$; and (b) 7 = 0111, indicating $\bar{w}xyz$. Continuing in this way, we have
$f(w, x, y, z) = \bar{w}x\bar{y}\bar{z} + \bar{w}x\bar{y}z + \bar{w}xyz + w\bar{x}\bar{y}\bar{z} + w\bar{x}\bar{y}z + w\bar{x}yz$.
    Using properties of Boolean variables, we find that

    $f = \bar{w}xz(\bar{y} + y) + w\bar{x}\bar{y}(\bar{z} + z) + \bar{w}x\bar{y}\bar{z} + w\bar{x}yz$
      $= \bar{w}xz + w\bar{x}\bar{y} + \bar{w}x\bar{y}\bar{z} + w\bar{x}yz = \bar{w}x(z + \bar{y}\bar{z}) + w\bar{x}(\bar{y} + yz)$
      $= \bar{w}x(z + \bar{y}) + w\bar{x}(\bar{y} + z)$ (Why?) $= \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z)$,
so

    a) $f(w, x, y, z) = \bar{w}xz + \bar{w}x\bar{y} + w\bar{x}\bar{y} + w\bar{x}z$; or
    b) $f(w, x, y, z) = \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z)$.
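The algebra in Example 15.11 can be confirmed by comparing truth tables. The sketch below is an illustration only; the helper names `dnf`, `minimal_sop`, `factored`, and `equivalent` are hypothetical, and complements are written as `1 - v` on the bits 0 and 1.

```python
from itertools import product

MINTERMS = {4, 5, 7, 8, 9, 11}

def dnf(w, x, y, z):
    """f as given: 1 exactly on the listed minterm numbers (w is the high bit)."""
    return int(8 * w + 4 * x + 2 * y + z in MINTERMS)

def minimal_sop(w, x, y, z):
    """Form (a): w'xz + w'xy' + wx'y' + wx'z."""
    nw, nx, ny = 1 - w, 1 - x, 1 - y
    return (nw & x & z) | (nw & x & ny) | (w & nx & ny) | (w & nx & z)

def factored(w, x, y, z):
    """Form (b): w'x(y' + z) + wx'(y' + z)."""
    nw, nx, ny = 1 - w, 1 - x, 1 - y
    return (nw & x & (ny | z)) | (w & nx & (ny | z))

def equivalent(f, g, n=4):
    """True if f and g agree on all 2**n assignments."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n))

print(equivalent(dnf, minimal_sop), equivalent(dnf, factored))
```

Both comparisons come out true, so forms (a) and (b) represent the same function as the d.n.f.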

In Example 15.11, the result

    $f(w, x, y, z) = \bar{w}xz + \bar{w}x\bar{y} + w\bar{x}\bar{y} + w\bar{x}z$

is often referred to as a minimal-sum-of-products representation for the function
$f(w, x, y, z) = \sum m(4, 5, 7, 8, 9, 11)$. We see that this representation is a sum of four
products, where each product is made up of three literals. When we call such a
representation minimal we mean two things:
    1) Any possible further modification will result in a representation that is not a sum of
       such products; and
    2) If $f$ can be represented in a second way as a sum of products (of literals), then we
       will have at least four product terms, each with at least three literals.
       [Note: A minimal sum of products for a given Boolean function $f$ ($\neq 0$) need not be
       unique, as we shall find in Example 15.15.]

In this text our discussion of this idea will be somewhat informal. We shall not attempt to
                        prove that each nonzero Boolean function has such a minimal-sum-of-products representa-
                        tion. Instead we shall assume the existence of this representation and simply continue our
                        study of how to obtain such a result.
                            From this point on we shall consider an input of the form $\bar{w}$ as an exact input, which has
                        not passed through any gates, instead of regarding it as the result obtained from inputting
                        $w$ and passing it through an inverter.
                            In Fig. 15.5(a), we have a gating network implementing the d.n.f. of the function $f$
                        in Example 15.11. Part (b) of the figure is the gating network for $f$ as a minimal sum of
                        products. Figure 15.5(c) has a gating network for $f = \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z)$.
                            The network in part (c) has only four logic gates, whereas that in part (b) has five such
                        devices. Consequently, we may feel that the network in part (c) is better with regard to
                        minimizing cost because each extra gate increases the cost of production. However, even
                        though there are fewer inputs and fewer gates for the implementation in part (c), some of
                        the inputs (namely, y and z) must pass through three levels of gating before providing the
                        output f. For the minimal sum of products in part (b), there are only two levels of gating. In
                        the study of gating networks, outputs are considered instantaneous functions of the input.
                        In practice, however, each level of gating adds a delay in the development of the function
                        f. For high-speed digital equipment we want to minimize delay, so we opt for more speed
                        at the price of increased manufacturing cost.
                            It is this need to maximize speed that makes us want to represent a Boolean function
                        as a minimal sum of products. In order to accomplish this for functions of not more than
                        six variables, we use a pictorial method called the Karnaugh map, developed in 1953 by
                        Maurice Karnaugh (1924 — ). Karnaugh maps always produce forms with at most two levels
                        of gating, and we shall find that the d.n.f. of a Boolean function is a major key behind this
                        technique.
                            In simplifying the d.n.f. of $f$ in Example 15.11, we combined the two fundamental
                        conjunctions $\bar{w}x\bar{y}z$ and $\bar{w}xyz$ into the product term $\bar{w}xz$ because
                        $\bar{w}x\bar{y}z + \bar{w}xyz = \bar{w}xz(\bar{y} + y) = \bar{w}xz(1) = \bar{w}xz$. This indicates that if two fundamental conjunctions differ
                        in exactly one literal, then they can be combined into a product term with that literal missing.

Figure 15.5: Gating networks for $f(w, x, y, z)$ of Example 15.11: (a) the d.n.f.; (b) the minimal sum of products; (c) $f = \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z)$, with its three levels of gating marked Level 1, Level 2, Level 3.

For $g: B^4 \to B$, where $g(w, x, y, z) = wx\bar{y}\bar{z} + wx\bar{y}z + wxyz + wxy\bar{z}$, each fundamental
conjunction (except the first) differs from its predecessor in exactly one literal. Here
we can simplify $g$ as $g = wx\bar{y}(\bar{z} + z) + wxy(z + \bar{z}) = wx\bar{y} + wxy = wx(\bar{y} + y) = wx$.
We could have also written

    $g = wx(\bar{y}\bar{z} + \bar{y}z + yz + y\bar{z}) = wx(\bar{y} + y)(\bar{z} + z) = wx$.

The key to this reduction process is the recognition of pairs (quadruples, ..., $2^n$-tuples)
of fundamental conjunctions where any two adjacent terms differ in exactly one literal. If
$h: B^4 \to B$, and the d.n.f. of $h$ has 12 terms, can we move these terms around to recognize
the best reductions? The Karnaugh map organizes these terms for us.

Table 15.9

 w \ x | 0   1          w \ x | 0   1
   0   |                  0   |     1
   1   |     1            1   | 1   1
   (a) wx                 (b) w + x

We start with the case of two variables, $w$ and $x$. Table 15.9 shows the Karnaugh maps
for the functions $f(w, x) = wx$ and $g(w, x) = w + x$. (The 0's are suppressed in the tables
for these maps.)
    In part (a), the 1 interior to the table indicates the fundamental conjunction $wx$. This
occurs in the row for $w = 1$ and the column for $x = 1$, the one case when $wx = 1$. In
part (b), there are three 1's in the table. The top 1 is for $\bar{w}x$, which has the value 1 exactly
when $w = 0$, $x = 1$. The bottom two 1's are for $w\bar{x}$ and $wx$, as we read the bottom row
from left to right.
    Table 15.9(b) represents the d.n.f. $\bar{w}x + w\bar{x} + wx$. As a result of their adjacency in
the bottom row, the table indicates that $w\bar{x}$ and $wx$ differ in only one literal and can be
combined to yield $w$. By the idempotent law of addition (which is so crucial in working
with Karnaugh maps), we can use the same fundamental conjunction $wx$ a second time
in this reduction process. The adjacency in the second column of the table indicates the
combining of $\bar{w}x$ and $wx$ to get $x$. (In the $x$ column all possibilities for $w$, namely,
$w$ and $\bar{w}$, appear. This is a way to recognize $x$ as the result for that column.) Thus
Table 15.9(b) illustrates that $\bar{w}x + w\bar{x} + wx = \bar{w}x + w\bar{x} + wx + wx = (w\bar{x} + wx) + (\bar{w}x + wx) = w(\bar{x} + x) + (\bar{w} + w)x = w(1) + (1)x = w + x$.

EXAMPLE 15.12
We now consider three Boolean variables $w, x, y$. In Table 15.10, the first new idea we
encounter is in the column headings for $xy$. These are not the same as the headings we had
for the rows in the function tables. We see here, in going from left to right, that 00 differs
from 01 in exactly one place, 01 differs from 11 in exactly one place, 11 differs from 10 in
exactly one place, and, upon wrapping around, 10 differs from 00 in exactly one place.

Table 15.10

 w \ xy | 00   01   11   10
   0    |  1              1
   1    |  1         1

If $f(w, x, y) = \sum m(0, 2, 4, 7)$, then because $0 = 000\ (\bar{w}\bar{x}\bar{y})$, $2 = 010\ (\bar{w}x\bar{y})$,
$4 = 100\ (w\bar{x}\bar{y})$, and $7 = 111\ (wxy)$, we can represent these terms by placing 1's as shown in
Table 15.10. The 1 for $wxy$ is not adjacent to any other 1 in the table, so it is isolated; we
shall have $wxy$ as one of the summands in the minimal sum of products representing $f$. The
1 for $\bar{w}x\bar{y}$ (at the right end of the first row) is not isolated, for once again we consider the
table as wrapping around, making this 1 adjacent to the 1 for $\bar{w}\bar{x}\bar{y}$ (at the left end of the first
row). These combine (under addition) to give us $\bar{w}x\bar{y} + \bar{w}\bar{x}\bar{y} = \bar{w}\bar{y}(x + \bar{x}) = \bar{w}\bar{y}(1) = \bar{w}\bar{y}$.
Finally, the 1's in the column for $x = y = 0$ indicate a reduction of $w\bar{x}\bar{y} + \bar{w}\bar{x}\bar{y}$ to
$(w + \bar{w})\bar{x}\bar{y} = (1)\bar{x}\bar{y} = \bar{x}\bar{y}$. Hence, as a minimal sum of products, $f = wxy + \bar{w}\bar{y} + \bar{x}\bar{y}$.
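The column ordering 00, 01, 11, 10 and the placement of the 1's in Table 15.10 can be reproduced mechanically. The following sketch is illustrative only; `GRAY2`, `adjacent_headings`, and `kmap3` are hypothetical names, and the map is returned as two rows of 0's and 1's rather than with the 0's suppressed.

```python
GRAY2 = ["00", "01", "11", "10"]   # column headings for xy, as in Table 15.10

def adjacent_headings(headings):
    """True if consecutive headings (with wrap-around) differ in exactly one place."""
    n = len(headings)
    return all(
        sum(a != b for a, b in zip(headings[i], headings[(i + 1) % n])) == 1
        for i in range(n)
    )

def kmap3(minterms):
    """Rows w = 0, 1 of the map for f(w, x, y); columns follow GRAY2."""
    rows = []
    for w in (0, 1):
        row = []
        for xy in GRAY2:
            index = 4 * w + int(xy, 2)      # minterm number with w as high bit
            row.append(1 if index in minterms else 0)
        rows.append(row)
    return rows

print(adjacent_headings(GRAY2))                     # Gray-code order
print(adjacent_headings(["00", "01", "10", "11"]))  # ordinary binary order
for row in kmap3({0, 2, 4, 7}):
    print(row)
```

The ordinary binary order fails the adjacency test because 01 and 10 differ in two places, which is why the map uses the order 00, 01, 11, 10.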

EXAMPLE 15.13
From the respective parts of Table 15.11 we have
  a) $f(w, x, y) = \sum m(0, 2, 4, 6) = \sum m(0, 4) + \sum m(2, 6) = (\bar{w}\bar{x}\bar{y} + w\bar{x}\bar{y}) + (\bar{w}x\bar{y} + wx\bar{y}) = (\bar{w} + w)\bar{x}\bar{y} + (\bar{w} + w)x\bar{y} = (1)\bar{x}\bar{y} + (1)x\bar{y} = \bar{x}\bar{y} + x\bar{y} = (\bar{x} + x)\bar{y} = (1)\bar{y} = \bar{y}$,
     the only variable whose value does not change when the
     four terms designated by the 1's are considered. [The value of $y$ is 0 here, so
     $f(w, x, y) = \bar{y}$.]
  b) $f(w, x, y) = \sum m(0, 1, 2, 3) = \bar{w}\bar{x}\bar{y} + \bar{w}\bar{x}y + \bar{w}x\bar{y} + \bar{w}xy = \bar{w}(\bar{x}\bar{y} + \bar{x}y + x\bar{y} + xy) = \bar{w}(\bar{x} + x)(\bar{y} + y) = \bar{w}(1)(1) = \bar{w}$.
  c) $f(w, x, y) = \sum m(1, 2, 3, 5, 6, 7) = \sum m(1, 3, 5, 7) + \sum m(2, 3, 6, 7) = y + x$.

Table 15.11

 w \ xy | 00  01  11  10      w \ xy | 00  01  11  10      w \ xy | 00  01  11  10
   0    |  1           1        0    |  1   1   1   1        0    |      1   1   1
   1    |  1           1        1    |                       1    |      1   1   1
          (a)                          (b)                          (c)

Advancing to four variables, we consider the following example.

EXAMPLE 15.14
Find a minimal-sum-of-products representation for the function

    $f(w, x, y, z) = \sum m(0, 1, 2, 3, 8, 9, 10)$.

The Karnaugh map for $f$ in Table 15.12 combines the 1's in the four (adjacent) corners to
give the term $\bar{w}\bar{x}\bar{y}\bar{z} + \bar{w}\bar{x}y\bar{z} + w\bar{x}\bar{y}\bar{z} + w\bar{x}y\bar{z} = \bar{x}\bar{z}(\bar{w}\bar{y} + \bar{w}y + w\bar{y} + wy) = \bar{x}\bar{z}$. The
four 1's in the top row combine to give $\bar{w}\bar{x}$. (Using only the middle two 1's, we do not
make use of all the available adjacencies and get the term $\bar{w}\bar{x}z$, which has one more literal
than $\bar{w}\bar{x}$.) Finally, the 1 in the row ($w = 1$, $x = 0$) and the column ($y = 0$, $z = 1$) can be
combined with the 1 on its left, and these can then be combined with the first two 1's in
the top row to give $\bar{w}\bar{x}\bar{y}\bar{z} + \bar{w}\bar{x}\bar{y}z + w\bar{x}\bar{y}\bar{z} + w\bar{x}\bar{y}z = \bar{x}\bar{y}$. Hence, as a minimal sum of
products, $f(w, x, y, z) = \bar{x}\bar{z} + \bar{w}\bar{x} + \bar{x}\bar{y}$.

Table 15.12

 wx \ yz | 00   01   11   10
   00    |  1    1    1    1
   01    |
   11    |
   10    |  1    1         1

EXAMPLE 15.15
The map for $f(w, x, y, z) = \sum m(9, 10, 11, 12, 13)$ appears in Table 15.13. The only
1 in the table that has not been combined with another term is adjacent to a 1 on its right (this
combination yields $w\bar{x}z$) and to a 1 above it (this combination yields $w\bar{y}z$). Consequently,
we can represent $f$ as a minimal sum of products in two ways: $wx\bar{y} + w\bar{x}y + w\bar{x}z$ and
$wx\bar{y} + w\bar{x}y + w\bar{y}z$. This type of representation, then, is not unique. However, we should
observe that the same number of product terms and the same total number of literals appear
in each case.

Table 15.13

 wx \ yz | 00   01   11   10
   00    |
   01    |
   11    |  1    1
   10    |       1    1    1

EXAMPLE 15.16
There is a right way and there is a wrong way to use a Karnaugh map.
    Let $f(w, x, y, z) = \sum m(3, 4, 5, 7, 9, 13, 14, 15)$. In Table 15.14(a) we combine a
block of four 1's into the term $xz$. But when we account for the other four 1's, we do what
is shown in part (b). So the result in part (b) will yield $f$ as a sum of four terms (each with
three literals), whereas the method suggested in part (a) adds the extra (unneeded) term $xz$.

Table 15.14

 wx \ yz | 00  01  11  10        wx \ yz | 00  01  11  10
   00    |          1              00    |          1
   01    |  1   1   1              01    |  1   1   1
   11    |      1   1   1          11    |      1   1   1
   10    |      1                  10    |      1
 (a) block of four 1's for $xz$ marked      (b) 1's grouped in four pairs
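To see why the term $xz$ in Example 15.16 is unneeded, one can list the cells each product term covers. In the sketch below, a product term over $w, x, y, z$ is written as a four-character pattern in which `-` marks an absent variable. This encoding, the helper name `cells`, and the four pair patterns (our reading of the pairs in part (b) of Table 15.14, which the text does not spell out) are assumptions made for illustration.

```python
from itertools import product

def cells(pattern):
    """Minterm numbers covered by a product term such as '-1-1' (that is, xz)."""
    options = [(0, 1) if ch == "-" else (int(ch),) for ch in pattern]
    return {int("".join(map(str, bits)), 2) for bits in product(*options)}

F = {3, 4, 5, 7, 9, 13, 14, 15}

block = cells("-1-1")                       # the block of four 1's: xz
# Assumed pairs for part (b): w'yz, w'xy', wxy, wy'z
pairs = ["0-11", "010-", "111-", "1-01"]
covered = set().union(*(cells(p) for p in pairs))

print(sorted(block))      # cells of xz
print(covered == F)       # the four pairs already cover every 1
print(block <= covered)   # so xz contributes nothing new
```

Each cell of the block $xz$ already lies in one of the four pairs, so adding $xz$ only lengthens the sum.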

The following suggestions on the use of Karnaugh maps are based on what we have done
                             so far. We state them now so that they may be used for larger maps.

1) Start by combining those terms in the table where there is at most one possibility for
                                    simplification.
                                 2) Check the four corners of a table. They may contain adjacent 1’s even though the 1’s
                                    appear isolated.
                                 3) In all simplifications, try to obtain the largest possible block of adjacent 1’s in order to
                                    get a minimal product term. (Recall that 1’s can be used more than once, if necessary,
                                    because of the idempotent law of +.)
                                 4) If there is a choice in simplifying an entry in the table, try to use adjacent 1’s that
                                    have not been used in any prior simplification.

EXAMPLE 15.17
If $f(v, w, x, y, z) = \sum m(1, 5, 10, 11, 14, 15, 18, 26, 27, 30, 31)$, we construct two
4 x 4 tables, one for $v = 0$, the other for $v = 1$. (See Table 15.15.)

Table 15.15

 wx \ yz | 00  01  11  10        wx \ yz | 00  01  11  10
   00    |      1                  00    |              1
   01    |      1                  01    |
   11    |          1   1          11    |          1   1
   10    |          1   1          10    |          1   1
        (v = 0)                          (v = 1)

Following the order of the variables, we write, for example, 5 = 00101 in order to
indicate the need for a 1 in the second row and second column of the table for $v = 0$. The other
five 1's in the table where $v = 0$ are for the minterms for 1, 10, 11, 14, 15. The minterms for
18, 26, 27, 30, 31 are represented by the five 1's in the table where $v = 1$. After filling in all
the 1's, we see that the 1 in the first row, fourth column of the table for $v = 1$ can be combined
with another term in only one way, with $vw\bar{x}y\bar{z}$, yielding the product $v\bar{x}y\bar{z}$. This is also
true for the two 1's in the second column of the ($v = 0$) table. These give the product $\bar{v}\bar{w}\bar{y}z$.
The block of eight 1's yields $wy$, and we have $f(v, w, x, y, z) = wy + \bar{v}\bar{w}\bar{y}z + v\bar{x}y\bar{z}$.

A function $f$ of the six variables $t, v, w, x, y$, and $z$ requires four tables, one for each of
the cases (a) $t = 0$, $v = 0$; (b) $t = 0$, $v = 1$; (c) $t = 1$, $v = 1$; and (d) $t = 1$, $v = 0$. Beyond
six variables, this method becomes overly complicated. Another procedure, the
Quine-McCluskey Method, can be used. For a large number of variables the method is tedious to
perform by hand, but it is a systematic procedure suitable for computer implementation,
particularly for computers possessing some type of "binary compare" command. (More
about this technique is given in Chapter 7 of Reference [3].)
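The text does not describe the Quine-McCluskey Method in detail, so the following is only a rough sketch of its first phase (finding the product terms that cannot be combined any further), written under our own assumptions. Terms are strings over `0`, `1`, and `-`, where `-` marks a variable that has been eliminated; the function names are hypothetical.

```python
from itertools import combinations

def combine(a, b):
    """Merge two terms that differ in exactly one 0/1 position, else return None."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms, n):
    """Repeatedly combine terms; keep every term that never combines."""
    terms = {format(m, f"0{n}b") for m in minterms}
    primes = set()
    while terms:
        merged, used = set(), set()
        for a, b in combinations(sorted(terms), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= terms - used     # terms that could not be combined further
        terms = merged
    return primes

# Example 15.15: f(w, x, y, z) = sum m(9, 10, 11, 12, 13)
print(sorted(prime_implicants({9, 10, 11, 12, 13}, 4)))
```

For Example 15.15 this yields the four patterns `10-1`, `1-01`, `101-`, and `110-`, that is, $w\bar{x}z$, $w\bar{y}z$, $w\bar{x}y$, and $wx\bar{y}$. Choosing a smallest subset of these that covers every minterm is a separate step, not shown here; in this case either $w\bar{x}z$ or $w\bar{y}z$ can cover minterm 9, which accounts for the two minimal sums found in that example.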

We close this section with an example involving the dual concept— namely, a minimal
                                 product of sums.

EXAMPLE 15.18
For $g(w, x, y, z) = \prod M(1, 5, 7, 9, 10, 13, 14, 15)$, this time we place a 0 in each of the
positions for the binary equivalents of the maxterms listed. This yields the results shown in
Table 15.16 (where the 1's are suppressed).

Table 15.16

 wx \ yz | 00   01   11   10
   00    |       0
   01    |       0    0
   11    |       0    0    0
   10    |       0         0

The 0 in the lower right-hand corner can only be combined with the 0 above it, and
so we have $(\bar{w} + x + \bar{y} + z)(\bar{w} + \bar{x} + \bar{y} + z) = \bar{w} + \bar{y} + z + x\bar{x} = (\bar{w} + \bar{y} + z) + 0 = \bar{w} + \bar{y} + z$.
The block of four 0's (for the maxterms for 5, 7, 13, 15) simplifies to $\bar{x} + \bar{z}$,
whereas the four 0's (for the maxterms for 1, 5, 9, 13) in the second column yield $y + \bar{z}$. So
$g(w, x, y, z) = (\bar{w} + \bar{y} + z)(\bar{x} + \bar{z})(y + \bar{z})$, a minimal product of sums.
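The product of sums in Example 15.18 can be checked by listing the assignments on which it evaluates to 0 and comparing them with the maxterm numbers. This is a small illustrative sketch; `g_pos` and `zeros` are hypothetical names, and complements are written as `1 - v`.

```python
from itertools import product

MAXTERMS = {1, 5, 7, 9, 10, 13, 14, 15}

def g_pos(w, x, y, z):
    """(w' + y' + z)(x' + z')(y + z') evaluated on bits 0/1."""
    nw, nx, ny, nz = 1 - w, 1 - x, 1 - y, 1 - z
    return (nw | ny | z) & (nx | nz) & (y | nz)

# Row numbers (w is the high bit) where the product of sums is 0.
zeros = {
    8 * w + 4 * x + 2 * y + z
    for w, x, y, z in product((0, 1), repeat=4)
    if g_pos(w, x, y, z) == 0
}

print(sorted(zeros))
print(zeros == MAXTERMS)
```

The set of zeros coincides with the eight listed maxterm numbers, so the three-factor product represents $g$.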

EXERCISES 15.2

1. Using inverters, AND gates, and OR gates, construct the gates shown in Fig. 15.6.
2. Using only NAND* gates (see Fig. 15.6), construct the inverter, AND gate, and OR gate.
3. Answer Exercise 2, replacing NAND by NOR.
4. Using inverters, AND gates, and OR gates, construct gating networks for
      a) $f(x, y, z) = x\bar{z} + y\bar{z} + x$
      b) $g(x, y, z) = (x + \bar{z})(y + \bar{z})\bar{x}$
      c) $h(x, y, z) = \overline{(xy \oplus yz)}$
5. For the network in Fig. 15.7, express $f$ as a function of $w, x, y, z$.
6. Implement the half-adder of Fig. 15.3 using only (a) NAND gates; (b) NOR gates.
7. For each of the networks in Fig. 15.8 express the output in terms of the Boolean variables $x$, $y$ or their complements. Then use the expression for the output to simplify the given network.
8. For each of the following Boolean functions $f$, design a two-level gating network for $f$ as a minimal sum of products.
      a) $f: B^3 \to B$, where $f(x, y, z) = 1$ if and only if exactly two of the variables have the value 1.
      b) $f: B^4 \to B$, where $f(w, x, y, z) = 1$ if and only if an odd number of variables have the value 1.
9. Find a minimal-sum-of-products representation for
      a) $f(w, x, y) = \sum m(1, 2, 5, 6)$
      b) $f(w, x, y) = \prod M(0, 1, 4, 5)$
      c) $f(w, x, y, z) = \sum m(0, 2, 5, 7, 8, 10, 13, 15)$
      d) $f(w, x, y, z) = \sum m(5, 6, 8, 11, 12, 13, 14, 15)$
      e) $f(w, x, y, z) = \sum m(7, 9, 10, 11, 14, 15)$
      f) $f(v, w, x, y, z) = \sum m(1, 2, 3, 4, 10, 17, 18, 19, 22, 23, 27, 28, 30, 31)$
10. Obtain a minimal-product-of-sums representation for $f(w, x, y, z) = \prod M(0, 1, 2, 4, 5, 10, 12, 13, 14)$.
11. Let $f: B^n \to B$ be a function of the Boolean variables $x_1, x_2, \ldots, x_n$. Determine $n$ if the number of 1's needed to express $x_1$ in the Karnaugh map for $f$ is (a) 2; (b) 4; (c) 8; (d) $2^k$, for $k \in Z^+$ with $1 \le k \le n - 1$.
12. If $g: B^7 \to B$ is a Boolean function of the Boolean variables $x_1, x_2, \ldots, x_7$, how many 1's are needed in the Karnaugh map of $g$ in order to represent the product term (a) $x_1$; (b) $x_1x_2$; (c) $x_1x_2x_3$; (d) $x_1x_3x_5x_7$?
13. In each of the following, $f: B^4 \to B$, where the Boolean variables (in order) are $w$, $x$, $y$, and $z$. Determine $|f^{-1}(0)|$ and $|f^{-1}(1)|$ if, as a minimal sum of products, $f$ reduces to
      a) $x$          b) $wy$          c) $wyz$
      d) $x + y$      e) $xy + z$      f) $xyz + w$

*The NAND gate is constructed in a very simple manner from transistors, both in the old-fashioned
technology of semiconductors as well as in the more recent techniques of silicon chip fabrication. Furthermore, most of
the gating networks that represent what is actually happening inside of today's computers contain large numbers
of these NAND gates.

Figure 15.6: The EXCLUSIVE-OR gate, $f(x, y) = x \oplus y$; the NAND gate, $g(x, y) = \overline{xy}$; the NOR gate, $h(x, y) = \overline{x + y}$.

Figure 15.7: Gating network with inputs $w, x, y, z$ (for Exercise 5).

Figure 15.8: Gating networks in $x$ and $y$ (for Exercise 7).
15.3
Further Applications:
Don't-Care Conditions
Our objective now is to use the ideas we have developed in the first two sections in a variety
of applications.

EXAMPLE 15.19
As head of the church bazaar, Paula has volunteered to leave her automobile dealership
early one evening in order to bake a cake that will be sold at the bazaar. Members of the
bazaar committee volunteer to donate the needed ingredients as shown in Table 15.17.

Table 15.17

            | Flour | Milk | Butter | Pecans | Eggs
 Sue        |   x   |      |   x    |        |
 Dorothy    |       |      |   x    |   x    |
 Bettie     |   x   |  x   |        |        |
 Theresa    |       |  x   |        |        |  x
 Ruthanne   |       |  x   |   x    |   x    |

Paula sends her daughter Amy to pick up the ingredients. Write a Boolean expression to
                 help Paula determine which (minimal) sets of volunteers she should consider so that Amy
                 can collect all of the necessary ingredients.
                     Let s, d, b, t, and r denote five Boolean variables corresponding, respectively, to the
                 women listed in the first column of the table. To get the flour, Amy must visit Sue or Bettie.
                 In Boolean terminology, we can say that flour determines the sum s + b. This term will be
                 part of a product of sums. For the other ingredients, the following sums denote the choices.

    milk: $b + t + r$      butter: $s + d + r$      pecans: $d + r$      eggs: $t$

To answer the question posed here, we seek a minimal sum of products for the function
$f(s, d, b, t, r) = (s + b)(b + t + r)(s + d + r)(d + r)t$. The answer can be obtained by
multiplying everything out and then simplifying the result, or by using a Karnaugh map.
This time we'll use the map (in Table 15.18).

Table 15.18

 db \ tr | 00  01  11  10        db \ tr | 00  01  11  10
   00    |  0   0   0   0          00    |  0   0   1   0
   01    |  0   0   1   0          01    |  0   0   1   0
   11    |  0   0   1   1          11    |  0   0   1   1
   10    |  0   0   0   0          10    |  0   0   1   1
        (s = 0)                          (s = 1)

We are starting with f as a product (not minimal) of sums. Consequently, we first fill in
                 the 0’s of the table as follows: Here s + b, for example, is represented by the eight 0’s in
                 the first and fourth rows of the table for s = 0— these are the eight assignments for s, d, b,
730          Chapter 15 Boolean Algebra and Switching Functions

t, r where s + b has the value 0; for t we need the 16 0’s in the first two columns of both
                              tables. After filling in the 0’s for the other three sums in the product, we then place a 1 in
                              the nine remaining spaces and arrive at the table shown. Now we need a minimal sum of
                              products for the nine 1’s in the table. We find the result is srt + sdt + brt + dbt. (Verify
                              this.) Therefore, Amy can be sent to collect the ingredients in one of four ways. She may
                              call upon Sue, Ruthanne, and Theresa — or, perhaps, Dorothy, Bettie, and Theresa — or she
                              may follow through with one of her other two options.
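
The claim that srt + sdt + brt + dbt represents the same function as the given product of sums can be checked by brute force over all 32 assignments. The following is a minimal sketch; the function names are illustrative and do not come from the text.

```python
from itertools import product

def f_product_of_sums(s, d, b, t, r):
    # The product of sums from the text: (s+b)(b+t+r)(s+d+r)(d+r)t
    return bool((s or b) and (b or t or r) and (s or d or r) and (d or r) and t)

def f_sum_of_products(s, d, b, t, r):
    # The sum of products stated in the text: srt + sdt + brt + dbt
    return bool((s and r and t) or (s and d and t) or (b and r and t) or (d and b and t))

assignments = list(product([0, 1], repeat=5))
# Rows of the truth table where the product of sums equals 1
ones = [a for a in assignments if f_product_of_sums(*a)]
# True when the two expressions agree on every assignment
agree = all(f_product_of_sums(*a) == f_sum_of_products(*a) for a in assignments)
print(len(ones), agree)
```

This confirms only that the two expressions agree and that the table has nine 1's; it does not by itself prove that the sum of products is minimal.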

In our next application, we examine a certain property of graphs. This property was
                              introduced earlier in Supplementary Exercise 10 of Chapter 11. The development here,
                              however, does not rely on that prior presentation.

Definition 15.4         Let G = (V, E) denote a graph (undirected) with vertex set V and edge set E. A subset D
                              of V is called a dominating set for G if for every v ∈ V, either v ∈ D or v is adjacent to a
                              vertex in D.

For the graph shown in Fig. 15.9, the sets {a, d}, {a, c, e}, and {b, d, e, f} are examples
of dominating sets. The set {a, c, e} is a minimal dominating set, for if any of the three
vertices a, c, or e is removed, the remaining two no longer dominate the graph. The set
{a, d} is also minimal, but {b, d, e, f} is not because {b, d, e} already dominates G.

[Figure 15.9: the graph G on the vertices a, b, c, d, e, f]

EXAMPLE 15.20
For the graph shown in Fig. 15.9, let the vertices represent cities and the edges highways.
We wish to build hospitals in some of these cities so that each city either has a hospital or
is adjacent to a city that does. In how many ways can this be accomplished by building a
minimal number of hospitals in each case?
   To answer this question, we need the minimal dominating sets for G. Consider vertex
a. To guarantee that a will satisfy our objective, we must build a hospital in a, or b, or d,
or f (since b, d, and f are all adjacent to a). Hence we have the term a + b + d + f. For
b to satisfy our objective, we generate the term a + b + c + d. Continuing with the other
four locations, we find that the answer is then a minimal-sum-of-products representation for
the Boolean function g(a, b, c, d, e, f) = (a + b + d + f)(a + b + c + d)(b + c + d) ·
(a + b + c + d + e)(d + e + f)(a + e + f). Using the properties of Boolean variables,
we have

g = (a + b + d + f)(b + c + d)(d + e + f)(a + e + f)
                                        [Absorption Law]
  = [(a + f)c + (b + d)][da + (e + f)]
                                        [Distributive Law of + over ·, and the Commutative Law of +]
  = [ac + fc + b + d][da + e + f]
                                        [Distributive Law of · over +]
  = acda + ace + acf + fcda + fce + fcf + bda + be + bf + dda + de + df
                                        [Distributive Law of · over +]
  = ace + (acf + acdf + cef + cf) + (acd + abd + ad) + be + bf + de + df
                                        [Commutative and Associative Laws of + and ·, and the Idempotent Law of ·]
  = ace + cf + ad + be + bf + de + df
                                        [Absorption Law]

Consequently, in six of the cases the objective can be achieved by building only two
                hospitals. If a and c have the largest populations and we want to locate hospitals in each of
                these cities, then we would also have to construct a hospital at e.
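
The seven products obtained above can be cross-checked by enumerating vertex subsets directly. The sketch below assumes the closed neighborhoods read off from the six sums in the example (for instance, a + b + d + f says that a is dominated by a, b, d, or f); the helper names are illustrative.

```python
from itertools import combinations

# Closed neighborhoods taken from the six sums in Example 15.20
closed_nbhd = {
    "a": set("abdf"),
    "b": set("abcd"),
    "c": set("bcd"),
    "d": set("abcde"),
    "e": set("def"),
    "f": set("aef"),
}

def dominates(D):
    # Every vertex must be in D or adjacent to a vertex of D
    return all(closed_nbhd[v] & D for v in closed_nbhd)

def minimal_dominating_sets():
    vertices = sorted(closed_nbhd)
    found = []
    # Sizes are visited in increasing order, so a dominating set is minimal
    # exactly when it has no proper subset among the sets already found.
    for size in range(1, len(vertices) + 1):
        for combo in combinations(vertices, size):
            D = set(combo)
            if dominates(D) and not any(m < D for m in found):
                found.append(D)
    return found

result = sorted("".join(sorted(D)) for D in minimal_dominating_sets())
print(result)  # ['ace', 'ad', 'be', 'bf', 'cf', 'de', 'df']
```

Six of the seven sets have two vertices, in agreement with the count given in the text.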

The next application we shall examine introduces the notion of “don’t-care” conditions.

EXAMPLE 15.21
The four input lines for the gating network shown in Fig. 15.10 provide the binary equivalents
of the digits 0, 1, 2, ..., 9, with each number represented as abce (e is least significant).
Construct a gating network with two levels of gating such that the output function f
equals 1 for the input that represents the digits 0, 3, 6, 9 (that is, f detects digits divisible
by 3).
[Figure 15.10: gating network labeled "Multiple of three detector", with input lines a, b, c, e and output f]

Table 15.19

a b c e | f          a b c e | f
0 0 0 0 | 1          1 0 0 0 | 0
0 0 0 1 | 0          1 0 0 1 | 1
0 0 1 0 | 0          1 0 1 0 | x
0 0 1 1 | 1          1 0 1 1 | x
0 1 0 0 | 0          1 1 0 0 | x
0 1 0 1 | 0          1 1 0 1 | x
0 1 1 0 | 1          1 1 1 0 | x
0 1 1 1 | 0          1 1 1 1 | x

Before concluding that f = 0 for the other 12 cases, we examine Table 15.19, where an
“x” appears for the value of f in the last six cases. These input combinations do not occur
(because of certain external constraints), so we don’t care what the value of f is in these
situations. For such occurrences, the outputs are referred to as unspecified and f is called
incompletely specified. Therefore, we write $f = \sum m(0, 3, 6, 9) + d(10, 11, 12, 13, 14, 15)$,
where d(10, 11, 12, 13, 14, 15) denotes the six don’t-care conditions for the rows
with the binary labels for 10, 11, 12, 13, 14, 15. When seeking a minimal-sum-of-products
representation for f, we can use any or all of these don’t-care conditions in the simplification
process.
    From the Karnaugh map in Table 15.20, we write f as a minimal sum of products,
obtaining

$f = \bar a \bar b \bar c \bar e + \bar b c e + b c \bar e + a e.$

The first summand in f is for recognition of 0; $\bar b c e$ provides recognition for 3 because
it stands for 0011 ($\bar a \bar b c e$), since 1011 ($a \bar b c e$) does not occur. Likewise $b c \bar e$ is needed to
recognize 6, whereas $ae$ takes care of 9. Figure 15.11 provides the interior details (minus
the inverters) of Fig. 15.10. (Note that in Table 15.20 there are some don’t-care conditions
that were not used.)

[Figure 15.11: two-level gating network for f]

Table 15.20

ab \ ce | 00 | 01 | 11 | 10
00      | 1  |    | 1  |
01      |    |    |    | 1
11      | x  | x  | x  | x
10      |    | 1  | x  | x
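
As a check on Example 15.21, the sketch below evaluates the four-term sum of products on every digit 0 through 9 and then reports the value it happens to take on the six don't-care inputs. The name `f_min` and the bit-unpacking helper are illustrative choices.

```python
def bits(n):
    # Binary digits a, b, c, e of n, with e least significant
    return (n >> 3) & 1, (n >> 2) & 1, (n >> 1) & 1, n & 1

def f_min(a, b, c, e):
    # a'b'c'e' + b'ce + bce' + ae, writing x' for the complement of x
    na, nb, nc, ne = 1 - a, 1 - b, 1 - c, 1 - e
    return (na & nb & nc & ne) | (nb & c & e) | (b & c & ne) | (a & e)

# On the digits that can occur, f must detect the multiples of 3.
digits_ok = all(f_min(*bits(n)) == (1 if n % 3 == 0 else 0) for n in range(10))

# Values taken on the don't-care rows 10..15; a 0 marks a don't-care
# condition that the simplification did not use.
dont_care_values = {n: f_min(*bits(n)) for n in range(10, 16)}
print(digits_ok, dont_care_values)
```

The don't-care rows on which `f_min` evaluates to 0 are exactly those left unused in Table 15.20.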

We close this section with one more example on how to use don’t-care conditions.

EXAMPLE 15.22
Find a minimal-sum-of-products representation for the incompletely specified Boolean
function

$f(w, x, y, z) = \sum m(0, 1, 2, 8, 15) + d(9, 11, 12).$

Consider the Karnaugh map in Table 15.21. As in the previous examples each minterm
is represented by a 1 in the table; each don’t-care condition is designated by an x. The
1 representing $\bar w \bar x y \bar z$ (at the right end of the first row) can be simplified in only one
way: using the “adjacent” 1 for $\bar w \bar x \bar y \bar z$. This gives us
$\bar w \bar x y \bar z + \bar w \bar x \bar y \bar z = \bar w \bar x \bar z (y + \bar y) = \bar w \bar x \bar z$.
Likewise the 1 for the fundamental conjunction $wxyz$ is only adjacent to an x,
the one for the don’t-care condition $w \bar x y z$. This adjacency simplifies to $wxyz + w \bar x y z = wyz$.
Finally, the remaining 1’s for the fundamental conjunctions $\bar w \bar x \bar y z$ and $w \bar x \bar y \bar z$ can be
used with the minterm for 0 (namely, $\bar w \bar x \bar y \bar z$) and the don’t-care condition $w \bar x \bar y z$.
This gives us
$\bar w \bar x \bar y \bar z + \bar w \bar x \bar y z + w \bar x \bar y \bar z + w \bar x \bar y z = (\bar w \bar z + \bar w z + w \bar z + w z)\bar x \bar y = (\bar w + w)(\bar z + z)\bar x \bar y = \bar x \bar y$.

Table 15.21

wx \ yz | 00 | 01 | 11 | 10
00      | 1  | 1  |    | 1
01      |    |    |    |
11      | x  |    | 1  |
10      | 1  | x  | x  |

[Note the following:

1) In the third simplification we used the fundamental conjunction $\bar w \bar x \bar y \bar z$ a second
   time. It was also used in the first simplification since it is adjacent to the fundamental
   conjunction $\bar w \bar x y \bar z$. However, this does not present a problem here because of the
   idempotent law of +.

2) The don’t-care condition for $w x \bar y \bar z$ was not used.]

Consequently, as a minimal sum of products,
$f(w, x, y, z) = \sum m(0, 1, 2, 8, 15) + d(9, 11, 12) = \sum m(0, 1, 2, 8, 15) + d(9, 11) = \bar w \bar x \bar z + w y z + \bar x \bar y$.
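
The result of Example 15.22 can also be recovered mechanically. The following sketch is a brute-force search over all product terms in four variables rather than the Karnaugh-map method of the text: it keeps the terms that cover only minterms or don't-care rows, reduces them to prime implicants, and looks for the smallest collections that cover every minterm. The primed notation (x' for the complement of x) and all names are illustrative.

```python
from itertools import combinations, product

MINTERMS = {0, 1, 2, 8, 15}
DONT_CARES = {9, 11, 12}
ALLOWED = MINTERMS | DONT_CARES
NAMES = "wxyz"

def cells(term):
    """Rows 0..15 covered by a product term; term[i] is 0, 1, or None (variable absent)."""
    rows = set()
    for row_bits in product([0, 1], repeat=4):
        if all(t is None or t == b for t, b in zip(term, row_bits)):
            rows.add(int("".join(map(str, row_bits)), 2))
    return rows

# Implicants: product terms that never cover a row where f must be 0.
implicants = [t for t in product([0, 1, None], repeat=4) if cells(t) <= ALLOWED]
# Prime implicants: implicants not strictly contained in a larger implicant.
primes = [t for t in implicants if not any(cells(t) < cells(u) for u in implicants)]

def smallest_covers():
    # Try collections of 1, 2, 3, ... prime implicants until the minterms are covered.
    for k in range(1, len(primes) + 1):
        found = [c for c in combinations(primes, k)
                 if MINTERMS <= set().union(*(cells(t) for t in c))]
        if found:
            return found
    return []

def show(term):
    return "".join(n if t == 1 else n + "'" for n, t in zip(NAMES, term) if t is not None)

covers = smallest_covers()
print([sorted(show(t) for t in c) for c in covers])
```

The search minimizes the number of product terms only; for this function that single criterion already singles out the three terms found in the example.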

EXERCISES 15.3

1. For his tenth birthday, Mona wants to buy her son Jason some stamps for his collection.
   At the hobby shop she finds six different packages (which we shall call u, v, w, x, y, z).
   The kinds of stamps in each of these packages are shown in Table 15.22.
   Determine all minimal combinations of packages Mona can buy so that Jason will get
   some stamps from all four geographical locations.

   Table 15.22

   Package | United States | European | Asian | African
   u       |               | ✓        |       | ✓
   v       | ✓             |          | ✓     |
   w       | ✓             | ✓        |       |
   x       | ✓             |          |       |
   y       | ✓             |          |       | ✓
   z       |               |          | ✓     | ✓

2. Rework Example 15.20 using a Karnaugh map on six variables.

3. Find a minimal-sum-of-products representation for
   a) $f(w, x, y, z) = \sum m(1, 3, 5, 7, 9) + d(10, 11, 12, 13, 14, 15)$
   b) $f(w, x, y, z) = \sum m(0, 5, 6, 8, 13, 14) + d(4, 9, 11)$
   c) $f(v, w, x, y, z) = \sum m(0, 2, 3, 4, 5, 6, 12, 19, 20, 24, 28) + d(1, 13, 16, 29, 31)$

4. The four input lines for the gating network shown in Fig. 15.12 provide the binary
   equivalents of the numbers 0, 1, 2, ..., 15, where each number is represented as abce,
   with e the least significant bit.
   a) Determine the d.n.f. of f, whose value is 1 for abce prime, and 0 otherwise.
   b) Draw the two-level gating network for f as a minimal sum of products.
   c) We are informed that the given network is part of a larger network and that, as a
      result, the binary equivalents of the numbers 10 through 15 are never provided as
      input. Design a two-level gating network for f under these circumstances.

   [Figure 15.12: gating network labeled "Prime number detector", with output f]

5. Determine all minimal dominating sets for the graph G shown in Fig. 15.13.

   [Figure 15.13: the graph G]

15.4
The Structure of a Boolean Algebra (Optional)

In this last section we analyze the structure of a Boolean algebra and determine those
$m \in \mathbf{Z}^+$ for which there is a Boolean algebra of m elements.

Definition 15.5   Let $\mathcal{B}$ be a nonempty set that contains two special elements 0 (the zero element) and 1 (the
unity, or one, element) and on which we define closed binary operations $+$, $\cdot$, and a monary
(or unary) operation $\bar{\ }$. Then $(\mathcal{B}, +, \cdot, \bar{\ }, 0, 1)$ is called a Boolean algebra if the following
conditions are satisfied for all $x, y, z \in \mathcal{B}$.

a) $x + y = y + x$              a)′ $xy = yx$                      Commutative Laws
b) $x(y + z) = xy + xz$         b)′ $x + yz = (x + y)(x + z)$      Distributive Laws
c) $x + 0 = x$                  c)′ $x1 = x \cdot 1 = x$           Identity Laws
d) $x + \bar{x} = 1$            d)′ $x\bar{x} = x \cdot \bar{x} = 0$   Inverse Laws
e) $0 \neq 1$

As seen in Definition 15.5, we often write $xy$ for $x \cdot y$. When the operations and identity
elements are known, we write $\mathcal{B}$ instead of $(\mathcal{B}, +, \cdot, \bar{\ }, 0, 1)$.
   From our past experience we have the following examples.

EXAMPLE 15.23
If $\mathcal{U}$ is a (finite) set, then $\mathcal{B} = \mathcal{P}(\mathcal{U})$ is a Boolean algebra where for $A, B \subseteq \mathcal{U}$, we have
$A + B = A \cup B$, $AB = A \cap B$, $\bar{A}$ = the complement of A (in $\mathcal{U}$), and where $\emptyset$ is the zero
element and $\mathcal{U}$ is the unity.

EXAMPLE 15.24
For $n \in \mathbf{Z}^+$, $F_n = \{f\colon B^n \to B\}$, the set of Boolean functions on n Boolean variables, is a
Boolean algebra where $+$, $\cdot$, and $\bar{\ }$ are as defined in Definition 15.2, and where the zero
element is the constant function 0, while the constant function 1 is the one element.

Let us now examine a new type of Boolean algebra.

EXAMPLE 15.25
Let $\mathcal{B}$ be the set of all positive integer divisors of 30: $\mathcal{B} = \{1, 2, 3, 5, 6, 10, 15, 30\}$. For
all $x, y \in \mathcal{B}$, define $x + y = \operatorname{lcm}(x, y)$; $xy = \gcd(x, y)$; and $\bar{x} = 30/x$. Then with 1 as
the zero element and 30 as the unity element, one can verify that $(\mathcal{B}, +, \cdot, \bar{\ }, 1, 30)$ is a
Boolean algebra. We shall establish one of the distributive laws for this Boolean algebra
and leave the other conditions for the reader to check.
   For the first distributive law we want to show that

$\gcd(x, \operatorname{lcm}(y, z)) = \operatorname{lcm}(\gcd(x, y), \gcd(x, z)),$

for all $x, y, z \in \mathcal{B}$. In order to do so we write

$x = 2^{k_1} 3^{k_2} 5^{k_3}, \quad y = 2^{m_1} 3^{m_2} 5^{m_3}, \quad \text{and} \quad z = 2^{n_1} 3^{n_2} 5^{n_3},$

where $0 \le k_i, m_i, n_i \le 1$ for all $1 \le i \le 3$.
   Then $\operatorname{lcm}(y, z) = 2^{s_1} 3^{s_2} 5^{s_3}$, where $s_i = \max\{m_i, n_i\}$, for all $1 \le i \le 3$, and so
$\gcd(x, \operatorname{lcm}(y, z)) = 2^{t_1} 3^{t_2} 5^{t_3}$, where $t_i = \min\{k_i, \max\{m_i, n_i\}\}$, for all $1 \le i \le 3$. Also,
$\gcd(x, y) = 2^{u_1} 3^{u_2} 5^{u_3}$, where $u_i = \min\{k_i, m_i\}$, when $1 \le i \le 3$, and $\gcd(x, z) = 2^{v_1} 3^{v_2} 5^{v_3}$
with $v_i = \min\{k_i, n_i\}$ for all $1 \le i \le 3$. So $\operatorname{lcm}(\gcd(x, y), \gcd(x, z)) = 2^{w_1} 3^{w_2} 5^{w_3}$, where
$w_i = \max\{u_i, v_i\}$, for all $1 \le i \le 3$.
   Therefore, for each $i \in \{1, 2, 3\}$, $w_i = \max\{u_i, v_i\} = \max\{\min\{k_i, m_i\}, \min\{k_i, n_i\}\}$,
and $t_i = \min\{k_i, \max\{m_i, n_i\}\}$. To verify the result, we need to show that $w_i = t_i$ for all
$1 \le i \le 3$. If $k_i = 0$, then $w_i = 0 = t_i$. If $k_i = 1$, then $w_i = \max\{m_i, n_i\} = t_i$. This exhausts
all possibilities, so $w_i = t_i$ for $1 \le i \le 3$ and

$\gcd(x, \operatorname{lcm}(y, z)) = \operatorname{lcm}(\gcd(x, y), \gcd(x, z)).$

If we analyze this result further, we find that 30 can be replaced by any number $m = p_1 p_2 p_3$,
where $p_1, p_2, p_3$ are distinct primes. In fact, the result follows for the set of all
divisors of $p_1 p_2 \cdots p_n$, a product of n distinct primes. (Note that such a product is
square-free; that is, there is no $k \in \mathbf{Z}^+$, $k > 1$, with $k^2$ dividing it.)
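
The conditions left to the reader in Example 15.25 can be checked exhaustively, since the algebra has only eight elements. The sketch below tests conditions (a) through (e) of Definition 15.5 for the divisors of an arbitrary m, using lcm as +, gcd as ·, and m/x as the complement; the function name `is_boolean_algebra` is an illustrative choice, and closure is taken for granted because the gcd and lcm of two divisors of m are again divisors of m.

```python
from math import gcd
from itertools import product

def lcm(x, y):
    return x * y // gcd(x, y)

def is_boolean_algebra(m):
    """Check conditions (a)-(e) of Definition 15.5 on the positive divisors of m."""
    B = [d for d in range(1, m + 1) if m % d == 0]
    zero, one = 1, m

    def comp(x):
        return m // x

    if zero == one:                                   # condition (e)
        return False
    for x in B:
        if lcm(x, zero) != x or gcd(x, one) != x:     # identity laws
            return False
        if lcm(x, comp(x)) != one or gcd(x, comp(x)) != zero:  # inverse laws
            return False
    for x, y, z in product(B, repeat=3):
        if lcm(x, y) != lcm(y, x) or gcd(x, y) != gcd(y, x):   # commutative laws
            return False
        if gcd(x, lcm(y, z)) != lcm(gcd(x, y), gcd(x, z)):     # x(y + z) = xy + xz
            return False
        if lcm(x, gcd(y, z)) != gcd(lcm(x, y), lcm(x, z)):     # x + yz = (x + y)(x + z)
            return False
    return True

print(is_boolean_algebra(30), is_boolean_algebra(210), is_boolean_algebra(12))
```

The value 12 is included as a contrast: it is not square-free, and the inverse laws fail there (for example, lcm(2, 12/2) = 6, not 12).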

EXAMPLE 15.26
A word about the propositional calculus. If p, q are two primitive propositions, we may
feel that the collection of all propositions obtained from p, q, using $\lor$, $\land$, and $\neg$, should be
a Boolean algebra. After all, just look at the laws of logic and the way they compare with
the comparable results for set theory and Boolean functions. There is one main difference.
In our study of logic we found, for example, that $p \land q \iff q \land p$, not that $p \land q = q \land p$.
To get around this we define a relation $\mathcal{R}$ on the set S of all propositions so obtained from
p, q, where $s_1 \mathrel{\mathcal{R}} s_2$ if $s_1 \iff s_2$. Then $\mathcal{R}$ is an equivalence relation on S and partitions S, in
this case, into 16 equivalence classes. If we define $+$, $\cdot$, and $\bar{\ }$ on these equivalence classes
by $[s_1] + [s_2] = [s_1 \lor s_2]$, $[s_1][s_2] = [s_1 \land s_2]$, and $\overline{[s_1]} = [\neg s_1]$, and if we recognize $[T_0]$
as the one element and $[F_0]$ as the zero element, then we get a Boolean algebra.
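
The count of 16 equivalence classes can be reproduced by identifying each class with a truth table on the four assignments to (p, q), since two propositions are logically equivalent exactly when their truth tables coincide. The sketch below starts from the tables of p and q and closes the collection under negation, disjunction, and conjunction; the names are illustrative.

```python
from itertools import product

ROWS = list(product([False, True], repeat=2))  # the four truth assignments to (p, q)

p = tuple(pv for pv, qv in ROWS)  # truth table of p
q = tuple(qv for pv, qv in ROWS)  # truth table of q

def neg(s):
    return tuple(not a for a in s)

def disj(s, t):
    return tuple(a or b for a, b in zip(s, t))

def conj(s, t):
    return tuple(a and b for a, b in zip(s, t))

classes = {p, q}
changed = True
while changed:
    # Add every table reachable by one more application of a connective
    new = set(classes)
    new |= {neg(s) for s in classes}
    new |= {disj(s, t) for s in classes for t in classes}
    new |= {conj(s, t) for s in classes for t in classes}
    changed = new != classes
    classes = new

tautology = (True,) * 4       # the class [T_0], the one element
contradiction = (False,) * 4  # the class [F_0], the zero element
print(len(classes), tautology in classes, contradiction in classes)
```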

In the definition of a Boolean algebra, there are nine conditions. Yet in the lists of
                  properties we examined for set theory, logic, and Boolean functions, we listed 19 properties.
                  And there were even more! Undoubtedly, there is a way to get the remaining properties,
                  and others not listed among the 19, from the ones given in the definition.

THEOREM 15.1      The Idempotent Laws. For all $x \in \mathcal{B}$, a Boolean algebra, (i) $x + x = x$; and (ii) $xx = x$.
Proof: (To the right of each equality appearing in this proof, we list the letter of the condition
from Definition 15.5 that justifies it.)

i)  $x = x + 0$                     c)
      $= x + x\bar{x}$              d)′
      $= (x + x)(x + \bar{x})$      b)′
      $= (x + x) \cdot 1$           d)
      $= x + x$                     c)′

ii) $x = x \cdot 1$                 c)′
      $= x(x + \bar{x})$            d)
      $= xx + x\bar{x}$             b)
      $= xx + 0$                    d)′
      $= xx$                        c)

In proving this theorem we can obtain the proof of part (ii) from that of part (i) by
changing all occurrences of $+$ to $\cdot$, and vice versa, and all occurrences of 0 to 1, and vice
versa. Also, the justifications for the corresponding steps constitute a pair of conditions in
Definition 15.5. As in the past, these pairs are said to be duals of each other; condition (e)
is called self-dual. This now leads us to the following result.

THEOREM 15.2      The Principle of Duality. If s is a theorem about a Boolean algebra, and s can be proved
from the conditions in Definition 15.5 and properties derived from these same conditions,
then its dual $s^d$ is likewise a theorem.
Proof: Let s be such a theorem. Dualizing all the steps and reasons in the proof of s (as in
the proof of Theorem 15.1), we obtain a proof for $s^d$.

We now list some further properties for a Boolean algebra. We shall prove some of these
                  properties and leave the remaining proofs for the reader.

THEOREM 15.3      For every Boolean algebra $\mathcal{B}$, if $x, y, z \in \mathcal{B}$, then
a)  $x \cdot 0 = 0$                          a)′ $x + 1 = 1$                       Dominance Laws
b)  $x(x + y) = x$                           b)′ $x + xy = x$                      Absorption Laws
c)  $[xy = xz \text{ and } \bar{x}y = \bar{x}z] \Rightarrow y = z$                   Cancellation Laws
c)′ $[x + y = x + z \text{ and } \bar{x} + y = \bar{x} + z] \Rightarrow y = z$
d)  $x(yz) = (xy)z$                          d)′ $x + (y + z) = (x + y) + z$       Associative Laws
e)  $[x + y = 1 \text{ and } xy = 0] \Rightarrow y = \bar{x}$                        Uniqueness of Complements (Inverses)
f)  $\bar{\bar{x}} = x$                                                             Law of the Double Complement
g)  $\overline{xy} = \bar{x} + \bar{y}$      g)′ $\overline{x + y} = \bar{x}\bar{y}$   DeMorgan’s Laws
h)  $\bar{0} = 1$                            h)′ $\bar{1} = 0$
i)  $x\bar{y} = 0$ if and only if $xy = x$   i)′ $x + \bar{y} = 1$ if and only if $x + y = x$

Proof:

a) $x \cdot 0 = 0 + x \cdot 0$,              by Definition 15.5(c), (a)
        $= x \cdot \bar{x} + x \cdot 0$,     by Definition 15.5(d)′
        $= x \cdot (\bar{x} + 0)$,           by Definition 15.5(b)
        $= x \cdot \bar{x}$,                 by Definition 15.5(c)
        $= 0$,                               by Definition 15.5(d)′
a)′ This follows from part (a) by the Principle of Duality.
c) Here $y = 1 \cdot y = (x + \bar{x})y = xy + \bar{x}y = xz + \bar{x}z = (x + \bar{x})z = 1 \cdot z = z$. (Verify
   all equalities.)
c)′ This is the dual of part (c).
d) To establish this result, we use result (c)′ and arrive at the conclusion by showing
   that $x + [x(yz)] = x + [(xy)z]$ and $\bar{x} + [x(yz)] = \bar{x} + [(xy)z]$. Using the absorption
   law, we find that $x + [x(yz)] = x$. Likewise $x + [(xy)z] = [x + (xy)](x + z) = x(x + z) = x$.
   Then $\bar{x} + [x(yz)] = (\bar{x} + x)(\bar{x} + yz) = 1 \cdot (\bar{x} + yz) = \bar{x} + yz$,
   whereas $\bar{x} + [(xy)z] = (\bar{x} + xy)(\bar{x} + z) = [(\bar{x} + x)(\bar{x} + y)](\bar{x} + z) =
   [1 \cdot (\bar{x} + y)](\bar{x} + z) = (\bar{x} + y)(\bar{x} + z) = \bar{x} + yz$. (Verify all equalities.)
   The result now follows by the cancellation law in part (c)′.
d)′ Fortunately, this is the dual of part (d).
e) We find here that $\bar{x} = \bar{x} + 0 = \bar{x} + xy = (\bar{x} + x)(\bar{x} + y) = 1 \cdot (\bar{x} + y) =
   (\bar{x} + y) \cdot 1 = (\bar{x} + y)(x + y) = \bar{x}x + y = 0 + y = y$. (Verify all equalities.)
   We note that statement (e) is self-dual. Statement (f) is a corollary of (e) because
   $x$ and $\bar{\bar{x}}$ are both complements (inverses) of $\bar{x}$.
g) This result will follow from part (e) if we can show that $\bar{x} + \bar{y}$ is a complement
   of $xy$.

   $xy + (\bar{x} + \bar{y}) = (xy + \bar{x}) + \bar{y} = (x + \bar{x})(y + \bar{x}) + \bar{y}
      = 1 \cdot (y + \bar{x}) + \bar{y} = (y + \bar{y}) + \bar{x} = 1 + \bar{x} = 1.$

   Also, $xy(\bar{x} + \bar{y}) = (xy\bar{x}) + (xy\bar{y}) = ((x\bar{x})y) + (x(y\bar{y})) = 0 \cdot y + x \cdot 0 = 0 + 0 = 0$.
   Consequently, $\bar{x} + \bar{y}$ is a complement of $xy$, and by uniqueness of complements,
   it follows that $\overline{xy} = \bar{x} + \bar{y}$.
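
Several parts of Theorem 15.3 can be illustrated, though of course not proved in general, on the Boolean algebra of all subsets of {1, 2, 3} from Example 15.23. In the sketch below, union plays the role of +, intersection the role of ·, and the complement is taken in U; the dictionary `checks` and its keys are illustrative names.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})
EMPTY = frozenset()
# All eight subsets of U: the elements of the Boolean algebra P(U)
B = [frozenset(c) for k in range(len(U) + 1) for c in combinations(sorted(U), k)]

def comp(x):
    return U - x

checks = {
    "dominance": all((x & EMPTY) == EMPTY and (x | U) == U for x in B),
    "absorption": all((x & (x | y)) == x and (x | (x & y)) == x
                      for x, y in product(B, repeat=2)),
    # Part (c): whenever xy = xz and x'y = x'z hold, y and z must coincide
    "cancellation": all(y == z for x, y, z in product(B, repeat=3)
                        if (x & y) == (x & z) and (comp(x) & y) == (comp(x) & z)),
    "double complement": all(comp(comp(x)) == x for x in B),
    "DeMorgan": all(comp(x & y) == (comp(x) | comp(y)) and
                    comp(x | y) == (comp(x) & comp(y))
                    for x, y in product(B, repeat=2)),
}
print(checks)
```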

Enough proving for a while! Now we are going to investigate how to impose an order on
                       the elements of a Boolean algebra. In fact, we shall want a partial order, and for this reason
                        we turn now to the Hasse diagram.

Let us start by considering the Hasse diagrams for the following two Boolean algebras.

a) $(\mathcal{P}(\mathcal{U}), \cup, \cap, \bar{\ }, \emptyset, \mathcal{U})$, where $\mathcal{U} = \{1, 2, 3\}$, and the partial order is induced by the
   subset relation.
b) $(\mathcal{F}, +, \cdot, \bar{\ }, 1, 30)$, where $\mathcal{F} = \{1, 2, 3, 5, 6, 10, 15, 30\}$, $x + y = \operatorname{lcm}(x, y)$, $xy = \gcd(x, y)$,
   and $\bar{x} = 30/x$. Hence the zero element is the divisor 1 and the one element
   is the divisor 30. The relation $\mathcal{R}$ on $\mathcal{F}$, defined by $x \mathrel{\mathcal{R}} y$ if x divides y, makes $\mathcal{F}$ into
   a poset.

[Figure 15.14: Hasse diagrams for (a) the subsets of {1, 2, 3} ordered by inclusion and (b) the divisors of 30 ordered by divisibility]

Figure 15.14 shows the Hasse diagrams for these two Boolean algebras. Ignoring the
labels at the vertices in each diagram, we see that the underlying structures are the same.
This suggests how we should define the concept of isomorphism for Boolean algebras.

These examples also suggest two other ideas.

1) Can we partially order any finite Boolean algebra?
2) Looking at Fig. 15.14(a), we see that the nonzero elements just above $\emptyset$ are such
   that every element other than $\emptyset$ can be obtained as a Boolean sum of these three. For
   example, $\{1, 3\} = \{1\} \cup \{3\}$ and $\{1, 2, 3\} = \{1\} \cup \{2\} \cup \{3\}$. For part (b), the numbers
   2, 3, and 5 are such that every divisor other than 1 is realized as the Boolean sum of
   these three. For example, $6 = \operatorname{lcm}(2, 3)$ and $30 = \operatorname{lcm}(2, \operatorname{lcm}(3, 5)) = \operatorname{lcm}(2, 3, 5)$.

We now start to deal formally with these suggestions.
   When dealing with sets in Chapter 3 we related the operations of $\cup$, $\cap$, and $\bar{\ }$ to the subset
relation by the equivalence of the statements: (a) $A \subseteq B$; (b) $A \cap B = A$; (c) $A \cup B = B$;
and (d) $\bar{B} \subseteq \bar{A}$, where $A, B \subseteq \mathcal{U}$. We now use parts (a) and (b) in an attempt to partially
order any Boolean algebra $\mathcal{B}$.

Definition 15.6         If $x, y \in \mathcal{B}$, define $x \le y$ if $xy = x$.

Hence we define a new concept, namely “$\le$”, in terms of notions we have in $\mathcal{B}$,
namely $\cdot$ and the notion of equality. We can make up definitions! But does this one lead us
anywhere?

THEOREM 15.4      The relation “$\le$”, just defined, is a partial order.
Proof: Since $xx = x$ for all $x \in \mathcal{B}$, we have $x \le x$ and the relation is reflexive. To establish
antisymmetry, suppose that $x, y \in \mathcal{B}$ with $x \le y$ and $y \le x$. Then $xy = x$ and $yx = y$. By
the commutative property, $xy = yx$, so $x = y$. Finally, if $x \le y$ and $y \le z$, then $xy = x$ and
$yz = y$, so $x = xy = x(yz) = (xy)z = xz$, and with $x = xz$, we have $x \le z$, so the relation
is transitive.

Now we can partially order any Boolean algebra, and we note that for all x in a Boolean
algebra, $0 \le x$ and $x \le 1$. (Why?) Before going on, however, let us consider the Boolean
algebra consisting of the divisors of 30. How do we apply Theorem 15.4 in this example?
Here the partial order is given by $x \le y$ if $xy = x$. Since $xy$ is $\gcd(x, y)$, if $\gcd(x, y) = x$,
then x divides y. But this was precisely the partial order we had on this Boolean algebra
when we started.
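
For the divisors of 30, the agreement between the order of Definition 15.6 and divisibility can be confirmed directly. In this sketch `leq` is an illustrative name for the relation $x \le y$, implemented as gcd(x, y) = x; the remaining lines test the three properties from Theorem 15.4 and the bounds $0 \le x \le 1$, where the zero element is 1 and the one element is 30.

```python
from math import gcd

B = [1, 2, 3, 5, 6, 10, 15, 30]

def leq(x, y):
    # Definition 15.6: x <= y exactly when xy = x, and here xy = gcd(x, y)
    return gcd(x, y) == x

same_as_divides = all(leq(x, y) == (y % x == 0) for x in B for y in B)
reflexive = all(leq(x, x) for x in B)
antisymmetric = all(x == y for x in B for y in B if leq(x, y) and leq(y, x))
transitive = all(leq(x, z) for x in B for y in B for z in B
                 if leq(x, y) and leq(y, z))
bounded = all(leq(1, x) and leq(x, 30) for x in B)
print(same_as_divides, reflexive, antisymmetric, transitive, bounded)
```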

Armed with this concept of partial order, we return to the observations we made earlier
                              about the elements in the Hasse diagrams of Fig. 15.14.

Definition 15.7         Let 0 denote the zero element of a Boolean algebra $\mathcal{B}$. An element $x \in \mathcal{B}$, $x \ne 0$, is called
an atom of $\mathcal{B}$ if for all $y \in \mathcal{B}$, $y \le x \Rightarrow y = 0$ or $y = x$.

EXAMPLE 15.27
a) For the Boolean algebra of all subsets of $\mathcal{U} = \{1, 2, 3\}$, the atoms are {1}, {2}, and {3}.
b) When we are dealing with the positive integer divisors of 30, the atoms of this Boolean
   algebra are 2, 3, and 5.
c) The atoms in the Boolean algebra $F_n = \{f\colon B^n \to B\}$ are the minterms (or fundamental
   conjunctions).
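
Part (b) of this example follows mechanically from Definition 15.7. The sketch below lists the atoms of the Boolean algebra of divisors of a square-free m, where the zero element is 1 and $y \le x$ means gcd(y, x) = y; the function name `atoms` is an illustrative choice.

```python
from math import gcd

def atoms(m):
    """Atoms of the divisor algebra of m (m assumed square-free, zero element 1)."""
    B = [d for d in range(1, m + 1) if m % d == 0]
    zero = 1

    def leq(y, x):
        return gcd(y, x) == y

    # x is an atom when x != 0 and the only elements below x are 0 and x itself
    return [x for x in B
            if x != zero and all(y in (zero, x) for y in B if leq(y, x))]

print(atoms(30))   # [2, 3, 5]
print(atoms(210))  # [2, 3, 5, 7]
```

In each case the atoms are the prime divisors of m.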

The atoms of a finite Boolean algebra satisfy the following properties.

THEOREM 15.5      a) If x is an atom in a Boolean algebra $\mathcal{B}$, then for all $y \in \mathcal{B}$, $xy = 0$ or $xy = x$.
                  b) If $x_1, x_2$ are atoms of $\mathcal{B}$ and $x_1 \ne x_2$, then $x_1 x_2 = 0$.
Proof:

a) For all $x, y \in \mathcal{B}$, $xy \le x$, because $(xy)x = x(yx) = x(xy) = (xx)y = xy$. For x an
   atom, $xy \le x \Rightarrow xy = 0$ or $xy = x$.
b) This follows from part (a). The reader should supply the details.

THEOREM 15.6                  If $x_1, x_2, \ldots, x_n$ are all the atoms in a finite Boolean algebra $\mathcal{B}$ and $x \in \mathcal{B}$ with $x x_i = 0$
                              for all $1 \le i \le n$, then $x = 0$.
                              Proof: If $x \ne 0$, let $S = \{y \in \mathcal{B} \mid 0 < y \le x\}$. ($0 < y$ denotes $0 \le y$ and $0 \ne y$.) With $x \in S$,
                              we have $S \ne \emptyset$. Since $S$ is finite, we can find an element $z$ in $\mathcal{B}$ where $0 < z \le x$ and no
                              element of $\mathcal{B}$ is between 0 and $z$. Then $z$ is an atom and $0 = xz = z > 0$. This possibility
                              has led us to a contradiction, so we cannot have $x \ne 0$; that is, $x = 0$.

This leads us to the following result on representation.

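THEOREM 15.7                  Given a finite Boolean algebra $\mathcal{B}$ with atoms $x_1, x_2, \ldots, x_n$, each $x \in \mathcal{B}$, $x \ne 0$, can be
                              written as a sum of atoms uniquely, up to order.
                              Proof: Since $x \ne 0$, by Theorem 15.6, $S = \{x_i \mid x x_i \ne 0\} \ne \emptyset$. Let $S = \{x_{i_1}, x_{i_2}, \ldots, x_{i_k}\}$,
                              and $y = x_{i_1} + x_{i_2} + \cdots + x_{i_k}$. Then
                              $xy = x(x_{i_1} + x_{i_2} + \cdots + x_{i_k}) = x x_{i_1} + x x_{i_2} + \cdots + x x_{i_k} = x_{i_1} + x_{i_2} + \cdots + x_{i_k}$,
                              by Theorem 15.5(a). So $xy = y$.
                                  Now consider $(x\overline{y})x_i$ for each $1 \le i \le n$. If $x_i \notin S$, then $x x_i = 0$, and $(x\overline{y})x_i = 0$. For
                              $x_i \in S$, we have
                              $(x\overline{y})x_i = x x_i \overline{(x_{i_1} + x_{i_2} + \cdots + x_{i_k})} = x x_i (\overline{x}_{i_1} \overline{x}_{i_2} \cdots \overline{x}_{i_k}) = x (x_i \overline{x}_i) z$,
                              where $z$ is the product of the complements of all elements in $S - \{x_i\}$. As $x_i \overline{x}_i = 0$, it
                              follows that $(x\overline{y})x_i = 0$. So $(x\overline{y})x_i = 0$ for all $x_i$, where $1 \le i \le n$. By Theorem 15.6,
                              we have $x\overline{y} = 0$.
                                 With $xy = y$ and $x\overline{y} = 0$, it follows that
                              $x = x \cdot 1 = x(y + \overline{y}) = xy + x\overline{y} = xy + 0 = y = x_{i_1} + x_{i_2} + \cdots + x_{i_k}$, a sum of atoms.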
                                                                15.4 The Structure of a Boolean Algebra (Optional)        739

To show that this representation for $x$ is unique, up to order, suppose
                      $x = x_{j_1} + x_{j_2} + \cdots + x_{j_\ell}$.
                          If $x_{j_1}$ does not appear as a summand in $x_{i_1} + x_{i_2} + \cdots + x_{i_k}$, then
                      $x_{j_1} = x_{j_1} x_{j_1} = x_{j_1}(x_{j_1} + x_{j_2} + \cdots + x_{j_\ell})$ [by Theorem 15.5(b)]
                      $= x_{j_1} x = x_{j_1}(x_{i_1} + x_{i_2} + \cdots + x_{i_k}) = 0$ [again, by Theorem 15.5(b)].
                      Hence $x_{j_1}$ must appear as a summand in $x_{i_1} + x_{i_2} + \cdots + x_{i_k}$,
                      as must $x_{j_2}, \ldots, x_{j_\ell}$. So $\ell \le k$. By the same reasoning, we get $k \le \ell$ and find the
                      representations identical, except for order.

From this result we see that if $\mathcal{B}$ is a finite Boolean algebra with atoms $x_1, x_2, \ldots, x_n$,
                      then each $x \in \mathcal{B}$ can be uniquely written as $\sum_{i=1}^{n} c_i x_i$, with each $c_i \in \{0, 1\}$ (and because $\mathcal{B}$
                      is closed under +, each such linear combination of atoms is in $\mathcal{B}$). If $c_i = 0$, this indicates
                      that $x_i$ is not in the representation of $x$; $c_i = 1$ indicates that it is. Consequently, each $x \in \mathcal{B}$
                      is associated with a unique $n$-tuple $(c_1, c_2, \ldots, c_n)$, and there are $2^n$ such $n$-tuples.
                      Therefore we have proved the following result.

THEOREM 15.8          If $\mathcal{B}$ is a finite Boolean algebra with $n$ atoms, then $|\mathcal{B}| = 2^n$.
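
Theorems 15.7 and 15.8 can be illustrated on the divisors of 30, where the Boolean sum is the lcm and the empty sum is the zero element 1. This is only a sketch under those assumptions; `atom_sums` is a hypothetical helper name, not something defined in the text.

```python
from itertools import combinations
from math import gcd

def lcm(a, b):
    """Boolean sum in the divisor algebra."""
    return a * b // gcd(a, b)

def atom_sums(atoms):
    """Map each subset of atoms to its Boolean sum; the empty sum is the zero element 1."""
    table = {}
    for k in range(len(atoms) + 1):
        for subset in combinations(atoms, k):
            total = 1
            for a in subset:
                total = lcm(total, a)
            table[subset] = total
    return table

atoms_30 = [2, 3, 5]
table = atom_sums(atoms_30)
divs = [d for d in range(1, 31) if 30 % d == 0]

# Each of the 2^3 subsets of atoms yields a different element, and every element occurs.
print(sorted(table.values()) == divs)   # True
print(len(table))                       # 8
```

Each key of `table` plays the role of the $n$-tuple $(c_1, \ldots, c_n)$: an atom is listed in the key exactly when its coefficient is 1.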

There is one final question to resolve. If $n \in \mathbf{Z}^+$, how many different Boolean algebras
                      of size $2^n$ are there? Looking at the Hasse diagrams in Fig. 15.14, we see two different
                      pictures. But if we ignore the labels on the vertices, the underlying structures emerge as
                      exactly the same. Hence these two Boolean algebras are said to be abstractly identical or
                      isomorphic.

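Definition 15.8    Suppose $(\mathcal{B}_1, +, \cdot, \overline{\phantom{x}}, 0, 1)$ and $(\mathcal{B}_2, +, \cdot, \overline{\phantom{x}}, 0, 1)$ are Boolean algebras. Then $\mathcal{B}_1$, $\mathcal{B}_2$ are
                      called isomorphic if there is a function $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ such that $f$ is one-to-one and onto,
                      and for all $x_1, y_1 \in \mathcal{B}_1$,

a) $f(x_1 + y_1) = f(x_1) + f(y_1)$ [the sum on the left is taken in $\mathcal{B}_1$, the sum on the right in $\mathcal{B}_2$]

b) $f(x_1 \cdot y_1) = f(x_1) \cdot f(y_1)$ [the product on the left is taken in $\mathcal{B}_1$, the product on the right in $\mathcal{B}_2$]

c) $f(\overline{x_1}) = \overline{f(x_1)}$ [In $f(\overline{x_1})$ we take the complement in $\mathcal{B}_1$, while for $\overline{f(x_1)}$ the
                           complement is taken in $\mathcal{B}_2$.]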

Such a function f preserves the operations of the algebraic structures.

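EXAMPLE 15.28   For the two Boolean algebras in Fig. 15.14, define $f$ by

$f\colon \emptyset \to 1$      $f\colon \{2\} \to 3$      $f\colon \{1, 2\} \to 6$       $f\colon \{2, 3\} \to 15$
$f\colon \{1\} \to 2$      $f\colon \{3\} \to 5$      $f\colon \{1, 3\} \to 10$      $f\colon \{1, 2, 3\} \to 30$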

Note the following:
                        a) The zero elements correspond under f, as do the unity elements.
                        b) $f(\{1\} \cup \{2\}) = f(\{1, 2\}) = 6 = \mathrm{lcm}(2, 3) = \mathrm{lcm}(f(\{1\}), f(\{2\}))$

                        c) $f(\{1, 2\} \cap \{2, 3\}) = f(\{2\}) = 3 = \gcd(6, 15) = \gcd(f(\{1, 2\}), f(\{2, 3\}))$
                        d) $f(\overline{\{2\}}) = f(\{1, 3\}) = 10 = 30/3 = \overline{3} = \overline{f(\{2\})}$
                             e) The image of each atom ({1}, {2}, {3}) is an atom (2, 3, 5, respectively).

This function is an isomorphism. Once we establish a correspondence between the re-
                           spective zero elements and between the respective atoms, the remaining correspondences
                           are determined from these by Theorem 15.7 and the preservation of the operations under f.
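
The observations in Example 15.28 can be checked for every pair of elements rather than for the few instances listed. The sketch below hard-codes the correspondence of the example as a dictionary; the helper names `preserves_operations` and `is_bijection` are illustrative assumptions, not notation from the text.

```python
from itertools import product
from math import gcd

U = frozenset({1, 2, 3})

# The correspondence of Example 15.28: subsets of {1, 2, 3} -> divisors of 30.
f = {
    frozenset(): 1,
    frozenset({1}): 2, frozenset({2}): 3, frozenset({3}): 5,
    frozenset({1, 2}): 6, frozenset({1, 3}): 10, frozenset({2, 3}): 15,
    U: 30,
}

def lcm(a, b):
    return a * b // gcd(a, b)

def preserves_operations(f, universe, n):
    """Check f(S u T) = lcm, f(S n T) = gcd, and f(complement of S) = n / f(S)."""
    for s, t in product(f, repeat=2):
        if f[s | t] != lcm(f[s], f[t]):
            return False
        if f[s & t] != gcd(f[s], f[t]):
            return False
    return all(f[universe - s] == n // f[s] for s in f)

def is_bijection(f, n):
    """f is one-to-one and onto the divisors of n."""
    return sorted(f.values()) == [d for d in range(1, n + 1) if n % d == 0]

print(preserves_operations(f, U, 30), is_bijection(f, 30))   # True True
```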

From this example we have our final result.

THEOREM 15.9               Every finite Boolean algebra $\mathcal{B}$ is isomorphic to a Boolean algebra of sets.
                           Proof: Since $\mathcal{B}$ is finite, $\mathcal{B}$ has $n$ atoms $x_i$, $1 \le i \le n$, and $|\mathcal{B}| = 2^n$. Let $U = \{1, 2, \ldots, n\}$
                           and $\mathcal{P}(U)$ be the Boolean algebra of subsets of $U$.
                               We define $f\colon \mathcal{B} \to \mathcal{P}(U)$ as follows. For each $x \in \mathcal{B}$, it follows from Theorem 15.7 that
                           we can write $x = \sum_{i=1}^{n} c_i x_i$, where each $c_i$ is 0 or 1. [Here $c_i \in \{0, 1\}$ $(= B)$ and for each
                           atom $a$ in $\mathcal{B}$, $c_i a = 0$ (the zero element in $\mathcal{B}$) if $c_i = 0$, while $c_i a = a$ when $c_i = 1$.] Then
                           we define

$$f(x) = \{i \mid 1 \le i \le n \text{ and } c_i = 1\}.$$

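[For example, (1) $f(0) = \emptyset$; (2) $f(x_i) = \{i\}$ for each atom $x_i$, where $1 \le i \le n$;
                           (3) $f(x_1 + x_2) = \{1, 2\}$; and (4) $f(x_2 + x_4 + x_7) = \{2, 4, 7\}$.] Now consider $x, y \in \mathcal{B}$,
                           with $x = \sum_{i=1}^{n} c_i x_i$ and $y = \sum_{i=1}^{n} d_i x_i$, where $c_i, d_i \in \{0, 1\}$ for all $1 \le i \le n$.
                              First we find that $x + y = \sum_{i=1}^{n} s_i x_i$, where $s_i = c_i + d_i$ for each $1 \le i \le n$. (Remember
                           that here $1 + 1 = 1$.) Consequently,

$$\begin{aligned}
f(x + y) &= \{i \mid 1 \le i \le n \text{ and } s_i = 1\} \\
         &= \{i \mid 1 \le i \le n \text{ and } (c_i = 1 \text{ or } d_i = 1)\} \\
         &= \{i \mid 1 \le i \le n \text{ and } c_i = 1\} \cup \{i \mid 1 \le i \le n \text{ and } d_i = 1\} \\
         &= f(x) \cup f(y).
\end{aligned}$$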
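                           Theorem 15.5(b) tells us that

$$x \cdot y = \sum_{i=1}^{n} t_i x_i,$$

                           where $t_i = c_i d_i$ for all $1 \le i \le n$, and so, in a similar way, we get

$$\begin{aligned}
f(x \cdot y) &= \{i \mid 1 \le i \le n \text{ and } t_i = 1\} \\
             &= \{i \mid 1 \le i \le n \text{ and } (c_i = 1 \text{ and } d_i = 1)\} \\
             &= \{i \mid 1 \le i \le n \text{ and } c_i = 1\} \cap \{i \mid 1 \le i \le n \text{ and } d_i = 1\} \\
             &= f(x) \cap f(y).
\end{aligned}$$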
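                              To complete the proof that $f$ is an isomorphism, we should first observe that if
                           $x = \sum_{i=1}^{n} c_i x_i$, then $\overline{x} = \sum_{i=1}^{n} \overline{c_i} x_i$. This follows from Theorems 15.3(e) and 15.5(b) because

$$\sum_{i=1}^{n} c_i x_i + \sum_{i=1}^{n} \overline{c_i} x_i = \sum_{i=1}^{n} (c_i + \overline{c_i}) x_i = \sum_{i=1}^{n} x_i = 1$$

(Why is this true? See Exercise 15 for this section.) and

$$\left(\sum_{i=1}^{n} c_i x_i\right)\left(\sum_{i=1}^{n} \overline{c_i} x_i\right) = \sum_{i=1}^{n} c_i \overline{c_i} x_i = \sum_{i=1}^{n} 0\, x_i = 0.$$

Now we find that

$$\begin{aligned}
f(\overline{x}) &= \{i \mid 1 \le i \le n \text{ and } \overline{c_i} = 1\} \\
                &= \{i \mid 1 \le i \le n \text{ and } c_i = 0\} \\
                &= \overline{\{i \mid 1 \le i \le n \text{ and } c_i = 1\}} \\
                &= \overline{f(x)},
\end{aligned}$$

                                 so the function $f$ preserves the operations in the Boolean algebras $\mathcal{B}$ and $\mathcal{P}(U)$.
                                    We leave to the reader the details showing that $f$ is one-to-one and onto.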
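
The construction in this proof can be tried out on a concrete algebra. The following sketch assumes the divisors of 210 with atoms 2, 3, 5, 7 (indexed 1 through 4) and takes $c_i = 1$ exactly when $x x_i = x_i$, that is, when the atom divides $x$; the name `atom_index_set` is hypothetical.

```python
from math import gcd

def atom_index_set(x, atoms):
    """f(x) = {i | c_i = 1}, where c_i = 1 exactly when x * x_i = gcd(x, x_i) = x_i."""
    return frozenset(i for i, a in enumerate(atoms, start=1) if gcd(x, a) == a)

n = 210
atoms = [2, 3, 5, 7]                      # x_1, x_2, x_3, x_4
U = frozenset(range(1, len(atoms) + 1))
divs = [d for d in range(1, n + 1) if n % d == 0]
f = {x: atom_index_set(x, atoms) for x in divs}

# f(x + y) = f(x) u f(y), with x + y = lcm(x, y)
ok_sum = all(f[x * y // gcd(x, y)] == f[x] | f[y] for x in divs for y in divs)
# f(x . y) = f(x) n f(y), with x . y = gcd(x, y)
ok_prod = all(f[gcd(x, y)] == f[x] & f[y] for x in divs for y in divs)
# f(complement of x) = complement of f(x), with complement n / x
ok_comp = all(f[n // x] == U - f[x] for x in divs)
# one-to-one and onto the 2^4 subsets of U
ok_bij = len(set(f.values())) == len(divs) == 2 ** len(atoms)

print(sorted(f[42]))                      # [1, 2, 4]
print(ok_sum, ok_prod, ok_comp, ok_bij)   # True True True True
```

A finite check of this kind confirms the statement for one algebra only; it does not replace the proof.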

EXERCISES 15.4

1. Verify the second distributive law and the identity and inverse laws for Example 15.25.

2. Complete the proof of Theorem 15.3.

3. Let $\mathcal{B}$ be the set of positive integer divisors of 210, and define $+$, $\cdot$, and $\overline{\phantom{x}}$ for $\mathcal{B}$ by $x + y = \mathrm{lcm}(x, y)$, $x \cdot y = xy = \gcd(x, y)$, and $\overline{x} = 210/x$. Determine each of the following:
    a) $30 + 5 \cdot 7$                  b) $(30 + 5) \cdot (30 + 7)$
    c) $(14 + 15)$                     d) $21(2 + 10)$
    e) $(2 + 3) + 5$                   f) $(6 + 35)(7 + 10)$

4. For a Boolean algebra $\mathcal{B}$ the relation "$\le$" on $\mathcal{B}$, defined by $x \le y$ if $xy = x$, was shown to be a partial order. Prove that: (a) if $x \le y$ then $x + y = y$; and (b) if $x \le y$ then $\overline{y} \le \overline{x}$.

5. Let $(\mathcal{B}, +, \cdot, \overline{\phantom{x}}, 0, 1)$ be a Boolean algebra that is partially ordered by $\le$.
    a) If $w \in \mathcal{B}$ and $w \le 0$, prove that $w = 0$.
    b) If $x \in \mathcal{B}$ and $1 \le x$, prove that $x = 1$.
    c) If $y, z \in \mathcal{B}$ with $y \le z$ and $y \le \overline{z}$, prove that $y = 0$.

6. Let $(\mathcal{B}, +, \cdot, \overline{\phantom{x}}, 0, 1)$ be a Boolean algebra that is partially ordered by $\le$. If $w, x, y, z \in \mathcal{B}$ with $w \le x$ and $y \le z$, prove that (a) $wy \le xy$; and (b) $w + y \le x + z$.

7. If $\mathcal{B}$ is a Boolean algebra, partially ordered by $\le$, and $x, y \in \mathcal{B}$, what is the dual of the statement "$x \le y$"?

8. Let $F_n = \{f\colon B^n \to B\}$ be the Boolean algebra of all Boolean functions on $n$ Boolean variables. How many atoms does $F_n$ have?

9. Verify Theorem 15.5(b).

10. If $\mathcal{B}$ is a Boolean algebra, prove that the zero element and the one element of $\mathcal{B}$ are unique.

11. Let $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ be an isomorphism of Boolean algebras. Prove each of the following:
    a) $f(0) = 0$.
    b) $f(1) = 1$.
    c) If $x, y \in \mathcal{B}_1$ with $x \le y$, then in $\mathcal{B}_2$, $f(x) \le f(y)$.
    d) If $x$ is an atom of $\mathcal{B}_1$, then $f(x)$ is an atom in $\mathcal{B}_2$.

12. Let $\mathcal{B}_1$ be the Boolean algebra of all positive integer divisors of 2310, with $\mathcal{B}_2$ the Boolean algebra of all subsets of $\{a, b, c, d, e\}$.
    a) Define $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ so that $f(2) = \{a\}$, $f(3) = \{b\}$, $f(5) = \{c\}$, $f(7) = \{d\}$, $f(11) = \{e\}$. For $f$ to be an isomorphism, what must the images of 35, 110, 210, and 330 be?
    b) How many different isomorphisms can one define between $\mathcal{B}_1$ and $\mathcal{B}_2$?

13. a) If $\mathcal{B}_1$, $\mathcal{B}_2$ are Boolean algebras and $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ is one-to-one, onto, and such that $f(x + y) = f(x) + f(y)$ and $f(\overline{x}) = \overline{f(x)}$, for all $x, y \in \mathcal{B}_1$, prove that $f$ is an isomorphism.
    b) State and prove another result comparable to that in part (a). (What principle is used here?)

14. Prove that the function $f$ in Theorem 15.9 is one-to-one and onto.

15. Let $\mathcal{B}$ be a finite Boolean algebra with the $n$ atoms $x_1, x_2, \ldots, x_n$. (So $|\mathcal{B}| = 2^n$.) Prove that
    $1 = x_1 + x_2 + \cdots + x_n$.

15.5
      Summary and Historical Review
                            The modern concept of abstract algebra was developed by George Boole in his study
                            of general abstract systems, as opposed to particular examples of such systems. In his
                            1854 publication An Investigation of the Laws of Thought, he formulated the mathematical
                            structure now called a Boolean algebra. Although abstract in nature during the nineteenth
                            century, the study of Boolean algebra was investigated in the twentieth century for its
                            applicative value.
                                Starting in 1938, Claude Elwood Shannon (1916-2001) made the first major contribution
                            in applied Boolean algebra in [8]. He devised the algebra of switching circuits and showed
                            its relation to the algebra of logic. Additional developments that were made in this area
                            during the 1940s and 1950s are noted in the paper by C. E. Shannon [9] and in the report of
                            the Harvard University Computation Laboratory [10]. (The computer term bit was coined
                            by Claude E. Shannon, who was also one of the first to represent information in terms of
                            bits.)

Claude Elwood Shannon (1916-2001)

We found that switching functions can be represented by their disjunctive and conjunctive
                            normal forms. These forms allowed us to write such functions in a compact way using binary
                            labels. The minimization process showed us how to represent a given Boolean function as
                            a minimal sum of products or as a minimal product of sums. Based on the map method by
                            E. W. Veitch [11], Maurice Karnaugh’s modification [4] was developed here as a pictorial
                            method for the simplification of Boolean functions. Another technique that we mentioned
                            in the text is the tabulation algorithm known as the Quine-McCluskey method. Originally
                            developed by Willard Van Orman Quine (1908-2000) [6, 7], this technique was modified
                            by Edward J. McCluskey, Jr. (1929- ) [5]. It is very useful for functions with more than
                            six variables and lends itself to computer implementation. The interested reader can find
                            more about Karnaugh maps in Chapter 6 of F. J. Hill and G. R. Peterson [3]. Chapter 7
                            of [3] provides an excellent coverage of the Quine-McCluskey method. J. F. Wakerly [12]
                            examines digital circuits in the light of contemporary technology, whereas T. L. Booth [1]
                            investigates some specific applications of logic design in the study of computers. A more
advanced coverage of the applications given in this chapter (along with many other related
                              concepts) is given in the text by K. G. Gopolan [2].
                                  Although the major part of this chapter was applied in nature, Section 15.4 found us
                              investigating the structure of a Boolean algebra. Unlike commutative rings with unity,
                              which come in all possible sizes, we found that a finite Boolean algebra can contain only $2^n$
                              elements, where $n \in \mathbf{Z}^+$. Uniqueness of representation appeared as we found the atoms
                              of a Boolean algebra used to build the rest of the algebra (except for the zero element).
                              The Boolean algebra of sets that we studied in Chapter 3 was found to represent all finite
                              Boolean algebras in the sense that a finite Boolean algebra with n atoms is isomorphic to
                              the Boolean algebra of all subsets of {1, 2, 3,..., n}.

REFERENCES
                                   1. Booth, Taylor L. Digital Networks and Computer Systems, 2nd ed. New York: Wiley, 1978.
                                   2. Gopolan, K. Gopal. Introduction to Digital Microelectronic Circuits. Chicago: Irwin, 1996.
                                   3. Hill, Frederick J., and Peterson, Gerald R. Introduction to Switching Theory and Logical
                                      Design, 3rd ed. New York: Wiley, 1981.
                                   4. Karnaugh, Maurice. “The Map Method for Synthesis of Combinational Logic Circuits.”
                                      Transactions of the AIEE, part I, vol. 72, no. 9 (1953): pp. 593-599.
                                   5. McCluskey, Edward J., Jr. “Minimization of Boolean Functions.” Bell System Technical
                                      Journal 35, no. 6 (November 1956): pp. 1417-1444.
                                   6. Quine, Willard V. “The Problem of Simplifying Truth Functions.” American Mathematical
                                      Monthly 59, no. 8 (October 1952): pp. 521-531.
                                   7. Quine, Willard V. “A Way to Simplify Truth Functions.” American Mathematical Monthly 62,
                                      no. 9 (November 1955): pp. 627-631.
                                   8. Shannon, Claude E. “A Symbolic Analysis of Relay and Switching Circuits.” Transactions of
                                      the AIEE, vol. 57 (1938): pp. 713-723.
                                   9. Shannon, Claude E. “The Synthesis of Two-terminal Switching Circuits.” Bell System
                                      Technical Journal, vol. 28 (1949): pp. 59-98.
                                  10. Staff of the Computation Laboratory. Synthesis of Electronic Computing and Control Circuits,
                                      Annals 27. Cambridge, Mass.: Harvard University Press, 1951.
                                  11. Veitch, E. W. “A Chart Method for Simplifying Truth Functions.” Proceedings of the ACM.
                                      Pittsburgh, Penn. (May 1952): pp. 127-133.
                                  12. Wakerly, John F. Digital Design: Principles and Practices, 2nd ed. Englewood Cliffs, N.J.:
                                      Prentice-Hall, 1994.

SUPPLEMENTARY EXERCISES

1. Let $n \ge 2$. If $x_i$ is a Boolean variable for all $1 \le i \le n$, prove that
    a) $\overline{(x_1 + x_2 + \cdots + x_n)} = \overline{x}_1 \overline{x}_2 \cdots \overline{x}_n$
    b) $\overline{(x_1 x_2 \cdots x_n)} = \overline{x}_1 + \overline{x}_2 + \cdots + \overline{x}_n$

2. Let $f, g\colon B^5 \to B$ be Boolean functions, where $f = \sum m(1, 2, 4, 7, x)$ and $g = \sum m(0, 1, 2, 3, y, z, 16, 25)$. If $f \le g$, what are $x$, $y$, $z$?

3. Eileen is having a party and finds herself confronted with decisions about inviting five of her friends.
    a) If she invites Margaret, she must also invite Joan.
    b) If Kathleen is invited, Nettie and Margaret must also be invited.
    c) She can invite Cathy or Joan, but definitely not both of them.
    d) Neither Cathy nor Nettie will show up if the other is not invited.
    e) Either Kathleen or Nettie or both must be invited.
   Determine which subsets of these five friends Eileen can invite to her party and still satisfy conditions (a) through (e).

4. Let $f, g\colon B^4 \to B$, where $f = \sum m(2, 4, 6, 8)$, and $g = \sum m(1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 15)$. Find a function $h\colon B^4 \to B$ such that $f = gh$.

5. Let $\mathcal{B}$ be a Boolean algebra that is partially ordered by $\le$. If $x, y, z \in \mathcal{B}$, prove that $x + y \le z$ if and only if $x \le z$ and $y \le z$.

6. State and prove the dual of the result in Exercise 5.

7. Let $\mathcal{B}$ be a Boolean algebra that is partially ordered by $\le$. For all $x, y \in \mathcal{B}$ prove that
    a) $x \le y$ if and only if $\overline{x} + y = 1$; and
    b) $x \le y$ if and only if $x\overline{y} = 0$.

8. Let $x, y$ be elements in the Boolean algebra $\mathcal{B}$. Prove that $x = y$ if and only if $x\overline{y} + \overline{x}y = 0$.

9. Use a Karnaugh map to find a minimal-sum-of-products representation for
    a) $f(w, x, y, z) = \sum m(0, 1, 2, 3, 6, 7, 14, 15)$
    b) $g(v, w, x, y, z) = \prod M(1, 2, 4, 6, 9, 10, 11, 14, 17, 18, 19, 20, 22, 25, 26, 27, 30)$

10. The four input lines for the network in Fig. 15.15 provide the binary equivalents of the numbers 0, 1, 2, ..., 15, where each number is represented as $abce$, with $e$ the least significant bit.

    [Figure 15.15: the inputs $a$, $b$, $c$, $e$ enter a "power of two detector" block whose output is $g$.]

    a) Find the d.n.f. of $g$, whose value is 1 exactly when $abce$ is the binary equivalent of 1, 2, 4, or 8.
    b) Draw the two-level gating network for $g$ as a minimal sum of products.
    c) If this network is part of a larger network and, consequently, the binary equivalents of the numbers 10 through 15 never occur as inputs, design a two-level gating network for $g$ in this case.

11. For $n$ Boolean variables there are $2^{2^n}$ Boolean functions, each of which can be represented by a function table.
    a) A Boolean function $f$ on the $n$ variables $x_1, x_2, \ldots, x_n$ is called self-dual if
       $f(x_1, x_2, \ldots, x_n) = \overline{f(\overline{x}_1, \overline{x}_2, \ldots, \overline{x}_n)}$.
       How many Boolean functions on $n$ variables are self-dual?
    b) Let $f\colon B^3 \to B$. Then $f$ is called symmetric if
       $f(x, y, z) = f(x, z, y) = f(y, x, z) = f(y, z, x) = f(z, x, y) = f(z, y, x)$.
       So the value of $f$ is unchanged when we rearrange the three columns of values listed under $x$, $y$, and $z$ in the table for $f$. How many such functions are there on three Boolean variables? How many are there on $n$ Boolean variables?

12. Let $\mathcal{B}_1$ be the Boolean algebra of all positive integer divisors of 30030, and let $\mathcal{B}_2$ be the Boolean algebra of all subsets of $\{u, v, w, x, y, z\}$. How many isomorphisms $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ satisfy $f(2) = \{u\}$ and $f(6) = \{u, v\}$?

13. For (a) $n = 60$, and (b) $n = 120$, explain why the positive integer divisors of $n$ do not yield a Boolean algebra. (Here $x + y = \mathrm{lcm}(x, y)$, $xy = \gcd(x, y)$, $\overline{x} = n/x$, 1 is the zero element, and $n$ is the one element.)

14. Let $a, b, c \in \mathcal{B}$, a Boolean algebra. Prove that $ab + c = a(b + c)$ if and only if $c \le a$.
16
Groups, Coding Theory, and Polya’s Method of Enumeration

In our study of algebraic structures we examine properties shared by particular mathematical
                     systems. Then we generalize our findings in order to study the underlying structure
                     common to these particular examples.
                          In Chapter 14 we did this with the ring structure, which depended on two closed binary
                     operations. Now we turn to a structure involving one closed binary operation. This structure
                     is called a group.
                          Our study of groups will examine many ideas comparable to those for rings. However,
                     here we shall dwell primarily on those aspects of the structure that are needed for applications
                     in cryptology, coding theory, and a counting method developed by George Polya.

16.1
      Definition, Examples, and Elementary Properties

Definition 16.1     If $G$ is a nonempty set and $\circ$ is a binary operation on $G$, then $(G, \circ)$ is called a group if the
                     following conditions are satisfied.
                          1) For all $a, b \in G$, $a \circ b \in G$. (Closure of $G$ under $\circ$)
                          2) For all $a, b, c \in G$, $a \circ (b \circ c) = (a \circ b) \circ c$. (The Associative Property)
                          3) There exists $e \in G$ with $a \circ e = e \circ a = a$, for all $a \in G$. (The Existence of an
                             Identity)
                          4) For each $a \in G$ there is an element $b \in G$ such that $a \circ b = b \circ a = e$. (Existence of
                             Inverses)

Furthermore, if $a \circ b = b \circ a$ for all $a, b \in G$, then $G$ is called a commutative, or abelian,
                     group. The adjective abelian honors the Norwegian mathematician Niels Henrik Abel
                     (1802-1829).
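
For a finite set, the four conditions of Definition 16.1 can be tested exhaustively. The sketch below is one possible brute-force check, practical only for small sets; the names `is_group`, `is_abelian`, `add_mod_6`, and `mul_mod_6` are illustrative assumptions rather than notation from the text.

```python
from itertools import product

def is_group(elements, op):
    """Check closure, associativity, identity, and inverses for a finite set under op."""
    elems = list(elements)
    members = set(elems)
    # 1) Closure
    if any(op(a, b) not in members for a, b in product(elems, repeat=2)):
        return False
    # 2) Associativity
    if any(op(a, op(b, c)) != op(op(a, b), c) for a, b, c in product(elems, repeat=3)):
        return False
    # 3) Existence of an identity
    identities = [e for e in elems if all(op(a, e) == a and op(e, a) == a for a in elems)]
    if not identities:
        return False
    e = identities[0]
    # 4) Existence of inverses
    return all(any(op(a, b) == e and op(b, a) == e for b in elems) for a in elems)

def is_abelian(elements, op):
    """Check the commutative law a o b = b o a for all pairs."""
    elems = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(elems, repeat=2))

def add_mod_6(a, b):
    return (a + b) % 6

def mul_mod_6(a, b):
    return (a * b) % 6

print(is_group(range(6), add_mod_6), is_abelian(range(6), add_mod_6))   # True True
print(is_group(range(6), mul_mod_6))                                    # False
```

The second call fails at the inverse condition: 1 is an identity for multiplication modulo 6, but 0 has no inverse.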

746          Chapter 16 Groups, Coding Theory, and Polya’s Method of Enumeration

We realize that the first condition in Definition 16.1 could have been omitted if we simply
                             required the binary operation for G to be a closed binary operation.

Following Definition 14.1 (for a ring) we mentioned how the associative laws for the
                             closed binary operations of + (ring addition) and · (ring multiplication) could be extended
                             by mathematical induction. The same type of situation arises for groups. If $(G, \circ)$ is any
                             group, and $r, n \in \mathbf{Z}^+$ with $n \ge 3$ and $1 \le r < n$, then

$$(a_1 \circ a_2 \circ \cdots \circ a_r) \circ (a_{r+1} \circ \cdots \circ a_n) = a_1 \circ a_2 \circ \cdots \circ a_r \circ a_{r+1} \circ \cdots \circ a_n,$$

where $a_1, a_2, \ldots, a_r, a_{r+1}, \ldots, a_n$ are all elements from $G$.

EXAMPLE 16.1
                             Under ordinary addition, each of $\mathbf{Z}$, $\mathbf{Q}$, $\mathbf{R}$, $\mathbf{C}$ is an abelian group. None of these is a group
                             under multiplication because 0 has no multiplicative inverse. However, $\mathbf{Q}^*$, $\mathbf{R}^*$, and $\mathbf{C}^*$
                             (the nonzero elements of $\mathbf{Q}$, $\mathbf{R}$, and $\mathbf{C}$, respectively) are abelian groups under ordinary
                             multiplication.
                                 If $(R, +, \cdot)$ is a ring, then $(R, +)$ is an abelian group; the nonzero elements of a field
                             $(F, +, \cdot)$ form the abelian group $(F^*, \cdot)$.

EXAMPLE 16.2
                             For $n \in \mathbf{Z}^+$, $n > 1$, we find that $(\mathbf{Z}_n, +)$ is an abelian group. When $p$ is a prime, $(\mathbf{Z}_p^*, \cdot)$ is
                             an abelian group. Tables 16.1 and 16.2 demonstrate this for $n = 6$ and $p = 7$, respectively.
                             (Recall that in $\mathbf{Z}_n$ we often write $a$ for $[a] = \{a + kn \mid k \in \mathbf{Z}\}$. The same notation is used
                             in $\mathbf{Z}_p^*$.)

Table 16.1

 +  |  0   1   2   3   4   5
----+------------------------
 0  |  0   1   2   3   4   5
 1  |  1   2   3   4   5   0
 2  |  2   3   4   5   0   1
 3  |  3   4   5   0   1   2
 4  |  4   5   0   1   2   3
 5  |  5   0   1   2   3   4

Table 16.2

 ·  |  1   2   3   4   5   6
----+------------------------
 1  |  1   2   3   4   5   6
 2  |  2   4   6   1   3   5
 3  |  3   6   2   5   1   4
 4  |  4   1   5   2   6   3
 5  |  5   3   1   6   4   2
 6  |  6   5   4   3   2   1
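
Tables 16.1 and 16.2 can be regenerated mechanically. The following sketch builds both operation tables; `cayley_table` is a hypothetical helper name, and the closing check on the nonzero elements modulo 6 is an added illustration of why a prime modulus is needed.

```python
def cayley_table(elements, op):
    """Return the operation table as a list of rows, one row per left operand."""
    return [[op(a, b) for b in elements] for a in elements]

z6 = list(range(6))            # Z_6 = {0, 1, ..., 5}
z7_star = list(range(1, 7))    # Z_7^* = {1, 2, ..., 6}

table_16_1 = cayley_table(z6, lambda a, b: (a + b) % 6)
table_16_2 = cayley_table(z7_star, lambda a, b: (a * b) % 7)

for row in table_16_2:
    print(*row)

# In a group table every row is a rearrangement of the elements.
print(all(sorted(row) == z7_star for row in table_16_2))   # True

# For the composite modulus 6 the nonzero elements are not closed: 2 * 3 = 0 (mod 6).
print((2 * 3) % 6)                                         # 0
```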

Definition 16.2        For every group G the number of elements in G is called the order of G and this is denoted
                             by |G|. When the number of elements in a group is not finite we say that G has infinite
                             order.

EXAMPLE 16.3           For all $n \in \mathbf{Z}^+$, $|(\mathbf{Z}_n, +)| = n$, while $|(\mathbf{Z}_p^*, \cdot)| = p - 1$ for each prime $p$.

EXAMPLE 16.4           Let us start with the ring $(\mathbf{Z}_9, +, \cdot)$ and consider the subset
                             $U_9 = \{a \in \mathbf{Z}_9 \mid a \text{ is a unit in } \mathbf{Z}_9\} = \{a \in \mathbf{Z}_9 \mid a^{-1} \text{ exists}\} = \{1, 2, 4, 5, 7, 8\}$
                             $= \{a \in \mathbf{Z}^+ \mid 1 \le a \le 8 \text{ and } \gcd(a, 9) = 1\}$.
                             The results in Table 16.3 show us that $U_9$ is closed under the multiplication for the ring
                             $(\mathbf{Z}_9, +, \cdot)$, namely, multiplication modulo 9. Furthermore, we also see that 1 is the
                             identity element and that each element has an inverse (in $U_9$). For instance, 5 is the inverse for
                             2, and 7 is the inverse for 4. Finally, since every ring is associative under the operation
                             of (ring) multiplication, it follows that $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ for all $a, b, c \in U_9$.
                             Consequently, $(U_9, \cdot)$ is a group of order 6; in fact, it is an abelian group of order 6.

Table 16.3

     ·  |  1   2   4   5   7   8
    ----+------------------------
     1  |  1   2   4   5   7   8
     2  |  2   4   8   1   5   7
     4  |  4   8   7   2   1   5
     5  |  5   1   2   7   8   4
     7  |  7   5   1   8   4   2
     8  |  8   7   5   4   2   1

In general, for each $n \in \mathbf{Z}^+$, where $n > 1$, if $U_n = \{a \in (\mathbf{Z}_n, +, \cdot) \mid a \text{ is a unit}\} =
               \{a \in \mathbf{Z}^+ \mid 1 \le a \le n - 1 \text{ and } \gcd(a, n) = 1\}$, then $(U_n, \cdot)$ is an abelian group under the
               (closed) binary operation of multiplication modulo $n$. The group $(U_n, \cdot)$ is called the group
               of units for the ring $(\mathbf{Z}_n, +, \cdot)$ and it has order $\phi(n)$, where $\phi$ denotes the Euler phi function
               of Section 8.1.
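
The description of the group of units can be checked by brute force for small moduli. The following is only a sketch: the helper names `units` and `is_abelian_group` are illustrative and not from the text, and associativity is not tested because it is inherited from the ring.

```python
from math import gcd

def units(n):
    """Elements of U_n: residues a with 1 <= a <= n-1 and gcd(a, n) = 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def is_abelian_group(elems, n):
    """Check closure, identity, inverses and commutativity under multiplication mod n."""
    s = set(elems)
    closed = all((a * b) % n in s for a in s for b in s)
    has_identity = 1 in s
    has_inverses = all(any((a * b) % n == 1 for b in s) for a in s)
    commutative = all((a * b) % n == (b * a) % n for a in s for b in s)
    return closed and has_identity and has_inverses and commutative

u9 = units(9)                                      # the set called U_9 in Example 16.4
phi = {n: len(units(n)) for n in range(2, 13)}     # |U_n|, i.e. the Euler phi value of n
```

Here `len(units(n))` counts exactly the integers between 1 and n - 1 that are relatively prime to n, which is how the Euler phi function is defined.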

From here on the group operation will be written multiplicatively, unless it is given
               otherwise. So $a \circ b$ now becomes $ab$.
                  The following theorem provides several properties shared by all groups.

THEOREM 16.1   For every group $G$,
                 a) the identity of $G$ is unique.
                 b) the inverse of each element of $G$ is unique.
                 c) if $a, b, c \in G$ and $ab = ac$, then $b = c$. (Left-cancellation property)
                 d) if $a, b, c \in G$ and $ba = ca$, then $b = c$. (Right-cancellation property)
               Proof:
                 a) If $e_1, e_2$ are both identities in $G$, then $e_1 = e_1 e_2 = e_2$. (Justify each equality.)
                 b) Let $a \in G$ and suppose that $b, c$ are both inverses of $a$. Then $b = be = b(ac) =
                    (ba)c = ec = c$. (Justify each equality.)
               The proofs of properties (c) and (d) are left for the reader. (It is because of these properties
               that we find each group element appearing exactly once in each row and each column of
               the table for a finite group.)
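
The remark about rows and columns can be tested on the small groups met so far. This is a sketch only; `cayley_table` and `is_latin_square` are hypothetical helper names. It confirms that the tables for $(\mathbf{Z}_6, +)$ and $(U_9, \cdot)$ contain every element exactly once in each row and column, while multiplication modulo 6 on all of $\{0, 1, \ldots, 5\}$, which is not a group operation, does not.

```python
def cayley_table(elems, op):
    """Rows indexed by the left factor a, columns by the right factor b."""
    return [[op(a, b) for b in elems] for a in elems]

def is_latin_square(table, elems):
    """True when every row and every column is a rearrangement of elems."""
    target = sorted(elems)
    rows_ok = all(sorted(row) == target for row in table)
    cols_ok = all(sorted(col) == target for col in zip(*table))
    return rows_ok and cols_ok

z6 = list(range(6))
u9 = [1, 2, 4, 5, 7, 8]
table_z6 = cayley_table(z6, lambda a, b: (a + b) % 6)
table_u9 = cayley_table(u9, lambda a, b: (a * b) % 9)
# Multiplication mod 6 on {0, ..., 5}: cancellation fails (2*0 = 2*3 = 0), so rows repeat entries.
table_bad = cayley_table(z6, lambda a, b: (a * b) % 6)
```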

On the basis of the result in Theorem 16.1(b) the unique inverse of $a$ will be designated
               by $a^{-1}$. When the group is written additively, $-a$ is used to denote the (additive) inverse
               of $a$.
                   As in the case of multiplication in a ring, we have powers of elements in a group. We
               define $a^0 = e$, $a^1 = a$, $a^2 = a \cdot a$, and in general $a^{n+1} = a^n \cdot a$, for all $n \in \mathbf{N}$. Since each
               group element has an inverse, for $n \in \mathbf{Z}^+$, we define $a^{-n} = (a^{-1})^n$. Then $a^n$ is defined for
               all $n \in \mathbf{Z}$, and it can be shown that for all $m, n \in \mathbf{Z}$, $a^m \cdot a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$.
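
The recursive definition of powers translates directly into code. The sketch below works in $U_n$ under multiplication modulo $n$; the function names `inverse` and `power` are assumptions made for illustration, and the inverse is found by a plain search rather than by any method from the text.

```python
def inverse(a, n):
    """Inverse of the unit a modulo n, found by brute-force search."""
    return next(b for b in range(1, n) if (a * b) % n == 1)

def power(a, k, n):
    """a^k in U_n for any integer k, following the recursive definition."""
    if k < 0:
        return power(inverse(a, n), -k, n)   # a^(-k) = (a^(-1))^k
    result = 1                               # a^0 = e
    for _ in range(k):
        result = (result * a) % n            # a^(j+1) = a^j * a
    return result
```

With these two functions the laws $a^m \cdot a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$ can be spot-checked for every element of $U_9$ and a range of positive and negative exponents.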

If the group operation is addition, then multiples replace powers and for all $m, n \in \mathbf{Z}$,
                             and all $a \in G$, we find that

                                 $ma + na = (m + n)a$          $m(na) = (mn)a$.

In this case the identity is written as 0, rather than $e$. And here, for all $a \in G$, we have
                             $0a = 0$, where the "0" in front of $a$ is the integer 0 (in $\mathbf{Z}$) while the "0" on the right side of
                             the equation is the identity 0 (in $G$). [So these two "0"s are different.]
                                 For an abelian group $G$ we also find that for all $n \in \mathbf{Z}$ and all $a, b \in G$, (1) $(ab)^n = a^n b^n$,
                             when $G$ is written multiplicatively; and (2) $n(a + b) = na + nb$, when the additive notation
                             is used for $G$.

We now take a look at a special subset of a group.

EXAMPLE 16.5   Let $G = (\mathbf{Z}_6, +)$. If $H = \{0, 2, 4\}$, then $H$ is a nonempty subset of $G$. Table 16.4 shows
               that $(H, +)$ is also a group under the binary operation of $G$.

Table 16.4

     +  |  0   2   4
    ----+------------
     0  |  0   2   4
     2  |  2   4   0
     4  |  4   0   2

This situation motivates the following definition.

Definition 16.3        Let $G$ be a group and $\emptyset \ne H \subseteq G$. If $H$ is a group under the binary operation of $G$, then
                             we call $H$ a subgroup of $G$.

EXAMPLE 16.6
                                a) Every group $G$ has $\{e\}$ and $G$ as subgroups. These are the trivial subgroups of $G$. All
                                   others are termed nontrivial, or proper.
                                b) In addition to $H = \{0, 2, 4\}$, the subset $K = \{0, 3\}$ is also a (proper) subgroup of
                                   $G = (\mathbf{Z}_6, +)$.
                                c) Each of the nonempty subsets $\{1, 8\}$ and $\{1, 4, 7\}$ is a subgroup of $(U_9, \cdot)$.
                                d) The group $(\mathbf{Z}, +)$ is a subgroup of $(\mathbf{Q}, +)$, which is a subgroup of $(\mathbf{R}, +)$. Yet $\mathbf{Z}^*$
                                   under multiplication is not a subgroup of $(\mathbf{Q}^*, \cdot)$. (Why not?)

For a group $G$ and $\emptyset \ne H \subseteq G$, the following tells us when $H$ is a subgroup of $G$.

THEOREM 16.2                 If $H$ is a nonempty subset of a group $G$, then $H$ is a subgroup of $G$ if and only if (a) for all
                             $a, b \in H$, $ab \in H$, and (b) for all $a \in H$, $a^{-1} \in H$.
                             Proof: If $H$ is a subgroup of $G$, then by Definition 16.3 $H$ is a group under the same binary
                             operation. Hence it satisfies all the group conditions, including the two mentioned here.
                             Conversely, let $\emptyset \ne H \subseteq G$ with $H$ satisfying conditions (a) and (b). For all $a, b, c \in H$,
                             $(ab)c = a(bc)$ in $G$, so $(ab)c = a(bc)$ in $H$. (We say that $H$ "inherits" the associative
                             property from $G$.) Finally, as $H \ne \emptyset$, let $a \in H$. By condition (b), $a^{-1} \in H$ and by condition
                             (a), $aa^{-1} = e \in H$, so $H$ contains the identity element and is a group.
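
The two conditions of Theorem 16.2 give a mechanical test for finite examples. The following sketch assumes the ambient group is supplied as an operation `op` and an inverse function `inv`; these parameter names, and the helper `is_subgroup`, are illustrative and not from the text.

```python
def is_subgroup(H, op, inv):
    """Theorem 16.2 test: H nonempty, closed under op, closed under inverses."""
    H = set(H)
    if not H:
        return False
    closed = all(op(a, b) in H for a in H for b in H)
    inverses = all(inv(a) in H for a in H)
    return closed and inverses

# (Z_6, +): operation and inverse
add6 = lambda a, b: (a + b) % 6
neg6 = lambda a: (-a) % 6
# (U_9, .): operation and inverse; pow with exponent -1 needs Python 3.8 or later
mul9 = lambda a, b: (a * b) % 9
inv9 = lambda a: pow(a, -1, 9)
```

For instance, `is_subgroup({0, 2, 4}, add6, neg6)` examines the subset of Example 16.5, and `is_subgroup({1, 4, 7}, mul9, inv9)` examines one of the subsets in part (c) of Example 16.6.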

A finiteness condition modifies the situation.

THEOREM 16.3      If $G$ is a group and $\emptyset \ne H \subseteq G$, with $H$ finite, then $H$ is a subgroup of $G$ if and only if $H$
                  is closed under the binary operation of $G$.
                  Proof: As in the proof of Theorem 16.2, if $H$ is a subgroup of $G$, then $H$ is closed under
                  the binary operation of $G$. Conversely, let $H$ be a finite nonempty subset of $G$ that is so
                  closed. If $a \in H$, then $aH = \{ah \mid h \in H\} \subseteq H$ because of the closure condition. By
                  left-cancellation in $G$, $ah_1 = ah_2 \Rightarrow h_1 = h_2$, so $|aH| = |H|$. With $aH \subseteq H$ and $|aH| = |H|$,
                  it follows from $H$ being finite that $aH = H$. As $a \in H$, there exists $b \in H$ with $ab = a$.
                  But (in $G$) $ab = a = ae$, so $b = e$ and $H$ contains the identity. Since $e \in H = aH$, there
                  is an element $c \in H$ such that $ac = e$. Then $(ca)^2 = (ca)(ca) = (c(ac))a = (ce)a = ca =
                  (ca)e$, so $ca = e$, and $c = a^{-1} \in H$. Consequently, by Theorem 16.2, $H$ is a subgroup
                  of $G$.

The finiteness condition in Theorem 16.3 is crucial. Both $\mathbf{Z}^+$ and $\mathbf{N}$ are nonempty closed
                  subsets of the group $(\mathbf{Z}, +)$, yet neither has the additive inverses needed for the group
                  structure.
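
For a finite group, Theorem 16.3 can be watched at work by listing every nonempty subset that is closed and confirming that each one already contains the identity and all of its inverses. This is a brute-force sketch for $(U_9, \cdot)$ only; the names `is_closed`, `has_inverses` and `closed_subsets` are hypothetical.

```python
from itertools import combinations

def is_closed(H, op):
    return all(op(a, b) in H for a in H for b in H)

def has_inverses(H, inv):
    return all(inv(a) in H for a in H)

mul9 = lambda a, b: (a * b) % 9
inv9 = lambda a: pow(a, -1, 9)          # modular inverse; needs Python 3.8 or later
U9 = [1, 2, 4, 5, 7, 8]

# All 63 nonempty subsets of U_9, then only those closed under multiplication mod 9.
nonempty_subsets = [set(c) for r in range(1, 7) for c in combinations(U9, r)]
closed_subsets = [H for H in nonempty_subsets if is_closed(H, mul9)]
```

No such exhaustive check is possible for the infinite group $(\mathbf{Z}, +)$, where closure alone does not suffice.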

The next example provides a nonabelian group.

EXAMPLE 16.7      Consider the first equilateral triangle shown in Fig. 16.1(a). When we rotate this triangle
                  counterclockwise (within its plane) through 120° about an axis perpendicular to its plane
                  and passing through its center $C$, we obtain the second triangle shown in Fig. 16.1(a).
                  As a result, the vertex originally labeled 1 in Fig. 16.1(a) is now in the position that was
                  originally labeled 3. Likewise, 2 is now in the position originally occupied by 1, and 3
                  has moved to where 2 was. This can be described by the function $\pi_1\colon \{1, 2, 3\} \to \{1, 2, 3\}$,
                  where $\pi_1(1) = 3$, $\pi_1(2) = 1$, $\pi_1(3) = 2$. A more compact notation,
                  $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$, where we write $\pi_1(i)$ below $i$ for each $1 \le i \le 3$,
                  emphasizes that $\pi_1$ is a permutation of $\{1, 2, 3\}$.
                  If $\pi_2$ denotes the counterclockwise rotation through 240°, then
                  $\pi_2 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. For the identity $\pi_0$, that is, the rotation through
                  $n(360°)$ for $n \in \mathbf{Z}$, we write $\pi_0 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$.
                  These rotations are called rigid motions of the triangle. They are two-dimensional motions
                  that keep the center $C$ fixed and preserve the shape of the triangle. Hence the triangle looks
                  the same as when we started, except for a possible rearrangement of the labels on some of
                  its vertices.

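[Figure 16.1: (a) the labeled equilateral triangle before and after the counterclockwise rotation through 120°; (b) the triangle before and after the reflection about the axis bisecting the right base angle.]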

In addition to these rotations, the triangle can be reflected along an axis passing through
                            a vertex and the midpoint of the opposite side. For the diagonal axis that bisects the base
                            angle on the right, the reflection gives the result in Fig. 16.1(b). This we represent by
                            $r_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$. A similar reflection about the axis bisecting the left
                            base angle yields the permutation $r_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$. When the triangle is
                            reflected about its vertical axis, we have $r_3 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$. Each $r_i$, for
                            $1 \le i \le 3$, is a three-dimensional rigid motion.
                                   Let $G = \{\pi_0, \pi_1, \pi_2, r_1, r_2, r_3\}$, the set of rigid motions (in space) of the equilateral
                            triangle. We make $G$ into a group by defining the rigid motion $\alpha\beta$, for $\alpha, \beta \in G$, as that
                            motion obtained by applying first $\alpha$ and then following up with $\beta$. Hence, for example,
                            $\pi_1 r_1 = r_3$. We can see this geometrically, but it will be handy to consider the permutations
                            as follows: $\pi_1 r_1 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$, where, for
                            example, $\pi_1(1) = 3$ and $r_1(3) = 3$ and we write $1 \xrightarrow{\pi_1} 3 \xrightarrow{r_1} 3$. So
                            $1 \xrightarrow{\pi_1 r_1} 3$ in the product $\pi_1 r_1$. (Note that the order in which we
                            write the product $\pi_1 r_1$ here is the opposite of the order for their composite function as defined
                            in Section 5.6. The notation of Section 5.6 occurs in analysis, whereas in algebra there is a
                            tendency to employ this opposite order.) Also, since $2 \xrightarrow{\pi_1} 1 \xrightarrow{r_1} 2$ and
                            $3 \xrightarrow{\pi_1} 2 \xrightarrow{r_1} 1$, it follows that $\pi_1 r_1 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} = r_3$.
                                   Table 16.5 verifies that under this binary operation $G$ is closed, with identity $\pi_0$. Also
                            $\pi_1^{-1} = \pi_2$, $\pi_2^{-1} = \pi_1$, and every other element is its own inverse. Since the elements of
                            $G$ are actually functions, the associative property follows from Theorem 5.6 (although in
                            reverse order).

Table 16.5

     ·   |  π0   π1   π2   r1   r2   r3
    -----+------------------------------
     π0  |  π0   π1   π2   r1   r2   r3
     π1  |  π1   π2   π0   r3   r1   r2
     π2  |  π2   π0   π1   r2   r3   r1
     r1  |  r1   r2   r3   π0   π1   π2
     r2  |  r2   r3   r1   π2   π0   π1
     r3  |  r3   r1   r2   π1   π2   π0

We computed $\pi_1 r_1$ as $r_3$, but from Table 16.5 we see that $r_1 \pi_1 = r_2$. With $\pi_1 r_1 = r_3 \ne
                            r_2 = r_1 \pi_1$, it follows that $G$ is nonabelian.
                                This group can also be obtained as the group of all permutations of the set $\{1, 2, 3\}$ under
                            the binary operation of function composition. It is denoted by $S_3$ (the symmetric group on
                            three symbols).
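
The products in Example 16.7 can be recomputed mechanically. In the sketch below each permutation is stored as the bottom row of its two-row form, and the reflections are taken, as an assumption read off from the geometric description, to be $r_1$ fixing vertex 3, $r_2$ fixing vertex 1 and $r_3$ fixing vertex 2. The function name `product` and the string labels are illustrative only.

```python
from itertools import permutations

# Each permutation is stored as its bottom row: p[i-1] is the image of i.
pi0, pi1, pi2 = (1, 2, 3), (3, 1, 2), (2, 3, 1)
r1, r2, r3 = (2, 1, 3), (1, 3, 2), (3, 2, 1)
names = {pi0: "pi0", pi1: "pi1", pi2: "pi2", r1: "r1", r2: "r2", r3: "r3"}

def product(a, b):
    """The product ab in the convention of the example: apply a first, then b."""
    return tuple(b[a[i] - 1] for i in range(3))

G = list(names)
# Group table keyed by (left factor, right factor), using the labels above.
table = {(names[a], names[b]): names[product(a, b)] for a in G for b in G}
all_perms = set(permutations((1, 2, 3)))      # every permutation of {1, 2, 3}
```

Looking up `table[("pi1", "r1")]` and `table[("r1", "pi1")]` shows the two different products that make the group nonabelian, and comparing `set(G)` with `all_perms` confirms that the six rigid motions account for all of $S_3$.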

EXAMPLE 16.8      The symmetric group $S_4$ consists of the 24 permutations of $\{1, 2, 3, 4\}$. Here $\pi_0$ =
                             (i     5     3       4)     is    the      identity.           If a=()          7   3        3)     B=           7   3   4),   then    ap =
                             (i     3     G3) but Ba = (4                       5?              4).8o Sq is nonabelian. Also, 67! = Gad                            ‘) and
                            a’ = x9 = f°. Within S, there is a subgroup of order 8 that represents the group of rigid
                            motions for a square.

We turn now to a construction for making larger groups out of smaller ones.

THEOREM 16.4                      Let $(G, \circ)$ and $(H, *)$ be groups. Define the binary operation $\cdot$ on $G \times H$ by
                                  $(g_1, h_1) \cdot (g_2, h_2) = (g_1 \circ g_2, h_1 * h_2)$. Then $(G \times H, \cdot)$ is a group and is called the
                                  direct product of $G$ and $H$.
                                  Proof: The verification of the group properties for $(G \times H, \cdot)$ is left to the reader.

EXAMPLE 16.9                      Consider the groups $(\mathbf{Z}_2, +)$, $(\mathbf{Z}_3, +)$. On $G = \mathbf{Z}_2 \times \mathbf{Z}_3$, define
                                  $(a_1, b_1) \cdot (a_2, b_2) = (a_1 + a_2, b_1 + b_2)$. Then $G$ is a group of order 6 where the identity is
                                  $(0, 0)$, and the inverse, for example, of the element $(1, 2)$ is $(1, 1)$.
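
A direct product of two small groups is easy to build explicitly. The following sketch constructs $\mathbf{Z}_2 \times \mathbf{Z}_3$ with the componentwise operation of Example 16.9; the names `op`, `identity` and `inverses` are assumptions made for illustration.

```python
from itertools import product

Z2, Z3 = range(2), range(3)
G = list(product(Z2, Z3))                 # the six ordered pairs (a, b)

def op(x, y):
    """Componentwise operation: addition mod 2 in the first slot, mod 3 in the second."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)

identity = (0, 0)
# For each pair x, search G for the pair y with x . y equal to the identity.
inverses = {x: next(y for y in G if op(x, y) == identity) for x in G}
```

The dictionary `inverses` then records, for each of the six elements, the pair that combines with it to give $(0, 0)$.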

EXERCISES 16.1

1. For each of the following sets, determine whether or not the set is a group under the stated binary
   operation. If so, determine its identity and the inverse of each of its elements. If it is not a
   group, state the condition(s) of the definition that it violates.
    a) $\{-1, 1\}$ under multiplication
    b) $\{-1, 1\}$ under addition
    c) $\{-1, 0, 1\}$ under addition
    d) $\{10n \mid n \in \mathbf{Z}\}$ under addition
    e) The set of all one-to-one functions $g\colon A \to A$, where $A = \{1, 2, 3, 4\}$, under function composition
    f) $\{a/2^n \mid a, n \in \mathbf{Z},\ n \ge 0\}$ under addition

2. Prove parts (c) and (d) of Theorem 16.1.

3. Why is the set $\mathbf{Z}$ not a group under subtraction?

4. Let $G = \{q \in \mathbf{Q} \mid q \ne -1\}$. Define the binary operation $\circ$ on $G$ by $x \circ y = x + y + xy$.
   Prove that $(G, \circ)$ is an abelian group.

5. Define the binary operation $\circ$ on $\mathbf{Z}$ by $x \circ y = x + y + 1$. Verify that $(\mathbf{Z}, \circ)$ is an abelian group.

6. Let $S = \mathbf{R}^* \times \mathbf{R}$. Define the binary operation $\circ$ on $S$ by $(u, v) \circ (x, y) = (ux, vx + y)$.
   Prove that $(S, \circ)$ is a nonabelian group.

7. Find the elements in the groups $U_{20}$ and $U_{24}$, the groups of units for the rings
   $(\mathbf{Z}_{20}, +, \cdot)$ and $(\mathbf{Z}_{24}, +, \cdot)$, respectively.

8. For any group $G$ prove that $G$ is abelian if and only if $(ab)^2 = a^2 b^2$ for all $a, b \in G$.

9. If $G$ is a group, prove that for all $a, b \in G$,
    a) $(a^{-1})^{-1} = a$
    b) $(ab)^{-1} = b^{-1} a^{-1}$

10. Prove that a group $G$ is abelian if and only if for all $a, b \in G$, $(ab)^{-1} = a^{-1} b^{-1}$.

11. Find all subgroups in each of the following groups.
    a) $(\mathbf{Z}_{12}, +)$
    b) $(\mathbf{Z}_{11}^*, \cdot)$
    c) $S_3$

12. a) How many rigid motions (in two or three dimensions) are there for a square?
    b) Make a group table for these rigid motions like the one in Table 16.5 for the equilateral
       triangle. What is the identity for this group? Describe the inverse of each element
       geometrically.

13. a) How many rigid motions (in two or three dimensions) are there for a regular pentagon?
       Describe them geometrically.
    b) Answer part (a) for a regular $n$-gon, $n \ge 3$.

14. In the group $S_5$, let
       $\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 1 & 4 & 5 \end{pmatrix}$ and
       $\beta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{pmatrix}$.
    Determine $\alpha\beta$, $\beta\alpha$, $\alpha^3$, $\beta^4$, $\alpha^{-1}$, $\beta^{-1}$, $(\alpha\beta)^{-1}$, $(\beta\alpha)^{-1}$, and
    $\beta^{-1}\alpha^{-1}$.

15. If $G$ is a group, let $H = \{a \in G \mid ag = ga \text{ for all } g \in G\}$. Prove that $H$ is a subgroup of $G$.
    (The subgroup $H$ is called the center of $G$.)

16. Let $\omega$ be the complex number $(1/\sqrt{2})(1 + i)$.
    a) Show that $\omega^8 = 1$ but $\omega^n \ne 1$ for $n \in \mathbf{Z}^+$, $1 \le n \le 7$.
    b) Verify that $\{\omega^n \mid n \in \mathbf{Z}^+,\ 1 \le n \le 8\}$ is an abelian group under multiplication.

17. a) Prove Theorem 16.4.
    b) Extending the idea developed in Theorem 16.4 and Example 16.9 to the group
       $\mathbf{Z}_6 \times \mathbf{Z}_6 \times \mathbf{Z}_6 = \mathbf{Z}_6^3$, answer the following.
         i) What is the order of this group?
        ii) Find a subgroup of $\mathbf{Z}_6^3$ of order 6, one of order 12, and one of order 36.
       iii) Determine the inverse of each of the elements $(2, 3, 4)$, $(4, 0, 2)$, $(5, 1, 2)$.

18. a) If $H, K$ are subgroups of a group $G$, prove that $H \cap K$ is also a subgroup of $G$.
    b) Give an example of a group $G$ with subgroups $H, K$ such that $H \cup K$ is not a subgroup of $G$.

19. a) Find all $x$ in $(\mathbf{Z}_5^*, \cdot)$ such that $x = x^{-1}$.
    b) Find all $x$ in $(\mathbf{Z}_{11}^*, \cdot)$ such that $x = x^{-1}$.
    c) Let $p$ be a prime. Find all $x$ in $(\mathbf{Z}_p^*, \cdot)$ such that $x = x^{-1}$.
    d) Prove that $(p - 1)! \equiv -1 \pmod{p}$, for $p$ a prime. [This result is known as Wilson's Theorem,
       although it was only conjectured by John Wilson (1741-1793). The first proof was given in
       1770 by Joseph Louis Lagrange (1736-1813).]

20. a) Find $x$ in $(U_8, \cdot)$ where $x \ne 1$, $x \ne 7$ but $x = x^{-1}$.
    b) Find $x$ in $(U_{16}, \cdot)$ where $x \ne 1$, $x \ne 15$ but $x = x^{-1}$.
    c) Let $k \in \mathbf{Z}^+$, $k \ge 3$. Find $x$ in $(U_{2^k}, \cdot)$ where $x \ne 1$, $x \ne 2^k - 1$ but $x = x^{-1}$.

16.2
        Homomorphisms, Isomorphisms,
              and Cyclic Groups
                               We turn our attention once again to functions that preserve structure.

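EXAMPLE 16.10                   Let $G = (\mathbf{Z}, +)$ and $H = (\mathbf{Z}_4, +)$. Define $f\colon G \to H$ by

                                    $f(x) = [x] = \{x + 4k \mid k \in \mathbf{Z}\}$.

For all $x, y \in G$,

                                    $f(\underbrace{x + y}_{\text{The operation in } G}) = [x + y] = [x] + [y] = \underbrace{f(x) + f(y)}_{\text{The operation in } H}$,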

where the second equality follows from the way the addition of equivalence classes was
                                developed in Section 14.3. Consequently, here f preserves the group operations and is an
                                example of a special type of function that we shall now define.
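
A quick numerical check of this example is sketched below, with each class $[x]$ in $\mathbf{Z}_4$ represented by its remainder 0, 1, 2 or 3. The names `f`, `add_mod4` and `sample` are illustrative, and a finite sample of integers can only support the identity, not prove it.

```python
def f(x):
    """Represent the class [x] in Z_4 by its remainder 0, 1, 2 or 3."""
    return x % 4          # Python's % returns a nonnegative remainder for a positive modulus

def add_mod4(u, v):
    """Addition of classes in Z_4, acting on representatives."""
    return (u + v) % 4

sample = range(-20, 21)
# f(x + y) uses the operation of Z; add_mod4 uses the operation of Z_4.
preserves = all(f(x + y) == add_mod4(f(x), f(y)) for x in sample for y in sample)
```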

Definition 16.4          If $(G, \circ)$ and $(H, *)$ are groups and $f\colon G \to H$, then $f$ is called a group homomorphism if
                                for all $a, b \in G$, $f(a \circ b) = f(a) * f(b)$.

When we know that the given structures are groups, the function f is simply called a
                                homomorphism.
                                   Some properties of homomorphisms are given in the following theorem.

THEOREM 16.5                    Let $(G, \circ)$, $(H, *)$ be groups with respective identities $e_G$, $e_H$. If $f\colon G \to H$ is a
                                homomorphism, then

                                    a) $f(e_G) = e_H$.
                                    b) $f(a^{-1}) = [f(a)]^{-1}$ for all $a \in G$.
                                    c) $f(a^n) = [f(a)]^n$ for all $a \in G$ and all $n \in \mathbf{Z}$.
                                    d) $f(S)$ is a subgroup of $H$ for each subgroup $S$ of $G$.

                                Proof:
                                    a) $e_H * f(e_G) = f(e_G) = f(e_G \circ e_G) = f(e_G) * f(e_G)$, so by right-cancellation
                                       [Theorem 16.1(d)], it follows that $f(e_G) = e_H$.
                                    b) & c) The proofs of these parts are left for the reader.
                                    d) If $S$ is a subgroup of $G$, then $S \ne \emptyset$, so $f(S) \ne \emptyset$. Let $x, y \in f(S)$. Then $x =
                                       f(a)$, $y = f(b)$, for some $a, b \in S$. Since $S$ is a subgroup of $G$, it follows
                                       that $a \circ b \in S$, so $x * y = f(a) * f(b) = f(a \circ b) \in f(S)$. Finally, $x^{-1} =
                                       [f(a)]^{-1} = f(a^{-1}) \in f(S)$ because $a^{-1} \in S$ when $a \in S$. Consequently, by
                                       Theorem 16.2, $f(S)$ is a subgroup of $H$.

Definition 16.5   If $f\colon (G, \circ) \to (H, *)$ is a homomorphism, we call $f$ an isomorphism if it is one-to-one
                  and onto. In this case $G$, $H$ are said to be isomorphic groups.

EXAMPLE 16.11     Let $f\colon (\mathbf{R}^+, \cdot) \to (\mathbf{R}, +)$ where $f(x) = \log_{10}(x)$. This function is both one-to-one and onto.
                  (Verify these properties.) For all $a, b \in \mathbf{R}^+$, $f(ab) = \log_{10}(ab) = \log_{10} a + \log_{10} b =
                  f(a) + f(b)$. Therefore, $f$ is an isomorphism and the group of positive real numbers under
                  multiplication is abstractly the same as the group of all real numbers under addition. Here
                  the function $f$ translates a problem in the multiplication of real numbers (a somewhat
                  difficult problem without a calculator) into a problem dealing with the addition of real numbers
                  (an easier arithmetic consideration). This was a major reason behind the use of logarithms
                  before the advent of calculators.
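
The way a logarithm turns multiplication into addition can be illustrated numerically. The sketch below assumes base-10 logarithms and uses floating-point arithmetic, so the comparisons are made with a tolerance; the sample pairs and the variable names are arbitrary choices, not values from the text.

```python
import math

def f(x):
    """The map x -> log10(x) on the positive reals (base 10 is an assumption here)."""
    return math.log10(x)

pairs = [(2.0, 5.0), (0.25, 8.0), (123.0, 0.004), (7.5, 7.5)]
# f(ab) and f(a) + f(b) agree up to rounding error for each sample pair.
checks = [math.isclose(f(a * b), f(a) + f(b), rel_tol=1e-12, abs_tol=1e-12) for a, b in pairs]

# Multiplying 37 by 412 by adding logarithms, as one would with a table of logarithms.
product_via_logs = 10 ** (f(37.0) + f(412.0))
```

The value of `product_via_logs` matches 37 · 412 only approximately, which reflects rounding in the floating-point logarithms rather than any failure of the isomorphism.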

EXAMPLE 16.12     Let $G$ be the group of complex numbers $\{1, -1, i, -i\}$ under multiplication. Table 16.6
                  shows the multiplication table for this group. With $H = (\mathbf{Z}_4, +)$, consider $f\colon G \to H$
                  defined by

                      $f(1) = [0]$          $f(-1) = [2]$          $f(i) = [1]$          $f(-i) = [3]$.

                  Then $f((i)(-i)) = f(1) = [0] = [1] + [3] = f(i) + f(-i)$, and $f((-1)(-i)) = f(i)
                  = [1] = [2] + [3] = f(-1) + f(-i)$.
                      Although we have not checked all possible cases, the function is an isomorphism. Note
                  that the image under $f$ of the subgroup $\{1, -1\}$ of $G$ is $\{[0], [2]\}$, a subgroup of $H$.

Table 16.6

      ·   |   1   −1    i   −i
    ------+--------------------
      1   |   1   −1    i   −i
     −1   |  −1    1   −i    i
      i   |   i   −i   −1    1
     −i   |  −i    i    1   −1

Let us take a closer look at this group $G$. Here $i^1 = i$, $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$,
                  so every element of $G$ is a power of $i$, and we say that $i$ generates $G$. This is denoted by
                  $G = \langle i \rangle$. (It is also true that $G = \langle -i \rangle$. Verify this.)
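
Example 16.12 checks only two of the sixteen products, and the remaining cases can be run through by machine. In this sketch the four complex numbers are stored as integer pairs (real part, imaginary part) so that the arithmetic is exact; the names `mul`, `is_hom`, `is_bijection` and `powers_of_i` are hypothetical.

```python
# Integer pairs (re, im) keep the arithmetic exact.
ONE, NEG_ONE, I, NEG_I = (1, 0), (-1, 0), (0, 1), (0, -1)
G = [ONE, NEG_ONE, I, NEG_I]

def mul(z, w):
    """Complex multiplication (a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

f = {ONE: 0, NEG_ONE: 2, I: 1, NEG_I: 3}      # the map of Example 16.12, classes as remainders

# f(zw) = f(z) + f(w) in Z_4 for all sixteen ordered pairs, and f hits each class once.
is_hom = all(f[mul(z, w)] == (f[z] + f[w]) % 4 for z in G for w in G)
is_bijection = sorted(f.values()) == [0, 1, 2, 3]

powers_of_i = [ONE]
for _ in range(3):
    powers_of_i.append(mul(powers_of_i[-1], I))   # i^0, i^1, i^2, i^3
```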

The last part of the preceding example leads us to the following definition.

Definition 16.6   A group $G$ is called cyclic if there is an element $x \in G$ such that for each $a \in G$, $a = x^n$ for
                  some $n \in \mathbf{Z}$.

EXAMPLE 16.13
                                 a) The group $H = (\mathbf{Z}_4, +)$ is cyclic. Here the operation is addition, so we have multiples
                                    instead of powers. We find that both [1] and [3] generate $H$. For the case of [3], we
                                    have $1 \cdot [3] = [3]$, $2 \cdot [3]\ (= [3] + [3]) = [2]$, $3 \cdot [3] = [1]$, and $4 \cdot [3] = [0]$. Hence
                                    $H = \langle [3] \rangle = \langle [1] \rangle$.
                                 b) Consider the multiplicative group $U_9 = \{1, 2, 4, 5, 7, 8\}$ that we examined in
                                    Example 16.4. Here we find that $2^1 = 2$, $2^2 = 4$, $2^3 = 8$, $2^4 = 7$, $2^5 = 5$, $2^6 = 1$, so $U_9$ is
                                    a cyclic group of order 6 and $U_9 = \langle 2 \rangle$. It is also true that $U_9 = \langle 5 \rangle$ because $5^1 = 5$,
                                    $5^2 = 7$, $5^3 = 8$, $5^4 = 4$, $5^5 = 2$, $5^6 = 1$.

The concept of a cyclic group leads to a related idea. Given a group $G$, if $a \in G$ consider
                               the set $S = \{a^k \mid k \in \mathbf{Z}\}$. From Theorem 16.2 it follows that $S$ is a subgroup of $G$. This
                               subgroup is called the subgroup generated by $a$ and is designated by $\langle a \rangle$. In Example 16.12
                               $\langle i \rangle = \langle -i \rangle = G$; also, $\langle -1 \rangle = \{-1, 1\}$ and $\langle 1 \rangle = \{1\}$. For part (a) of Example 16.13 we
                               consider multiples instead of powers and find that $H = \langle [1] \rangle = \langle [3] \rangle$, $\langle [2] \rangle = \{[0], [2]\}$,
                               and $\langle [0] \rangle = \{[0]\}$. When we examine the group $U_9$ in part (b) of that example we see that
                               $U_9 = \langle 2 \rangle$ (or $\langle [2] \rangle$) $= \langle 5 \rangle$, $\langle 4 \rangle = \{1, 4, 7\} = \langle 7 \rangle$, $\langle 8 \rangle = \{1, 8\}$, and $\langle 1 \rangle = \{1\}$.
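
In a finite group the subgroup generated by an element can be listed by repeatedly applying the operation until the identity comes back. The sketch below takes the operation and identity as parameters; the function name `generated` and the two operation helpers are illustrative, and the loop terminates only because the groups used here are finite.

```python
def generated(a, op, identity):
    """Cyclic subgroup <a> of a finite group: collect a, a^2, ... until the identity recurs."""
    elems, x = [identity], a
    while x != identity:
        elems.append(x)
        x = op(x, a)
    return sorted(elems)

mul9 = lambda a, b: (a * b) % 9     # operation of U_9
add4 = lambda a, b: (a + b) % 4     # operation of Z_4, classes written as remainders
```

For example, `generated(4, mul9, 1)` and `generated(7, mul9, 1)` list the same three elements, while `generated(2, add4, 0)` lists the two classes [0] and [2].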

Definition 16.7         If $G$ is a group and $a \in G$, the order of $a$, denoted $o(a)$, is $|\langle a \rangle|$. (If $|\langle a \rangle|$ is infinite, we say
                               that $a$ has infinite order.)

In Example 16.12, $o(1) = 1$, $o(-1) = 2$, whereas both $i$ and $-i$ have order 4.

Let us take a second look at the idea of order for the case where $|\langle a \rangle|$ is finite. When
                               $|\langle a \rangle| = 1$ then $a = e$ because $a = a^1 \in \langle a \rangle$ and $e = a^0 \in \langle a \rangle$. If $|\langle a \rangle|$ is finite but $a \ne e$, then
                               $\langle a \rangle = \{a^m \mid m \in \mathbf{Z}\}$ is finite, so $\{a, a^2, a^3, \ldots\} = \{a^m \mid m \in \mathbf{Z}^+\}$ is also finite. Consequently,
                               there exist $s, t \in \mathbf{Z}^+$, where $1 \le s < t$ and $a^s = a^t$, from which it follows that $a^{t-s} = e$,
                               with $t - s \in \mathbf{Z}^+$. Since $e \in \{a^m \mid m \in \mathbf{Z}^+\}$, let $n$ be the smallest positive integer such that
                               $a^n = e$. We claim that $\langle a \rangle = \{a, a^2, a^3, \ldots, a^{n-1}, a^n\ (= e)\}$.
                                  First we observe that $|\{a, a^2, a^3, \ldots, a^{n-1}, a^n\ (= e)\}| = n$. Otherwise, we have $a^u =
                               a^v$ for positive integers $u, v$ where $1 \le u < v \le n$, and then $a^{v-u} = e$ with $0 < v - u <
                               n$. This, however, contradicts the minimality of $n$. So now we know that $|\langle a \rangle| \ge n$. But
                               for each $k \in \mathbf{Z}$, it follows from the division algorithm that $k = qn + r$, where $0 \le r < n$,
                               and so $a^k = a^{qn+r} = (a^n)^q (a^r) = (e^q)(a^r) = a^r \in \{a, a^2, a^3, \ldots, a^{n-1}, a^n\ (= e = a^0)\}$.
                               Therefore, $\langle a \rangle = \{a, a^2, a^3, \ldots, a^{n-1}, a^n\ (= e)\}$ and we can also define $o(a)$ as the smallest
                               positive integer $n$ for which $a^n = e$. This alternative definition for the order of a group
                               element (of finite order) proves to be of value in the following theorem.

THEOREM 16.6                   Let $a \in G$ with $o(a) = n$. If $k \in \mathbf{Z}$ and $a^k = e$, then $n \mid k$.
                               Proof: By the division algorithm (again), we have $k = qn + r$, for $0 \le r < n$, and so it
                               follows that $e = a^k = a^{qn+r} = (a^n)^q (a^r) = (e^q)(a^r) = a^r$. If $0 < r < n$, we contradict the
                               definition of $n$ as $o(a)$. Hence $r = 0$ and $k = qn$.
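
The alternative definition of order, and the divisibility statement of Theorem 16.6, can both be observed in $U_9$. In this sketch `order` is a hypothetical helper that assumes its argument is a unit modulo $n$ (otherwise the loop would not end), and the exponent range 1 to 60 is an arbitrary choice.

```python
def order(a, n):
    """Smallest positive k with a^k = 1 in U_n (a is assumed to be a unit mod n)."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

U9 = [1, 2, 4, 5, 7, 8]
orders = {a: order(a, 9) for a in U9}
# Exponents k in 1..60 with a^k = 1, for comparison with the multiples of the order of a.
exponents = {a: [k for k in range(1, 61) if pow(a, k, 9) == 1] for a in U9}
```

Every exponent recorded in `exponents[a]` turns out to be a multiple of `orders[a]`, as the theorem requires.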

We now examine some further results on cyclic groups. The next example helps us to
                               motivate part (b) of Theorem 16.7.

EXAMPLE 16.14   It is known from part (b) of Example 16.13 that U_9 = {1, 2, 4, 5, 7, 8} = ⟨2⟩. We use this fact to define the function f: U_9 → (Z_6, +) as follows:

f(1) = [0]              f(2) = [1]              f(4) = f(2^2) = [2]
f(8) = f(2^3) = [3]     f(7) = f(2^4) = [4]     f(5) = f(2^5) = [5].

So, in general, for each a ∈ U_9 we write a = 2^k, for some 0 ≤ k ≤ 5, and have f(a) = f(2^k) = [k]. This function f is one-to-one and onto and we find, for example, that f(2 · 5) = f(1) = [0] = [1] + [5] = f(2) + f(5), and f(7 · 8) = f(2) = [1] = [4] + [3] = f(7) + f(8).
In general, for a, b in U_9 we may write a = 2^m and b = 2^n, where 0 ≤ m ≤ 5 and 0 ≤ n ≤ 5. It then follows that

f(a · b) = f(2^m · 2^n) = f(2^{m+n}) = [m + n] = [m] + [n] = f(a) + f(b).

Consequently, the function f is an isomorphism and the groups U_9 and (Z_6, +) are isomorphic.
[Note how the function f links the generators of the two cyclic groups. Also note that the function g: U_9 → (Z_6, +) where

g(1) = [0]              g(5) = [1]              g(7) = g(5^2) = [2]
g(8) = g(5^3) = [3]     g(4) = g(5^4) = [4]     g(2) = g(5^5) = [5]

is another isomorphism between these two cyclic groups.]
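
As a sketch of how the two maps in Example 16.14 might be verified by machine, the Python fragment below builds the map gen^k ↦ [k] for a chosen generator and tests bijectivity and the homomorphism property by brute force. The names `log_map` and `is_isomorphism` are hypothetical helpers introduced here.

```python
MOD = 9
U9 = [1, 2, 4, 5, 7, 8]  # the units of Z_9

def log_map(gen):
    """Send gen**k (mod 9), 0 <= k <= 5, to the class [k] in Z_6."""
    return {pow(gen, k, MOD): k for k in range(6)}

def is_isomorphism(f):
    # f must be a bijection from U_9 onto {0, ..., 5} ...
    if sorted(f) != U9 or sorted(f.values()) != list(range(6)):
        return False
    # ... that turns multiplication mod 9 into addition mod 6.
    return all(f[(a * b) % MOD] == (f[a] + f[b]) % 6 for a in U9 for b in U9)

f = log_map(2)
g = log_map(5)
print(f, is_isomorphism(f), is_isomorphism(g))
```

Starting from an element that is not a generator, such as 4, the same construction fails to be a bijection, which is why the choice of generator matters.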

THEOREM 16.7      Let G be a cyclic group.

a) If |G| is infinite, then G is isomorphic to (Z, +).
b) If |G| = n, where n > 1, then G is isomorphic to (Z_n, +).

Proof:

a) For G = ⟨a⟩ = {a^k | k ∈ Z}, let f: G → Z be defined by f(a^k) = k. (Could we have a^k = a^t with k ≠ t? If so, f would not be a function.) For a^m, a^n ∈ G, f(a^m · a^n) = f(a^{m+n}) = m + n = f(a^m) + f(a^n), so f is a homomorphism. We leave to the reader the verification that f is one-to-one and onto.
b) If G = ⟨a⟩ = {a, a^2, ..., a^{n-1}, a^n = e}, then the function f: G → Z_n defined by f(a^k) = [k] is an isomorphism. (Verify this.)

EXAMPLE 16.15   If G = ⟨g⟩, G is abelian because g^m · g^n = g^{m+n} = g^{n+m} = g^n · g^m for all m, n ∈ Z. The converse, however, is false. The group H of Table 16.7 is abelian, and o(e) = 1, o(a) = o(b) = o(c) = 2. Since no element of H has order 4, H cannot be cyclic. (The group H is the smallest noncyclic group and is known as the Klein Four group.)

Table 16.7

 ·  |  e   a   b   c
----+----------------
 e  |  e   a   b   c
 a  |  a   e   c   b
 b  |  b   c   e   a
 c  |  c   b   a   e
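
The claims made about Table 16.7 can be confirmed mechanically. The sketch below encodes the table as a dictionary of rows (our own encoding, chosen only for brevity) and computes commutativity and the order of each element.

```python
elements = "eabc"
# Row x lists the products x·e, x·a, x·b, x·c from Table 16.7.
rows = {"e": "eabc", "a": "aecb", "b": "bcea", "c": "cbae"}

def mul(x, y):
    return rows[x][elements.index(y)]

def order(x):
    k, y = 1, x
    while y != "e":
        y = mul(y, x)
        k += 1
    return k

abelian = all(mul(x, y) == mul(y, x) for x in elements for y in elements)
orders = {x: order(x) for x in elements}
cyclic = 4 in orders.values()  # a group of order 4 is cyclic iff some element has order 4
print(abelian, orders, cyclic)
```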

Our last result concerns the structure of subgroups in a cyclic group.

THEOREM 16.8   Every subgroup of a cyclic group is cyclic.
Proof: Let G = ⟨a⟩. If H is a subgroup of G, each element of H has the form a^k, for some k ∈ Z. For H ≠ {e}, let t be the smallest positive integer such that a^t ∈ H. (How do we know such an integer t exists?) We claim that H = ⟨a^t⟩. Since a^t ∈ H, by the closure property for the subgroup H, ⟨a^t⟩ ⊆ H. For the opposite inclusion, let b ∈ H, with b = a^s, for some s ∈ Z. By the division algorithm, s = qt + r, where q, r ∈ Z and 0 ≤ r < t. Consequently, a^s = a^{qt+r} and so a^r = a^{-qt}a^s = (a^t)^{-q}b. H is a subgroup of G, so a^t ∈ H ⇒ (a^t)^{-q} ∈ H. Then with (a^t)^{-q}, b ∈ H, it follows that a^r = (a^t)^{-q}b ∈ H. But if a^r ∈ H with r > 0, then we contradict the minimality of t. Hence r = 0 and b = a^{qt} = (a^t)^q ∈ ⟨a^t⟩, so H = ⟨a^t⟩, a cyclic group.
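
To illustrate Theorem 16.8 on one concrete group, the following brute-force sketch lists every subgroup of (Z_12, +) and checks that each is generated by its smallest positive element, mirroring the choice of t in the proof. The choice n = 12 and the helper names are assumptions made for this illustration only.

```python
from itertools import combinations

n = 12

def is_subgroup(s):
    # In a finite group, a nonempty subset closed under the operation is a subgroup.
    return all((x + y) % n in s for x in s for y in s)

def generated(g):
    """The cyclic subgroup of (Z_n, +) generated by g."""
    return {(k * g) % n for k in range(n)}

subgroups = []
for size in range(1, n + 1):
    for rest in combinations(range(1, n), size - 1):
        s = {0, *rest}  # every subgroup contains the identity [0]
        if is_subgroup(s):
            subgroups.append(s)

# Each subgroup should equal the cyclic subgroup generated by its smallest
# positive element (the trivial subgroup {0} is generated by 0).
all_cyclic = all(s == generated(min(s - {0}, default=0)) for s in subgroups)
print(sorted(len(s) for s in subgroups), all_cyclic)
```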

EXERCISES 16.2

1. Prove parts (b) and (c) of Theorem 16.5.

2. Let A = [2 × 2 matrix; entries illegible in source].
   a) Determine A^2, A^3, and A^4.
   b) Verify that {A, A^2, A^3, A^4} is an abelian group under ordinary matrix multiplication.
   c) Prove that the group in part (b) is isomorphic to the group shown in Table 16.6.

3. If G = (Z_6, +), H = (Z_3, +), and K = (Z_2, +), find an isomorphism for the groups H × K and G.

4. Let f: G → H be a group homomorphism onto H. If G is abelian, prove that H is abelian.

5. Let (Z × Z, ⊕) be the abelian group where (a, b) ⊕ (c, d) = (a + c, b + d) (here a + c and b + d are computed using ordinary addition in Z) and let (G, +) be an additive group. If f: Z × Z → G is a group homomorphism where f(1, 3) = g_1 and f(3, 7) = g_2, express f(4, 6) in terms of g_1 and g_2.

6. Let f: (Z × Z, ⊕) → (Z, +) be the function defined by f(x, y) = x − y. [Here (Z × Z, ⊕) is the same group as in Exercise 5, and (Z, +) is the group of integers under ordinary addition.]
   a) Prove that f is a homomorphism onto Z.
   b) Determine all (a, b) ∈ Z × Z with f(a, b) = 0.
   c) Find f^{-1}(7).
   d) If E = {2n | n ∈ Z}, what is f^{-1}(E)?

7. Find the order of each element in the group of rigid motions of (a) the equilateral triangle; and (b) the square.

8. In S_5 find an element of order n, for all 2 ≤ n ≤ 5. Also determine the (cyclic) subgroup of S_5 that each of these elements generates.

9. a) Find all the elements of order 10 in (Z_40, +).
   b) Let G = ⟨a⟩ be a cyclic group of order 40. Which elements of G have order 10?

10. a) Determine U_14, the group of units for the ring (Z_14, +, ·).
    b) Show that U_14 is cyclic and find all of its generators.

11. Verify that (Z_p^*, ·) is cyclic for the primes 5, 7, and 11.

12. For a group G, prove that the function f: G → G defined by f(a) = a^{-1} is an isomorphism if and only if G is abelian.

13. If f: G → H, g: H → K are homomorphisms, prove that the composite function g ∘ f: G → K, where (g ∘ f)(x) = g(f(x)), is a homomorphism.

14. For ω = (1/√2)(1 + i), let G be the multiplicative group {ω^n | n ∈ Z^+, 1 ≤ n ≤ 8}.
    a) Show that G is cyclic and find each element x ∈ G such that ⟨x⟩ = G.
    b) Prove that G is isomorphic to the group (Z_8, +).

15. a) Find all generators of the cyclic groups (Z_12, +), (Z_16, +), and (Z_24, +).
    b) Let G = ⟨a⟩ with o(a) = n. Prove that a^k, k ∈ Z^+, generates G if and only if k and n are relatively prime.
    c) If G is a cyclic group of order n, how many distinct generators does it have?

16. Let f: G → H be a group homomorphism. If a ∈ G with o(a) = n, and o(f(a)) = k (in H), prove that k | n.

16.3 Cosets and Lagrange's Theorem
In the last two sections, for all finite groups G and subgroups H of G, we had |H| dividing |G|. In this section we'll see that this was not mere chance but is true in general. To prove this we need one new idea.

Definition 16.8   If H is a subgroup of G, then for each a ∈ G, the set aH = {ah | h ∈ H} is called a left coset of H in G. The set Ha = {ha | h ∈ H} is a right coset of H in G.

If the operation in G is addition, we write a + H in place of aH, where a + H = {a + h | h ∈ H}.
                           When the term coset is used in this chapter, it will refer to a left coset. For abelian groups
                       there is no need to distinguish between left and right cosets. However, at the end of the next
                       example we’ll see that this is not the case for nonabelian groups.

EXAMPLE 16.16   If G is the group of Example 16.7 and H = {π_0, π_1, π_2}, the coset r_1H = {r_1π_0, r_1π_1, r_1π_2} = {r_1, r_2, r_3}. Likewise we have r_2H = r_3H = {r_1, r_2, r_3}, whereas π_0H = π_1H = π_2H = H.
We see that |aH| = |H| for each a ∈ G and that G = H ∪ r_1H is a partition of G.
For the subgroup K = {π_0, r_1}, we find r_2K = {r_2, π_2} and r_3K = {r_3, π_1}. Again a partition of G arises: G = K ∪ r_2K ∪ r_3K. (Note: Kr_2 = {π_0r_2, r_1r_2} = {r_2, π_1} ≠ r_2K.)

EXAMPLE 16.17   For G = (Z_12, +) and H = {[0], [4], [8]}, we find that

[0] + H = {[0], [4], [8]} = [4] + H = [8] + H = H
[1] + H = {[1], [5], [9]} = [5] + H = [9] + H
[2] + H = {[2], [6], [10]} = [6] + H = [10] + H
[3] + H = {[3], [7], [11]} = [7] + H = [11] + H,

and H ∪ ([1] + H) ∪ ([2] + H) ∪ ([3] + H) is a partition of G.
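
The coset computation of Example 16.17 is easy to reproduce. In this minimal Python sketch the classes [0], ..., [11] are represented by the integers 0 to 11, and `coset` is a helper name of our own.

```python
n = 12
H = frozenset({0, 4, 8})

def coset(a):
    """The coset [a] + H in (Z_12, +)."""
    return frozenset((a + h) % n for h in H)

# Collecting the cosets in a set removes repetitions such as [1] + H = [5] + H.
cosets = {coset(a) for a in range(n)}
for c in sorted(cosets, key=min):
    print(sorted(c))
```

Twelve choices of a yield only four distinct cosets, each of size |H| = 3, and together they cover G.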

These examples now prepare us for the following results.

LEMMA 16.1   If H is a subgroup of the finite group G, then for all a, b ∈ G, (a) |aH| = |H|; and (b) either aH = bH or aH ∩ bH = ∅.
Proof:

a) Since aH = {ah | h ∈ H}, it follows that |aH| ≤ |H|. If |aH| < |H|, we have ah_i = ah_j with h_i, h_j distinct elements of H. By left-cancellation in G we then get the contradiction h_i = h_j, so |aH| = |H|.
b) If aH ∩ bH ≠ ∅, let c = ah_1 = bh_2, for some h_1, h_2 ∈ H. If x ∈ aH, then x = ah for some h ∈ H, and so x = (bh_2h_1^{-1})h = b(h_2h_1^{-1}h) ∈ bH, and aH ⊆ bH. Similarly, y ∈ bH ⇒ y = bh_3, for some h_3 ∈ H ⇒ y = (ah_1h_2^{-1})h_3 = a(h_1h_2^{-1}h_3) ∈ aH, so bH ⊆ aH. Therefore aH and bH are either disjoint or identical.

We observe that if g ∈ G, then g ∈ gH because e ∈ H. Also, by part (b) of Lemma 16.1, G can be partitioned into mutually disjoint cosets.

At this point we are ready to prove the main result of this section.

THEOREM 16.9   Lagrange's Theorem. If G is a finite group of order n with H a subgroup of order m, then m divides n.
Proof: If H = G the result follows. Otherwise m < n and there exists an element a ∈ G − H. Since a ∉ H, it follows that aH ≠ H, so aH ∩ H = ∅. If G = aH ∪ H, then |G| = |aH| + |H| = 2|H| and the theorem follows. If not, there is an element b ∈ G − (H ∪ aH), with bH ∩ H = ∅ = bH ∩ aH and |bH| = |H|. If G = bH ∪ aH ∪ H, we have |G| = 3|H|. Otherwise we're back to an element c ∈ G with c ∉ bH ∪ aH ∪ H. The group G is finite, so this process terminates and we find that G = a_1H ∪ a_2H ∪ ··· ∪ a_kH. Therefore, |G| = k|H| and m divides n.
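
The counting argument in the proof can be watched in action on a small nonabelian group. The sketch below takes G = S_3, written as tuples of images (an encoding assumed here for convenience), forms the cyclic subgroup generated by each element, and confirms that its left cosets split G into k blocks of size |H|.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3: p[i] is the image of i under p

def compose(p, q):
    """(p ∘ q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def generated(a):
    """The cyclic subgroup generated by a."""
    identity = tuple(range(len(a)))
    H, x = {identity}, a
    while x != identity:
        H.add(x)
        x = compose(x, a)
    return frozenset(H)

def left_cosets(H):
    return {frozenset(compose(a, h) for h in H) for a in G}

for a in G:
    H = generated(a)
    cosets = left_cosets(H)
    # The distinct cosets cover G and their sizes add up to |G|, so they
    # partition G into len(cosets) blocks of size |H|.
    assert sum(len(c) for c in cosets) == len(G)
    assert len(G) == len(cosets) * len(H)
    print(a, len(H), len(cosets))
```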

An alternative method for proving this theorem is given in Exercise 12 for this section.
                                            We close with the statements of two corollaries. Their proofs are requested in the Section
                                         Exercises.

COROLLARY 16.1   If G is a finite group and a ∈ G, then o(a) divides |G|.

COROLLARY 16.2                           Every group of prime order is cyclic.

EXERCISES 16.3

1. Let G = S_4. (a) For α = [permutation illegible in source], find the subgroup H = ⟨α⟩. (b) Determine the left cosets of H in G.

2. Answer Exercise 1 for the case where α is replaced by β = [permutation illegible in source].

3. If γ = (1 2 3 4 / 2 1 4 3) ∈ S_4 (top row mapped to bottom row), how many cosets does ⟨γ⟩ determine?

4. For G = (Z_24, +), find the cosets determined by the subgroup H = ⟨[3]⟩. Do likewise for the subgroup K = ⟨[4]⟩.

5. Let G be a group with subgroups H and K. If |G| = 660, |K| = 66, and K ⊂ H ⊂ G, what are the possible values for |H|?

6. Let R be a ring with unity u. Prove that the units of R form a group under the multiplication of the ring.

7. Let G = S_4, the symmetric group on four symbols, and let H be the subset of G where

   H = { (1 2 3 4 / 1 2 3 4), (1 2 3 4 / 2 1 4 3), (1 2 3 4 / 3 4 1 2), (1 2 3 4 / 4 3 2 1) }.

   a) Construct a table to show that H is an abelian subgroup of G.
   b) How many left cosets of H are there in G?
   c) Consider the group (Z_2 × Z_2, ⊕) where (a, b) ⊕ (c, d) = (a + c, b + d), and the sums a + c, b + d are computed using addition modulo 2. Prove that H is isomorphic to this group.

8. If G is a group of order n and a ∈ G, prove that a^n = e.

9. Let p be a prime. (a) If G has order 2p, prove that every proper subgroup of G is cyclic. (b) If G has order p^2, prove that G has a subgroup of order p.

10. Prove Corollaries 16.1 and 16.2.

11. Let H and K be subgroups of a group G, where e is the identity of G.
    a) Prove that if |H| = 10 and |K| = 21, then H ∩ K = {e}.
    b) If |H| = m and |K| = n, with gcd(m, n) = 1, prove that H ∩ K = {e}.

12. The following provides an alternative way to establish Lagrange's Theorem. Let G be a group of order n, and let H be a subgroup of G of order m.
    a) Define the relation R on G as follows: If a, b ∈ G, then a R b if a^{-1}b ∈ H. Prove that R is an equivalence relation on G.
    b) For a, b ∈ G, prove that a R b if and only if aH = bH.
    c) If a ∈ G, prove that [a], the equivalence class of a under R, satisfies [a] = aH.
    d) For each a ∈ G, prove that |aH| = |H|.
    e) Now establish the conclusion of Lagrange's Theorem, namely that |H| divides |G|.

13. a) Fermat's Theorem. If p is a prime, prove that a^p ≡ a (mod p) for each a ∈ Z. [How is this related to Exercise 22(a) of Section 14.3?]
    b) Euler's Theorem. For each n ∈ Z^+, n > 1, and each a ∈ Z, prove that if gcd(a, n) = 1, then a^{φ(n)} ≡ 1 (mod n).
    c) How are the theorems in parts (a) and (b) related?
    d) Is there any connection between these two theorems and the results in Exercises 6 and 8?

16.4 The RSA Cryptosystem (Optional)
This section provides us with an opportunity to use some of the theoretical ideas we encountered in Sections 14.3 and 16.3 in a more contemporary application.
In Example 14.15 of Section 14.3 we introduced two private-key cryptosystems: the cipher shift and the affine cipher. For an alphabet of m characters, the encryption function E: Z_m → Z_m, for the cipher-shift cryptosystem, is given by E(θ) = (θ + κ) mod m, where θ, κ ∈ Z_m, for κ (≠ 0) fixed. (Using κ = 0 would not alter any of the characters in a message.) Consequently, there are m − 1 possibilities to examine in an attempt to discover the value of the key κ. Further, once we know the value of κ, we also know the decryption function D: Z_m → Z_m, for D(θ) = (θ − κ) mod m. In the case of the affine-cipher cryptosystem (also with an alphabet of m characters) the encryption function E: Z_m → Z_m is now given by E(θ) = (αθ + κ) mod m, where θ, α, κ ∈ Z_m, for fixed α, κ, with α invertible in Z_m [or, equivalently, with gcd(α, m) = 1]. Here the decryption function D: Z_m → Z_m is given by D(θ) = [α^{-1}(θ − κ)] mod m. Without prior knowledge of the key (α, κ), now one would have to check mφ(m) possibilities to discover the appropriate values of α and κ for this private-key cryptosystem.
The security of either of the above cryptosystems depends on having the key [be it κ or (α, κ)] known only to the sender and the recipient of the messages.
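
The two private-key systems just recalled fit in a few lines of Python. This is only a sketch under stated assumptions: the alphabet is A to Z with A = 0, ..., Z = 25, the key (α, κ) = (5, 8) is an arbitrary illustration, and the function names are ours. The shift cipher is the special case α = 1.

```python
from math import gcd

M = 26  # alphabet size; letters A..Z are assigned 0..25 (assumed here)

def affine_encrypt(text, alpha, kappa):
    """E(θ) = (αθ + κ) mod m, applied letter by letter."""
    if gcd(alpha, M) != 1:
        raise ValueError("alpha must be invertible in Z_m")
    return "".join(chr((alpha * (ord(ch) - 65) + kappa) % M + 65) for ch in text)

def affine_decrypt(text, alpha, kappa):
    """D(θ) = [α^{-1}(θ − κ)] mod m."""
    alpha_inv = pow(alpha, -1, M)  # modular inverse; needs Python 3.8 or later
    return "".join(chr((alpha_inv * (ord(ch) - 65 - kappa)) % M + 65) for ch in text)

cipher = affine_encrypt("INVEST", 5, 8)
print(cipher, affine_decrypt(cipher, 5, 8))

# Number of candidate keys (α, κ): m · φ(m)
key_count = M * sum(1 for a in range(M) if gcd(a, M) == 1)
print(key_count)
```

With m = 26 there are only 312 candidate keys, so an exhaustive search is trivial; this is the weakness the public-key system below is meant to avoid.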

The RSA cryptosystem is an example of a public-key cryptosystem. This cryptosystem was developed in the 1970s (and patented in 1983) by Ronald Rivest (1947– ), Adi Shamir (1952– ), and Leonard Adleman (1945– ). (Taking the first letter from the surname of each of these three men provides the adjective RSA.)
                                We shall describe how this cryptosystem works and provide an example for encryption
                             and decryption. In so doing, we shall find ourselves using some of the results from Sections
                             14.3 and 16.3.

EXAMPLE 16.18   As with the two private-key cryptosystems, once again we have an alphabet of m characters. We start with two distinct primes p, q. In practice, these should be large primes, each with 100 or more digits. (However, for our example we shall use much smaller primes.) After selecting the primes p, q, we then consider the integers n = pq and r = (p − 1)(q − 1) = φ(p)φ(q) = φ(pq) = φ(n), and, at this point, we choose an invertible element e in Z_r (= Z_{φ(n)}).

[Here, if the element e is chosen at random, then the only time we fail to obtain an invertible element is when the element chosen is a multiple of p (there are q possibilities) or a multiple of q (there are p possibilities). In this count of p + q elements we have accounted for pq twice, so there are only p + q − 1 possibilities for failure. Hence, the probability for failure is (p + q − 1)/(pq) = (1/q) + (1/p) − (1/(pq)), a very small number if p and q each have 100 or more digits.]
For instance, consider p = 61, q = 127, with n = (61)(127) = 7747 and r = φ(61) · φ(127) = (60)(126) = 7560. Now suppose we select e as 17.
                          Consider the following message that we wish to encrypt.

INVEST IN BONDS

Using the same plaintext assignments as in part (b) of Example 14.15, here we would replace the letter "I" by 08 (not merely 8). Then we replace "N" by 13. This provides us with the first block of four digits, namely 0813, for the first two letters "IN". The assignment for the complete message is as follows [where we have appended the letter "X" to the right end, in order for the final block to have two letters (or, four digits)]:

 I   N   V   E   S   T   I   N   B   O   N   D   S   X
08  13  21  04  18  19  08  13  01  14  13  03  18  23

We now encrypt each block B of four digits by the encryption function E, where E(B) = B^e mod n. (This modular exponentiation can be carried out efficiently by using the procedure in Example 14.16.) So here the domain of E is the concatenation of Z_26 with itself, and we find that

0813^17 mod 7747 = 2169     2104^17 mod 7747 = 0628     1819^17 mod 7747 = 5540
0813^17 mod 7747 = 2169     0114^17 mod 7747 = 6560     1303^17 mod 7747 = 6401
1823^17 mod 7747 = 4829.
                       Consequently, the recipient of the encrypted assignment (for the given plaintext message)
                       receives the ciphertext

2169    0628         5540    2169         6560      6401         4829.

Now the question is: "How does the recipient decrypt the ciphertext received?"
Since e is a unit in Z_r (= Z_{φ(n)}), we can use the Euclidean algorithm (as in Example 14.13) to compute e^{-1} = d. Then we define the decryption function D, where D(C) = C^d mod n, for a block C of four digits. Since e^{-1} = d, it follows that ed ≡ 1 (mod φ(n)), that is, ed mod φ(n) = 1. Therefore, ed = kφ(n) + 1, for some k ∈ Z. Now recall the argument given earlier for the probability that a randomly selected element e from Z_n is invertible (or a unit in Z_n). For any block B of four digits, we consider B as an element of Z_n; in fact, we consider B as a unit in Z_n. Since the units in the ring (Z_n, +, ·) form a group of order φ(n) under multiplication, it follows from the result in Exercise 8 of Section 16.3 that B^{ed} = B^{kφ(n)+1} = (B^{φ(n)})^k B^1 ≡ B (mod n), or B^{ed} mod n = B. [This is also a consequence of Euler's Theorem, as stated in part (b) of Exercise 13 in Section 16.3.]
Applying the result from the previous paragraph in our example we have p = 61, q = 127, n = pq = 7747, r = φ(n) = (p − 1)(q − 1) = (60)(126) = 7560, and e = 17. From the Euclidean algorithm we calculate d = e^{-1} = 3113. Now we find, for instance, that 2169^3113 mod 7747 = 0813 and that 0628^3113 mod 7747 = 2104. Continuing, the recipient determines the numeric assignment for the original plaintext and then the plaintext.
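
The whole of Example 16.18 can be replayed with Python's built-in modular arithmetic. The sketch below is illustrative rather than a production implementation: `to_blocks` and `from_blocks` are hypothetical helpers that assume the assignment A = 00, ..., Z = 25 with two letters per block and "X" as padding, as in the example.

```python
p, q, e = 61, 127, 17
n = p * q                # 7747
r = (p - 1) * (q - 1)    # φ(n) = 7560
d = pow(e, -1, r)        # inverse of e in Z_r; needs Python 3.8 or later

def to_blocks(message):
    """A=00, ..., Z=25; two letters per four-digit block, padded with X."""
    letters = [ch for ch in message.upper() if ch.isalpha()]
    if len(letters) % 2:
        letters.append("X")
    codes = [ord(ch) - ord("A") for ch in letters]
    return [100 * codes[i] + codes[i + 1] for i in range(0, len(codes), 2)]

def from_blocks(blocks):
    return "".join(chr(b // 100 + 65) + chr(b % 100 + 65) for b in blocks)

plain = to_blocks("INVEST IN BONDS")
cipher = [pow(b, e, n) for b in plain]      # E(B) = B^e mod n
recovered = [pow(c, d, n) for c in cipher]  # D(C) = C^d mod n
print(d, [f"{c:04d}" for c in cipher], from_blocks(recovered))
```

The three-argument form of `pow` performs the modular exponentiation efficiently, which plays the role of the procedure cited from Example 14.16.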

Now what makes the RSA cryptosystem more secure than the private-key cryptosystems we studied? First, we should relate that the RSA cryptosystem is not a private-key cryptosystem. This system is an example of a public-key cryptosystem, where the key (n, e) is made public. So it seems that all one needs to do to decrypt the encrypted assignment is to determine d = e^{-1} in Z_r (= Z_{φ(n)}). Now it is time to realize that by knowing n we do not immediately know r. For to be able to determine r = (p − 1)(q − 1), we need to know p, q, the prime factors of n. And this is what makes this system so much more secure than the other cryptosystems we mentioned. Determining the primes p, q, when they are 100 or more digits long, is not a feasible problem. However, as computer power continues to improve, to keep the RSA cryptosystem secure, one may need to redefine the key using primes with more and more digits.
In closing, we show how the problem of factoring the modulus n as pq is related to the problem of determining r = (p − 1)(q − 1). We start by observing that

p + q = pq − (p − 1)(q − 1) + 1 = n − φ(n) + 1 = n − r + 1,

while

p − q = √((p − q)^2) = √((p − q)^2 + 4pq − 4pq) = √((p + q)^2 − 4pq)
      = √((p + q)^2 − 4n) = √((n − r + 1)^2 − 4n).

Then, from these two equations, we learn that

p = (1/2)[(p + q) + (p − q)] = (1/2)[(n − r + 1) + √((n − r + 1)^2 − 4n)]

and

q = (1/2)[(p + q) − (p − q)] = (1/2)[(n − r + 1) − √((n − r + 1)^2 − 4n)].

Consequently, when we know n and r, then we can readily determine the primes p, q such that n = pq.
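
The two closing formulas translate directly into integer arithmetic. In the sketch below, `factor_from_phi` is a name of our own; it takes the square root with a positive sign and therefore returns the larger prime first, whichever of the two was called p originally.

```python
from math import isqrt

def factor_from_phi(n, r):
    """Recover primes p >= q with n = p*q from n and r = (p - 1)(q - 1)."""
    s = n - r + 1         # p + q
    disc = s * s - 4 * n  # (p - q)^2
    t = isqrt(disc)       # p - q
    if t * t != disc:
        raise ValueError("n and r are not consistent with n = p*q")
    return (s + t) // 2, (s - t) // 2

print(factor_from_phi(7747, 7560))
```

For the modulus of Example 16.18 this returns the primes 127 and 61, so knowing r is as good as knowing the factorization of n.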

EXERCISES 16.4

The use of a computer algebra system is strongly recommended for the first four exercises.

1. Determine the ciphertext for the plaintext INVEST IN STOCKS, when using RSA encryption with e = 7 and n = 2573.

2. Determine the ciphertext for the plaintext ORDER A PIZZA, when using RSA encryption with e = 5 and n = 1459.

3. Determine the plaintext for the RSA ciphertext 1418 1436 2370 1102 1805 0250, if e = 11 and n = 2501.

4. Determine the plaintext for the RSA ciphertext 0986 3029 1134 1105 1232 2281 2967 0272 1818 2398 1153, if e = 17 and n = 3053.

5. Find the primes p, q if n = pq = 121,361 and φ(n) = 120,432.

6. Find the primes p, q if n = pq = 5,446,367 and φ(n) = 5,441,640.

16.5 Elements of Coding Theory
                            In this and the next four sections we introduce an area of applied mathematics called
                            algebraic coding theory. This theory was inspired by the fundamental paper of Claude
                            Shannon (1948) along with results by Marcel Golay (1949) and Richard Hamming (1950).
                            Since that time it has become an area of great interest where algebraic structures, probability,
                            and combinatorics all play a role.
                               Our coverage will be held to an introductory level as we seek to model the transmission
                            of information represented by strings of the signals 0 and 1.
                               In digital communications, when information is transmitted in the form of strings of 0’s
                            and 1’s, certain problems arise. As a result of “noise” in the channel, when a certain signal
                            is transmitted a different signal may be received, thus causing the receiver to make a wrong
decision. Hence we want to develop techniques to help us detect, and perhaps even correct,
                             transmission errors. However, we can only improve the chances of correct transmission;
                             there are no guarantees.
                                 Our model uses a binary symmetric channel, as shown in Fig. 16.2. The adjective binary
                             appears because an individual signal is represented by one of the bits 0 or 1. When a
                             transmitter sends the signal 0 or 1 in such a channel, associated with either signal is a
                             (constant) probability p for incorrect transmission. When that probability p is the same for
                             both signals, the channel is called symmetric. Here, for example, we have probability p of
                             sending 0 and having 1 received. The probability of sending signal 0 and having it received
                             correctly is then 1 — p. All possibilities are illustrated in Fig. 16.2.

[Figure 16.2: The Binary Symmetric Channel. Each transmitted signal (0 or 1) is received correctly with probability 1 − p and incorrectly with probability p.]

Consider the string c = 10110. We regard c as an element of the group Z>, formed from
      EXAMPLE 16.19
                             the direct product of five copies of (Z2, +). To shorten notation we write 10110 instead of
                             (1, 0, 1, 1, 0). When sending each bit (individual signal) of c through the binary symmetric
                             channel, we assume that the probability of incorrect transmission is p = 0.05, so that the
                             probability of transmitting c with no errors is (0.95)° = 0.77.
                                 Here, and throughout our discussion of coding theory, we assume that the transmission of
                             each signal does not depend in any way on the transmissions of prior signals. Consequently,
                             the probability of the occurrence of all of these independent events (in their prescribed
                             order) is given by the product of their individual probabilities.
   What is the probability that the party receiving the five-bit message receives the string r = 00110, that is, the original message with an error in the first position? The probability of incorrect transmission for the first bit is 0.05, so with the assumption of independent events, $(0.05)(0.95)^4 \approx 0.041$ is the probability of sending c = 10110 and receiving r = 00110. With e = 10000, we can write c + e = r and interpret r as the result of the sum of the original message c and the particular error pattern e = 10000. Since $c, r, e \in Z_2^5$ and $-1 = 1$ in $Z_2$, we also have c + r = e and r + e = c.
   In transmitting c = 10110, the probability of receiving r = 00100 is

\[ (0.05)(0.95)^2(0.05)(0.95) \approx 0.002, \]

so this multiple error is not very likely to occur.
   Finally, if we transmit c = 10110, what is the probability that r differs from c in exactly two places? To answer this we sum the probabilities for each error pattern consisting of two 1's and three 0's. Each such pattern has probability (approximately) 0.002. There are $\binom{5}{2} = 10$ such patterns, so the probability of two errors in transmission is given by

\[ \binom{5}{2}(0.05)^2(0.95)^3 \approx 0.021. \]

These results lead us to the following theorem.

THEOREM 16.10   Let $c \in Z_2^n$. For the transmission of c through a binary symmetric channel with probability p of incorrect transmission,

   a) the probability of receiving r = c + e, where e is a particular error pattern consisting of k 1's and (n - k) 0's, is $p^k(1 - p)^{n-k}$.
   b) the probability that (exactly) k errors are made in the transmission is $\binom{n}{k} p^k (1 - p)^{n-k}$.

In Example 16.19, the probability of making at most one error in the transmission of c = 10110 is $(0.95)^5 + \binom{5}{1}(0.05)(0.95)^4 \approx 0.977$. Thus the chance for multiple errors in transmission will be considered negligible throughout the discussion in this chapter. Such an assumption is valid when p is small. In actuality, a binary symmetric channel is considered "good" when $p < 10^{-5}$. However, no matter what else we stipulate, we always want p < 1/2.
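
The figures in Example 16.19 and the two formulas of Theorem 16.10 can be checked numerically. The following is a minimal sketch, assuming bit strings are held as Python strings of '0' and '1'; the function names are illustrative and do not come from the text.

```python
from math import comb

def pattern_probability(n: int, k: int, p: float) -> float:
    """Probability of one particular error pattern with k 1's among n bits."""
    return p**k * (1 - p)**(n - k)

def exactly_k_errors(n: int, k: int, p: float) -> float:
    """Probability that exactly k of the n bits are received incorrectly."""
    return comb(n, k) * pattern_probability(n, k, p)

def add_mod2(x: str, y: str) -> str:
    """Componentwise sum in Z_2^n of two bit strings of equal length."""
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(x, y))

n, p = 5, 0.05
no_errors = exactly_k_errors(n, 0, p)           # (0.95)^5
first_bit_only = pattern_probability(n, 1, p)   # pattern e = 10000
two_errors = exactly_k_errors(n, 2, p)          # any two positions
at_most_one = no_errors + exactly_k_errors(n, 1, p)

print(add_mod2("10110", "10000"))  # c + e = r, prints 00110
print(no_errors, first_bit_only, two_errors, at_most_one)
```

Rounded to three places, the printed values should agree with 0.77, 0.041, 0.021, and 0.977 as quoted above.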
                    To improve the accuracy of transmission in a binary symmetric channel, certain types
                of coding schemes can be used where extra bits are provided.
   For $m, n \in Z^+$, let n > m. Consider $\emptyset \ne W \subseteq Z_2^m$. The set W consists of the messages to be transmitted. To each $w \in W$ are appended n - m extra bits to form the code word c, where $c \in Z_2^n$. This process is called encoding and is represented by the function $E: W \to Z_2^n$. Then E(w) = c and $E(W) = C \subseteq Z_2^n$. Since the function E simply appends extra bits to the (distinct) messages, the encoding process is one-to-one. Upon transmission, c is received as T(c), where $T(c) \in Z_2^n$. Unfortunately, T is not a function because T(c) may be different at different transmission times (for the noise in the channel changes with time). (See Fig. 16.3.)

[Figure 16.3: Message w (an element of $Z_2^m$) → E → associated code word c = E(w) (an element of $Z_2^n$) → T (binary symmetric channel) → the received word T(c) (an element of $Z_2^n$) → D → the decoded result (an element of $Z_2^m$).]

Upon receiving T(c), we want to apply a decoding function $D: Z_2^n \to Z_2^m$ to remove the extra bits and, we hope, obtain the original message w. Ideally $D \circ T \circ E$ should be the identity function on W, with $D: C \to W$. Since this cannot be expected, we seek functions E and D such that there is a high probability of correctly decoding the received word T(c) and recapturing the original message w. In addition, we want the ratio m/n to be as large as possible so that an excessive number of bits are not appended to w in getting the code word c = E(w). This ratio m/n measures the efficiency of our scheme and is called the rate of the code. Finally, the functions E and D should be more than theoretical results; they must be practical in the sense that they can be implemented electronically.
   In such a scheme, the functions E and D are called the encoding and decoding functions, respectively, of an (n, m) block code.

[Footnote: The formula in Theorem 16.10(b) is the binomial probability distribution that was developed in (optional) Sections 3.5 and 3.7.]
                               We illustrate these ideas in the following two examples.

EXAMPLE 16.20   Consider the (m + 1, m) block code for m = 8. Let $W = Z_2^8$. For each $w = w_1 w_2 \cdots w_8 \in W$, define $E: Z_2^8 \to Z_2^9$ by $E(w) = w_1 w_2 \cdots w_8 w_9$, where $w_9 = \sum_{i=1}^{8} w_i$, with the addition performed modulo 2. For example, E(11001101) = 110011011, and E(00110011) = 001100110.
   For all $w \in Z_2^8$, E(w) contains an even number of 1's. So for w = 11010110 and E(w) = 110101101, if we receive T(c) = T(E(w)) as 100101101, from the odd number of 1's in T(c) we know that a mistake has occurred in transmission. Hence we are able to detect single errors in transmission. But we seem to have no way to correct such errors.
   The probability of sending the code word 110101101 and making at most one error in transmission is

\[ (1 - p)^9 + \binom{9}{1} p (1 - p)^8, \]

where the first term is the probability that all nine bits are correctly transmitted and the second is the probability that one bit is changed in transmission and an error is detected. For p = 0.001 this gives $(0.999)^9 + \binom{9}{1}(0.001)(0.999)^8 = 0.99996417$.
   If we detect an error and we are able to relay a signal back to the transmitter to repeat the transmission of the code word, and continue this process until the received word has an even number of 1's, then the probability of sending the code word 110101101 and receiving the correct transmission is approximately 0.99996393.*
   Should an even positive number of errors occur in transmission, T(c) is unfortunately accepted as the correct code word and we interpret its first eight components as the original message. This scheme is called the (m + 1, m) parity-check code and is appropriate only when multiple errors are not likely to occur.
   If we send the message 11010110 through the channel, we have probability $(0.999)^8 = 0.99202794$ of correct transmission. By using this parity-check code, we increase our chances of getting the correct message to (approximately) 0.99996393. However, an extra signal is sent (and perhaps additional transmissions are needed) and the rate of the code has decreased from 1 to 8/9.
   But suppose that instead of sending eight bits we sent 160 bits, in successive strings of length 8. The chances of receiving the correct message without any coding scheme would be $(0.999)^{160} = 0.85207557$. With the parity-check method we send 180 bits, but the chances for correct transmission now increase to $(0.999964)^{20} = 0.99928025$.

*For p = 0.001 the probability that an odd number of errors occurs in the transmission of the code word 110101101 is

\[ P_{odd} = \binom{9}{1}(0.999)^8(0.001) + \binom{9}{3}(0.999)^6(0.001)^3 + \binom{9}{5}(0.999)^4(0.001)^5 + \binom{9}{7}(0.999)^2(0.001)^7 + \binom{9}{9}(0.001)^9 \]
\[ = 0.008928251 + 0.000000083 + 0.000000000 + 0.000000000 + 0.000000000 = 0.008928334. \]

With q = the probability of the correct transmission of 110101101 = $(0.999)^9$, the probability that this code word is transmitted and correctly received under these conditions (of retransmission) is then given by

\[ q + P_{odd}\,q + (P_{odd})^2 q + (P_{odd})^3 q + \cdots = q/(1 - P_{odd}) = 0.99996393 \text{ (to eight decimal places).} \]
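
The (9, 8) parity-check code of Example 16.20 is small enough to try out directly. The sketch below assumes bit strings are Python strings; `parity_encode` and `parity_ok` are hypothetical helper names, and the retransmission probability is computed from the geometric series q/(1 - P_odd) described in the footnote to this example.

```python
from math import comb

def parity_encode(w: str) -> str:
    """Append one bit so that the code word has an even number of 1's."""
    return w + str(w.count("1") % 2)

def parity_ok(r: str) -> bool:
    """True when the received word has an even number of 1's (no error detected)."""
    return r.count("1") % 2 == 0

p, n = 0.001, 9
q = (1 - p)**n  # all nine bits transmitted correctly
# Probability of an odd number of errors (these are the detected cases).
p_odd = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(1, n + 1, 2))
with_retransmission = q / (1 - p_odd)  # q + p_odd*q + p_odd^2*q + ...

print(parity_encode("11001101"))  # 110011011
print(parity_ok("100101101"))     # False: an error is detected
print(with_retransmission)
```

An even number of errors passes the `parity_ok` test unnoticed, which is why the scheme is only appropriate when multiple errors are unlikely.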

EXAMPLE 16.21   The (3m, m) triple repetition code is one where we can both detect and correct single errors in transmission. With m = 8 and $W = Z_2^8$, we define $E: Z_2^8 \to Z_2^{24}$ by

\[ E(w_1 w_2 \cdots w_7 w_8) = w_1 w_2 \cdots w_8\, w_1 w_2 \cdots w_8\, w_1 w_2 \cdots w_8. \]

Hence if w = 10110111, then c = E(w) = 101101111011011110110111.
   The decoding function $D: Z_2^{24} \to Z_2^8$ is carried out by the majority rule. For example, if T(c) = 101001110011011110110110, then we have three errors occurring in positions 4, 9, and 24. We decode T(c) by examining the first, ninth, and seventeenth positions to see which signal appears more times. Here it is 1 (which occurs twice), so we decode the first entry in the decoded message as 1. Continuing with the entries in the second, tenth, and eighteenth positions, the result for the second entry of the decoded message is 0 (which occurs all three times). As we proceed, we recapture the correct message, 10110111.
   Although we have more than one transmission error here, all is well unless two (or more) errors occur with the second error eight or sixteen spaces after the first; that is, if two (or more) incorrect transmissions occur for the same bit of the original message.
   Now how does this scheme compare with the other methods we have? With p = 0.001, the probability of correctly decoding a single bit is $(0.999)^3 + \binom{3}{1}(0.001)(0.999)^2 = 0.99999700$. So the probability of receiving and correctly decoding the eight-bit message is $(0.99999700)^8 = 0.99997600$, just slightly better than the result from the parity-check method (where we may have to retransmit, thus increasing the overall transmission time). Here we transmit 24 signals for this message, so our rate is now 1/3. For this increased accuracy and the ability to detect and now correct single errors (which we could not do in any previous schemes), we may pay with an increase in transmission time. But we do not waste time with retransmissions.
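
The majority rule of Example 16.21 can be sketched as follows. This is an illustration only: it assumes the received word is a Python string made of equal-length copies of the message, and the names `repeat_encode` and `majority_decode` are not from the text.

```python
def repeat_encode(w: str, times: int = 3) -> str:
    """Code word made of `times` consecutive copies of the message w."""
    return w * times

def majority_decode(r: str, m: int) -> str:
    """Decode by majority vote over the copies of each of the m message bits."""
    copies = [r[i:i + m] for i in range(0, len(r), m)]
    decoded = []
    for j in range(m):
        ones = sum(block[j] == "1" for block in copies)
        decoded.append("1" if 2 * ones > len(copies) else "0")
    return "".join(decoded)

c = repeat_encode("10110111")
r = "101001110011011110110110"  # the received word T(c) of the example
error_positions = [i + 1 for i, (a, b) in enumerate(zip(c, r)) if a != b]

print(error_positions)        # [4, 9, 24]
print(majority_decode(r, 8))  # 10110111

# Probability of correct decoding with p = 0.001, computed without
# intermediate rounding, so the last digits may differ slightly from the text.
p = 0.001
bit_ok = (1 - p)**3 + 3 * p * (1 - p)**2
message_ok = bit_ok**8
print(bit_ok, message_ok)
```

Because the vote is taken per message bit, the three errors above are harmless: no two of them fall on copies of the same bit.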

EXERCISES 16.5

1. Let C be a set of code words, where $C \subseteq Z_2^7$. In each of the following, two of e (error pattern), r (received word) and c (code word) are given, with r = c + e. Determine the third term.
   a) c = 1010110, r = 1011111
   b) c = 1010110, e = 0101101
   c) e = 0101111, r = 0000111

2. A binary symmetric channel has probability p = 0.05 of incorrect transmission. If the code word c = 011011101 is transmitted, what is the probability that (a) we receive r = 011111101? (b) we receive r = 111011100? (c) a single error occurs? (d) a double error occurs? (e) a triple error occurs? (f) three errors occur, no two of them consecutive?

3. Let $E: Z_2^3 \to Z_2^9$ be the encoding function for the (9, 3) triple repetition code.
   a) If $D: Z_2^9 \to Z_2^3$ is the corresponding decoding function, apply D to decode the received words (i) 111101100; (ii) 000100011; (iii) 010011111.
   b) Find three different received words r for which D(r) = 000.
   c) For each $w \in Z_2^3$, what is $|D^{-1}(w)|$?

4. The (5m, m) five-times repetition code has encoding function $E: Z_2^m \to Z_2^{5m}$, where E(w) = wwwww. Decoding with $D: Z_2^{5m} \to Z_2^m$ is accomplished by the majority rule. (Here we are able to correct single and double errors made in transmission.)
   a) With p = 0.05, what is the probability for the transmission and correct decoding of the signal 0?
   b) Answer part (a) for the message 110 in place of the signal 0.
   c) For m = 2, decode the received word r = 0111001001.
   d) If m = 2, find three received words r where D(r) = 00.
   e) For m = 2 and $D: Z_2^{10} \to Z_2^2$, what is $|D^{-1}(w)|$ for each $w \in Z_2^2$?

16.6
The Hamming Metric
In this section we develop the general principles for discussing the error-detecting and error-correcting capabilities of a coding scheme. These ideas were developed by Richard Wesley Hamming (1915-1998).
   We start by considering a code $C \subseteq Z_2^4$, where $c_1 = 0111$, $c_2 = 1111 \in C$. Now both the transmitter and the receiver know the elements of C. So if the transmitter sends $c_1$ but the person receiving the code word receives $T(c_1)$ as 1111, then he or she feels that $c_2$ was transmitted and makes whatever decision (a wrong one) $c_2$ implies. Consequently, although only one transmission error was made, the results could be unpleasant. Why is this? Unfortunately we have two code words that are almost the same. They are rather close to each other, for they differ in only one component.
   We describe this notion of closeness more precisely as follows.

Definition 16.9   For each element $x = x_1 x_2 \cdots x_n \in Z_2^n$, where $n \in Z^+$, the weight of x, denoted wt(x), is the number of components $x_i$ of x, for $1 \le i \le n$, where $x_i = 1$. If $y \in Z_2^n$, the distance between x and y, denoted d(x, y), is the number of components where $x_i \ne y_i$, for $1 \le i \le n$.

EXAMPLE 16.22   For n = 5, let x = 01001 and y = 11101. Then wt(x) = 2, wt(y) = 4, and d(x, y) = 2. In addition, x + y = 10100, so wt(x + y) = 2. Is it just by chance that d(x, y) = wt(x + y)? For each $1 \le i \le 5$, $x_i + y_i$ contributes a count of 1 to wt(x + y) $\iff x_i \ne y_i \iff x_i, y_i$ contribute a count of 1 to d(x, y). [This is actually true for all $n \in Z^+$, so wt(x + y) = d(x, y) for all $x, y \in Z_2^n$.]
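
Definition 16.9 and the identity wt(x + y) = d(x, y) translate almost word for word into code. The following is a small sketch, assuming words of $Z_2^n$ are stored as Python strings of '0' and '1'; the helper `add_mod2` is an illustrative name, not notation from the text.

```python
from itertools import product

def wt(x: str) -> int:
    """Weight: the number of components of x equal to 1."""
    return x.count("1")

def d(x: str, y: str) -> int:
    """Distance: the number of positions where x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))

def add_mod2(x: str, y: str) -> str:
    """Componentwise sum of x and y in Z_2^n."""
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(x, y))

x, y = "01001", "11101"
print(wt(x), wt(y), d(x, y), add_mod2(x, y))  # 2 4 2 10100

# Exhaustive check of wt(x + y) = d(x, y) over all of Z_2^5.
words = ["".join(bits) for bits in product("01", repeat=5)]
identity_holds = all(wt(add_mod2(u, v)) == d(u, v) for u in words for v in words)
print(identity_holds)  # True
```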

When $x, y \in Z_2^n$, we write $d(x, y) = \sum_{i=1}^{n} d(x_i, y_i)$ where, for each $1 \le i \le n$,

\[ d(x_i, y_i) = \begin{cases} 0 & \text{if } x_i = y_i, \\ 1 & \text{if } x_i \ne y_i. \end{cases} \]

LEMMA 16.2   For all $x, y \in Z_2^n$, $wt(x + y) \le wt(x) + wt(y)$.
Proof: We prove this lemma by examining, for each $1 \le i \le n$, the components $x_i$, $y_i$, $x_i + y_i$ of x, y, x + y, respectively. Only one situation would cause this inequality to be false: if $x_i + y_i = 1$ while $x_i = 0$ and $y_i = 0$, for some $1 \le i \le n$. But this never occurs because $x_i + y_i = 1$ implies that exactly one of $x_i$ and $y_i$ is 1.

In Example 16.22 we found that

\[ wt(x + y) = wt(10100) = 2 \le 2 + 4 = wt(01001) + wt(11101) = wt(x) + wt(y). \]

THEOREM 16.11   The distance function d defined on $Z_2^n \times Z_2^n$ satisfies the following for all $x, y, z \in Z_2^n$.

   a) $d(x, y) \ge 0$
   b) $d(x, y) = 0 \iff x = y$
   c) $d(x, y) = d(y, x)$
   d) $d(x, z) \le d(x, y) + d(y, z)$

Proof: We leave the first three parts for the reader and prove part (d).
   In $Z_2^n$, y + y = 0, so $d(x, z) = wt(x + z) = wt(x + (y + y) + z) = wt((x + y) + (y + z)) \le wt(x + y) + wt(y + z)$, by Lemma 16.2. With wt(x + y) = d(x, y) and wt(y + z) = d(y, z), the result follows. (This property is generally called the Triangle Inequality.)

When a function satisfies the four properties listed in Theorem 16.11, it is called a distance function or metric, and we call $(Z_2^n, d)$ a metric space. Hence d (as given above) is often referred to as the Hamming metric. This metric is used in the following.
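
For a small n, Lemma 16.2 and the four properties of Theorem 16.11 can be confirmed by brute force. The sketch below is a sanity check for n = 4 only, not a substitute for the proof; it assumes words are tuples of 0's and 1's, and all names are illustrative.

```python
from itertools import product

def wt(x):
    """Weight of a word given as a tuple of 0's and 1's."""
    return sum(x)

def add(x, y):
    """Componentwise sum in Z_2^n."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

def d(x, y):
    """Hamming distance, computed as wt(x + y)."""
    return wt(add(x, y))

n = 4
words = list(product((0, 1), repeat=n))

lemma_holds = all(wt(add(x, y)) <= wt(x) + wt(y) for x in words for y in words)
nonnegative = all(d(x, y) >= 0 for x in words for y in words)
zero_iff_equal = all((d(x, y) == 0) == (x == y) for x in words for y in words)
symmetric = all(d(x, y) == d(y, x) for x in words for y in words)
triangle = all(d(x, z) <= d(x, y) + d(y, z)
               for x in words for y in words for z in words)

print(lemma_holds, nonnegative, zero_iff_equal, symmetric, triangle)
```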

Definition 16.10   For $n, k \in Z^+$ and $x \in Z_2^n$, the sphere of radius k centered at x is defined as $S(x, k) = \{y \in Z_2^n \mid d(x, y) \le k\}$.

EXAMPLE 16.23   For n = 3 and $x = 110 \in Z_2^3$, S(x, 1) = {110, 010, 100, 111} and S(x, 2) = {110, 010, 100, 111, 000, 101, 011}.
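
The spheres of Example 16.23 can be listed by filtering all of $Z_2^n$ on the distance to the center. This is a minimal sketch under the assumption that words are Python strings; `sphere` is a hypothetical helper name.

```python
from itertools import product

def d(x: str, y: str) -> int:
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def sphere(x: str, k: int) -> set:
    """S(x, k): all words of the same length as x within distance k of x."""
    candidates = ("".join(bits) for bits in product("01", repeat=len(x)))
    return {y for y in candidates if d(x, y) <= k}

print(sorted(sphere("110", 1)))  # ['010', '100', '110', '111']
print(sorted(sphere("110", 2)))  # every word of Z_2^3 except '001'
```

Enumerating all $2^n$ words is only practical for short words; it is used here because it mirrors the definition directly.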

With these preliminaries in hand we turn now to the two major results of this section.

THEOREM 16.12   Let $E: W \to C$ be an encoding function with the set of messages $W \subseteq Z_2^m$ and the set of code words $E(W) = C \subseteq Z_2^n$, where m < n. If our objective is error detection, then for $k \in Z^+$, we can detect all transmission errors of weight $\le k$ if and only if the minimum distance between code words is at least k + 1.
Proof: The set C is known to both the transmitter and the receiver, so if $w \in W$ is the message and c = E(w) is transmitted, let $c \ne T(c) = r$. If the minimum distance between code words is at least k + 1, then the transmission of c can result in as many as k errors and r will not be listed in C. Hence we can detect all errors e where $wt(e) \le k$. Conversely, let $c_1, c_2$ be code words with $d(c_1, c_2) < k + 1$. Then $c_2 = c_1 + e$ where $wt(e) \le k$. If we send $c_1$ and $T(c_1) = c_2$, then we would feel that $c_2$ had been sent, thus failing to detect an error of weight $\le k$.

What can we say about error-correcting capability?

THEOREM 16.13   Let E, W, and C be as in Theorem 16.12. If our objective is error correction, then for $k \in Z^+$, we can construct a decoding function $D: Z_2^n \to W$ that corrects all transmission errors of weight $\le k$ if and only if the minimum distance between code words is at least 2k + 1.
Proof: For $c \in C$, consider $S(c, k) = \{x \in Z_2^n \mid d(c, x) \le k\}$. Define $D: Z_2^n \to W$ as follows. If $r \in Z_2^n$ and $r \in S(c, k)$ for some code word c, then D(r) = w where E(w) = c. [Here c is the (unique) code word nearest to r.] If $r \notin S(c, k)$ for any $c \in C$, then we define $D(r) = w_0$, where $w_0$ is some arbitrary message that remains fixed once it is chosen. The only problem we could face here is that D might not be a function. This will happen if there is an element r in $Z_2^n$ with r in both $S(c_1, k)$ and $S(c_2, k)$ for distinct code words $c_1, c_2$. But $r \in S(c_1, k) \Rightarrow d(c_1, r) \le k$, and $r \in S(c_2, k) \Rightarrow d(c_2, r) \le k$, so $d(c_1, c_2) \le d(c_1, r) + d(r, c_2) \le k + k < 2k + 1$. Consequently, if the minimum distance between code words is at least 2k + 1, then D is a function, and it will decode all possible received words, correcting any transmission error of weight $\le k$. Conversely, if $c_1, c_2 \in C$ and $d(c_1, c_2) \le 2k$, then $c_2$ can be obtained from $c_1$ by making at most 2k changes. Starting at code word $c_1$ we make approximately half (exactly, $\lfloor d(c_1, c_2)/2 \rfloor$) of these changes. This brings us to $r = c_1 + e_1$ with $wt(e_1) \le k$. Continuing from r, we make the remaining changes to get to $c_2$ and find $r + e_2 = c_2$ with $wt(e_2) \le k$. But then $r = c_2 + e_2$. Now with $c_1 + e_1 = r = c_2 + e_2$ and $wt(e_1), wt(e_2) \le k$, how can one decide on the code word from which r arises? This ambiguity results in a possible error of weight $\le k$ that cannot be corrected.

EXAMPLE 16.24   With $W = Z_2^2$ let $E: W \to Z_2^6$ be given by

   E(00) = 000000     E(10) = 101010     E(01) = 010101     E(11) = 111111.

Then the minimum distance between code words is 3, so we can correct all single errors. With

   $S(000000, 1) = \{x \in Z_2^6 \mid d(000000, x) \le 1\}$
                 = {000000, 100000, 010000, 001000, 000100, 000010, 000001},

the decoding function $D: Z_2^6 \to W$ gives D(x) = 00 for all $x \in S(000000, 1)$. Similarly,

   $S(010101, 1) = \{x \in Z_2^6 \mid d(010101, x) \le 1\}$
                 = {010101, 110101, 000101, 011101, 010001, 010111, 010100},

and here D(x) = 01 for each $x \in S(010101, 1)$. At this point our definition of D accounts for 14 of the elements in $Z_2^6$. Continuing to define D for the 14 elements in S(101010, 1) and S(111111, 1) there remain 36 other elements to account for. We define D(x) = 00 (or any other message) for these 36 other elements and have a decoding function that will correct single errors.
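
The construction of D in the proof of Theorem 16.13 can be carried out mechanically for the code of Example 16.24. The sketch below assumes the encoding is stored as a Python dict from messages to code words; `minimum_distance` and `make_decoder` are illustrative names, and the default message 00 is the arbitrary choice made in the example.

```python
from itertools import combinations, product

E = {"00": "000000", "10": "101010", "01": "010101", "11": "111111"}

def d(x: str, y: str) -> int:
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code) -> int:
    """Smallest distance between two distinct code words."""
    return min(d(a, b) for a, b in combinations(code, 2))

def make_decoder(E: dict, k: int, default: str) -> dict:
    """Table for D: words in exactly one sphere S(c, k) decode to the message
    of c; every other word decodes to the fixed default message."""
    n = len(next(iter(E.values())))
    table = {}
    for bits in product("01", repeat=n):
        r = "".join(bits)
        hits = [w for w, c in E.items() if d(c, r) <= k]
        table[r] = hits[0] if len(hits) == 1 else default
    return table

dmin = minimum_distance(list(E.values()))
k = (dmin - 1) // 2                    # largest k with 2k + 1 <= dmin
D = make_decoder(E, k, default="00")
outside = [r for r in D if all(d(c, r) > k for c in E.values())]

print(dmin, k, len(D), len(outside))   # 3 1 64 36
print(D["100000"], D["110101"])        # 00 01
```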

Beware! There is a subtle point that needs to be made about Theorems 16.12 and 16.13. For example, if the minimum distance between code words is 2k + 1 one may feel that we can detect all errors of weight $\le 2k$ and correct all errors of weight $\le k$. This is not necessarily true. That is, error detection and error correction need not take place at the same time and at the maximum levels. To see this, reconsider the (6, 2)-triple repetition code of Example 16.24. Here the encoding function $E: W (= Z_2^2) \to Z_2^6$ is given by $E(w_1 w_2) = w_1 w_2 w_1 w_2 w_1 w_2$ and the code comprises the four elements of $Z_2^6$ in the range of E. Since the minimum distance between any two elements of $Z_2^2$ is 1, it follows that the minimum distance between code words is 3 (as observed earlier in Example 16.24).
   Now suppose that our major objective is error correction and that r = 100000 [$\notin E(W)$] is received. We see that d(000000, r) = 1, d(101010, r) = 2, d(010101, r) = 4, and d(111111, r) = 5. Consequently, we should choose to decode r as 000000, the unique code word nearest to r. Unfortunately, suppose that the actual message were 10 (with corresponding code word 101010), but we received r = 100000. Upon correcting r as 000000, we should then decode 000000 to get the incorrect message 00. And, in so doing, we have failed to detect an error of weight 2.
   In this type of situation one can develop a scheme where a mixed strategy is used. Here both error correction and error detection may be carried out at some levels.

   For $t \in N$, if the received word is r and there is a unique code word $c_1$ such that $d(c_1, r) \le t$, then we decode r as $c_1$. (Note: The case where $r = c_1$ is covered when t = 0.) If there exists a second code word $c_2$ such that $d(c_2, r) = d(c_1, r)$, or if d(c, r) > t for all code words c, then an error is declared (and retransmission is generally requested). Using this scheme, if the minimum distance between code words is at least 2t + s + 1, for $s \in N$, then we can correct all errors of weight $\le t$ and detect all errors with weights between t + 1 and t + s, inclusive.
   When using this scheme for the (6, 2)-triple repetition code, our options include:

   1) t = 0; s = 2: Here we can detect all errors of weight $\le 2$ but we have no error-correction capability.
   2) t = 1; s = 0: Single errors are corrected here but there is no error-detecting capability.

If we use the (10, 2)-five-times repetition code, then the minimum distance is 5. Applying the above scheme in this case, our options now include:

   1) t = 0; s = 4: Here we can detect all errors of weight $\le 4$ but we have no error-correction capability.
   2) t = 1; s = 2: Now single errors are corrected and we can also detect all errors e, where $2 \le wt(e) \le 3$.
   3) t = 2; s = 0: All errors of weight $\le 2$ are corrected but there is no error-detecting capability.

[For more on this, the interested reader should examine Chapter 4 of the text by S. Roman [24].]
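
The mixed strategy described above can be sketched as a decoder that either returns a code word or declares an error. This is one possible reading of the rule, written under the assumption that code words are Python strings; `mixed_decode` is a hypothetical name, and `None` stands for "error declared".

```python
def d(x: str, y: str) -> int:
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def mixed_decode(r: str, code: list, t: int):
    """Return the nearest code word if it is unique and within distance t of r;
    otherwise return None to signal that an error is declared."""
    ranked = sorted((d(c, r), c) for c in code)
    best_distance, best = ranked[0]
    tie = len(ranked) > 1 and ranked[1][0] == best_distance
    if best_distance <= t and not tie:
        return best
    return None

triple = ["000000", "101010", "010101", "111111"]
print(mixed_decode("100000", triple, t=1))  # 000000 (corrected, possibly wrongly)
print(mixed_decode("100000", triple, t=0))  # None (error detected instead)

five = ["0000000000", "1010101010", "0101010101", "1111111111"]
# 1010101010 sent with errors in positions 1 and 3; with t = 1, s = 2 the
# double error is detected rather than miscorrected.
print(mixed_decode("0000101010", five, t=1))  # None
print(mixed_decode("0010101010", five, t=1))  # 1010101010
```

The first two calls show the trade-off for the (6, 2) code: with t = 1 the word 100000 is corrected to 000000, while with t = 0 the same word is only flagged.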

16.7
The Parity-Check and Generator Matrices
                    In this section we introduce an example where the encoding and decoding functions are
                    given by matrices over Z2. One of these matrices will help us to locate the nearest code
                    word for a given received word. This will be especially helpful as the set C of code words
                    grows larger.

EXAMPLE 16.25   Let

\[ G = \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix} \]

be a 3 × 6 matrix over $Z_2$. The first three columns of G form the 3 × 3 identity matrix $I_3$. Letting A denote the matrix formed from the last three columns of G, we write $G = [I_3 \mid A]$ to denote its structure. The (partitioned) matrix G is called a generator matrix.
   We use G to define an encoding function $E: Z_2^3 \to Z_2^6$ as follows. For $w \in Z_2^3$, E(w) = wG is the element in $Z_2^6$ obtained by multiplying w, considered as a three-dimensional row vector, by the matrix G on its right. Unlike the results on matrix multiplication in Chapter 7, in the calculations here we have 1 + 1 = 0, not 1 + 1 = 1.
   (Even if the set W of messages is not all of $Z_2^3$, we'll assume that all of $Z_2^3$ is encoded and that the transmitter and receiver will both know the real messages of importance and their corresponding code words.)

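We find here, for example, that

\[ E(110) = (110)G = [1\ 1\ 0] \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix} = [110101], \]

and

\[ E(010) = (010)G = [0\ 1\ 0] \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix} = [010011]. \]

Note that E(110) can be obtained by adding the first two rows of G, whereas E(010) is simply the second row of G.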
   The set of code words obtained by this method is

\[ C = \{000000, 100110, 010011, 001101, 110101, 101011, 011110, 111000\} \subseteq Z_2^6, \]

and one can recapture the corresponding message by simply dropping the last three components of the code word. In addition, the minimum distance between code words is 3, so we can detect errors of weight $\le 2$ or correct single errors. (We shall assume that multiple errors are rare and concentrate on error correction.)
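
Encoding with the generator matrix is an ordinary row-vector-by-matrix product with every sum reduced modulo 2. The sketch below assumes G is stored as a list of rows of Python ints; the function names `encode` and `distance` are illustrative only.

```python
from itertools import combinations, product

# Generator matrix G = [I_3 | A] of Example 16.25.
G = [
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
]

def encode(w, G):
    """E(w) = wG, with w a row vector and all arithmetic done modulo 2."""
    return [sum(wi * gi for wi, gi in zip(w, column)) % 2 for column in zip(*G)]

def distance(x: str, y: str) -> int:
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

code = sorted("".join(map(str, encode(w, G))) for w in product((0, 1), repeat=3))
dmin = min(distance(a, b) for a, b in combinations(code, 2))

print(encode([1, 1, 0], G))  # [1, 1, 0, 1, 0, 1]
print(code)
print(dmin)                  # 3
```

Since the first three columns of G form the identity, the message can be read back as the first three entries of each code word.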
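   For all $w = w_1 w_2 w_3 \in Z_2^3$, $E(w) = w_1 w_2 w_3 w_4 w_5 w_6 \in Z_2^6$. Since

\[ E(w) = [w_1\ w_2\ w_3] \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix} = [w_1\ w_2\ w_3\ (w_1 + w_3)\ (w_1 + w_2)\ (w_2 + w_3)], \]

we have $w_4 = w_1 + w_3$, $w_5 = w_1 + w_2$, $w_6 = w_2 + w_3$, and these equations are called the parity-check equations. Since $w_i \in Z_2$ for each $1 \le i \le 6$, it follows that $w_i = -w_i$ and so the equations can be rewritten as

\[ \begin{aligned} w_1 + w_3 + w_4 &= 0 \\ w_1 + w_2 + w_5 &= 0 \\ w_2 + w_3 + w_6 &= 0. \end{aligned} \]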

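Thus we find that

\[
\begin{bmatrix} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \\ w_5 \\ w_6 \end{bmatrix}
= H \cdot (E(w))^{\mathrm{tr}}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]

where $(E(w))^{\mathrm{tr}}$ denotes the transpose of $E(w)$. Consequently, if $r = r_1r_2 \cdots r_6 \in \mathbf{Z}_2^6$, we can identify $r$ as a code word if and only if

\[
H \cdot r^{\mathrm{tr}} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]

   Writing $H = [B \mid I_3]$, we notice that if the rows and columns of $B$ are interchanged, then we get $A$. Hence $B = A^{\mathrm{tr}}$.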
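The passage from $G = [I_3 \mid A]$ to $H = [A^{\mathrm{tr}} \mid I_3]$, and the test $H \cdot r^{\mathrm{tr}} = 0$ for membership in the code, can be sketched as follows. This is an illustrative Python sketch under the assumption that bit strings stand for elements of $\mathbf{Z}_2^6$; the function names are hypothetical.

```python
from itertools import product

# Generator matrix G = [I_3 | A] of Example 16.25.
G = [
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
]

def parity_check_from_generator(G):
    """Given G = [I_m | A], return H = [A^tr | I_(n-m)]."""
    m, n = len(G), len(G[0])
    A = [row[m:] for row in G]
    H = []
    for i in range(n - m):
        a_tr_row = [A[k][i] for k in range(m)]  # i-th row of A^tr
        identity_row = [1 if j == i else 0 for j in range(n - m)]
        H.append(a_tr_row + identity_row)
    return H

def syndrome(H, r):
    """Return H . r^tr over Z_2 as a tuple, for r given as a bit string."""
    bits = [int(b) for b in r]
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)

H = parity_check_from_generator(G)
all_words = ["".join(bits) for bits in product("01", repeat=6)]
# The code words are exactly the words whose syndrome is the zero vector.
code_words = [r for r in all_words if syndrome(H, r) == (0, 0, 0)]
print(H)
print(code_words)
```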

From the theory developed earlier on error correction, because the minimum distance
between the code words of this example is 3, we should be able to develop a decoding
function that corrects single errors.
    Suppose we receive $r = 110110$. We want to find the code word $c$ that is the nearest neighbor of $r$. If there is a long list of code words against which to check $r$, we would be better off to first examine $H \cdot r^{\mathrm{tr}}$, which is called the syndrome of $r$. Here

\[
H \cdot r^{\mathrm{tr}} =
\begin{bmatrix} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix},
\]

so $r$ is not a code word. Hence we at least detect an error. Looking back at the list of code words, we see that $d(100110, r) = 1$. For all other $c \in C$, $d(r, c) \ge 2$. Writing $r = c + e = 100110 + 010000$, we find that the transmission error (of weight 1) occurs in the second component of $r$. Is it just a coincidence that the syndrome $H \cdot r^{\mathrm{tr}}$ produced the second column of $H$? If not, then we can use this result in order to realize that if a single transmission error occurred, it took place at the second component. Changing the second component of $r$, we get $c$; the message $w$ comprises the first three components of $c$.
    Let $r = c + e$, where $c$ is a code word and $e$ is an error pattern of weight 1. Suppose that 1 is in the $i$th component of $e$, where $1 \le i \le 6$. Then

\[
H \cdot r^{\mathrm{tr}} = H \cdot (c + e)^{\mathrm{tr}} = H \cdot (c^{\mathrm{tr}} + e^{\mathrm{tr}}) = H \cdot c^{\mathrm{tr}} + H \cdot e^{\mathrm{tr}}.
\]

With $c$ a code word, it follows that $H \cdot c^{\mathrm{tr}} = 0$, so $H \cdot r^{\mathrm{tr}} = H \cdot e^{\mathrm{tr}}$ = $i$th column of matrix $H$. Thus $c$ and $r$ differ only in the $i$th component, and we can determine $c$ by simply changing the $i$th component of $r$.
   Since we are primarily concerned with transmissions where multiple errors are rare, this
technique is of definite value. If we ask for more, however, we find ourselves expecting too
much.
   Suppose that we receive $r = 000111$. Computing the syndrome

\[
H \cdot r^{\mathrm{tr}} =
\begin{bmatrix} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},
\]

we obtain a result that is not one of the columns of $H$. Yet $H \cdot r^{\mathrm{tr}}$ can be obtained as the sum of two columns from $H$. If $H \cdot r^{\mathrm{tr}}$ came from the first and sixth columns of $H$, correcting these components in $r$ results in the code word 100110. If we sum the third and fifth columns of $H$ to get this syndrome, upon changing the third and fifth components of $r$ we get a second code word, 001101. So we cannot expect $H$ to correct multiple errors. This is no surprise since the minimum distance between code words is 3.
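The ambiguity for $r = 000111$ can be made explicit by listing every pair of columns of $H$ whose sum is the syndrome. The sketch below is illustrative only; `flip`, `pairs`, and `candidates` are hypothetical names, and positions are counted from 1 as in the text.

```python
from itertools import combinations

# Parity-check matrix H of Example 16.25.
H = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

def syndrome(H, r):
    """Return H . r^tr over Z_2 as a tuple, for r given as a bit string."""
    bits = [int(b) for b in r]
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)

def flip(r, positions):
    """Return r with the given 1-based positions complemented."""
    bits = list(r)
    for p in positions:
        bits[p - 1] = "1" if bits[p - 1] == "0" else "0"
    return "".join(bits)

r = "000111"
columns = list(zip(*H))
target = syndrome(H, r)

# Every pair of columns (1-based indices) whose sum modulo 2 is the syndrome.
pairs = [
    (i + 1, j + 1)
    for i, j in combinations(range(len(columns)), 2)
    if tuple((a + b) % 2 for a, b in zip(columns[i], columns[j])) == target
]
candidates = [flip(r, pair) for pair in pairs]
print(pairs, candidates)
```

Besides the two pairs discussed above, the search also reports the second and fourth columns, which lead to the code word 010011, so three code words lie at distance 2 from this $r$.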

We summarize the results of Example 16.25 for the general situation. For $m, n \in \mathbf{Z}^+$ with $m < n$, the encoding function $E\colon \mathbf{Z}_2^m \to \mathbf{Z}_2^n$ is given by an $m \times n$ matrix $G$ over $\mathbf{Z}_2$. This matrix $G$ is called the generator matrix for the code and has the form $[I_m \mid A]$, where $A$ is an $m \times (n - m)$ matrix. Here $E(w) = wG$ for each message $w \in \mathbf{Z}_2^m$, and the code $C = E(\mathbf{Z}_2^m) \subseteq \mathbf{Z}_2^n$.
   The associated parity-check matrix $H$ is an $(n - m) \times n$ matrix of the form $[A^{\mathrm{tr}} \mid I_{n-m}]$. This matrix can also be used to define the encoding function $E$, because if $w = w_1w_2 \cdots w_m \in \mathbf{Z}_2^m$, then $E(w) = w_1w_2 \cdots w_m w_{m+1} \cdots w_n$, where $w_{m+1}, \ldots, w_n$ can be determined from the set of $n - m$ (parity-check) equations that arise from $H \cdot (E(w))^{\mathrm{tr}} = 0$, the column vector of $n - m$ 0's.
   This unique parity-check matrix $H$ also provides a decoding scheme that corrects single errors in transmission if:
    a) $H$ does not contain a column of 0's. (If the $i$th column of $H$ had all 0's and $H \cdot r^{\mathrm{tr}} = 0$ for a received word $r$, we couldn't decide whether $r$ was a code word or a received word whose $i$th component was incorrectly transmitted. We do not want to compare $r$ with all code words when $C$ is large.)
    b) No two columns of $H$ are the same. (If the $i$th and $j$th columns of $H$ are the same and $H \cdot r^{\mathrm{tr}}$ equals this repeated column, how would we decide which component of $r$ to change?)
   When $H$ satisfies these two conditions, we get the following decoding algorithm. For each $r \in \mathbf{Z}_2^n$, if $T(c) = r$, then:
    1) With $H \cdot r^{\mathrm{tr}} = 0$, we feel that the transmission was correct and that $r$ is the code word that was transmitted. The decoded message then consists of the first $m$ components of $r$.
    2) With $H \cdot r^{\mathrm{tr}}$ equal to the $i$th column of $H$, we feel that there has been a single error in transmission and change the $i$th component of $r$ in order to get the code word $c$. Here the first $m$ components of $c$ yield the original message.
    3) If neither case 1 nor case 2 occurs, we feel that there has been more than one transmission error and we cannot provide a reliable way to decode in this situation.
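The three cases above translate almost line for line into code. The following Python sketch assumes the matrix $H$ of Example 16.25, which satisfies conditions (a) and (b); the function `decode` and its convention of returning `None` in case 3 are illustrative assumptions, not part of the text.

```python
# Parity-check matrix H of Example 16.25 (no zero column, no repeated column).
H = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

def syndrome(H, r):
    """Return H . r^tr over Z_2 as a tuple, for r given as a bit string."""
    bits = [int(b) for b in r]
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)

def decode(H, r, m):
    """Apply the three-case scheme; return the m-bit message or None."""
    s = syndrome(H, r)
    if not any(s):
        # Case 1: zero syndrome, so r is accepted as the transmitted code word.
        return r[:m]
    columns = list(zip(*H))
    if s in columns:
        # Case 2: the syndrome is column i of H, so flip component i of r.
        i = columns.index(s)
        flipped = "1" if r[i] == "0" else "0"
        c = r[:i] + flipped + r[i + 1:]
        return c[:m]
    # Case 3: more than one error is suspected; no reliable decoding.
    return None

for received in ("100110", "110110", "000111"):
    print(received, "->", decode(H, received, 3))
```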
   We close with one final comment on the matrix $H$. If we start with a parity-check matrix $H = [B \mid I_{n-m}]$ and use it, as described above, to define the function $E$, then we obtain the same set of code words that is generated by the unique associated generator matrix $G = [I_m \mid B^{\mathrm{tr}}]$.

EXERCISES 16.7

 1. For Example 16.24, list the elements in $S(101010, 1)$ and $S(111111, 1)$.

 2. Decode each of the following received words for Example 16.24.
      a) 110101      b) 101011      c) 001111      d) 110000

 3. a) If $x \in \mathbf{Z}_2^{10}$, determine $|S(x, 1)|$, $|S(x, 2)|$, $|S(x, 3)|$.
    b) For $n, k \in \mathbf{Z}^+$ with $1 \le k \le n$, if $x \in \mathbf{Z}_2^n$, what is $|S(x, k)|$?

 4. Let E: Z3 > Z> be an encoding function where the minimum distance between code words is 9. What is the largest value of $k$ such that we can detect errors of weight $\le k$? If we wish to correct errors of weight $\le n$, what is the maximum value for $n$?

 5. For each of the following encoding functions, find the minimum distance between the code words. Discuss the error-detecting and error-correcting capabilities of each code.
    a) $E\colon \mathbf{Z}_2^2 \to \mathbf{Z}_2^5$
         00 → 00001      01 → 01010
         10 → 10100      11 → 11111
    b) $E\colon \mathbf{Z}_2^2 \to \mathbf{Z}_2^{10}$
         00 → 0000000000      01 → 0000011111
         10 → 1111100000      11 → 1111111111
    c) $E\colon \mathbf{Z}_2^3 \to \mathbf{Z}_2^6$
         000 → 000111      001 → 001001
         010 → 010010      011 → 011100
         100 → 100100      101 → 101010
         110 → 110001      111 → 111000
    d) $E\colon \mathbf{Z}_2^3 \to \mathbf{Z}_2^8$
         000 → 00011111      001 → 00111010
         010 → 01010101      011 → 01110000
         100 → 10001101      101 → 10101000
         110 → 11000100      111 → 11100011

 6. a) Use the parity-check matrix $H$ of Example 16.25 to decode the following received words.
           i) 111101      ii) 110101      iii) 001111      iv) 100100
           v) 110001      vi) 111111      vii) 111100     viii) 010100
    b) Are all the results in part (a) uniquely determined?

 7. The encoding function $E\colon \mathbf{Z}_2^2 \to \mathbf{Z}_2^5$ is given by the generator matrix
\[ G = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 \end{bmatrix}. \]
    a) Determine all code words. What can we say about the error-detection capability of this code? What about its error-correction capability?
    b) Find the associated parity-check matrix $H$.
    c) Use $H$ to decode each of the following received words.
           i) 11011      ii) 10101      iii) 11010
          iv) 00111       v) 11101       vi) 00110

 8. Define the encoding function $E\colon \mathbf{Z}_2^3 \to \mathbf{Z}_2^6$ by means of the parity-check matrix
\[ H = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}. \]
    a) Determine all code words.
    b) Does this code correct all single errors in transmission?

 9. Find the generator and parity-check matrices for the (9, 8) single parity-check coding scheme of Example 16.20.

10. a) Show that the $1 \times 9$ matrix $G = [1\ 1\ 1\ \cdots\ 1]$ is the generator matrix for the (9, 1) nine-times repetition code.
    b) What is the associated parity-check matrix $H$ in this case?

11. For an $(n, m)$ code $C$ with generator matrix $G = [I_m \mid A]$ and parity-check matrix $H = [A^{\mathrm{tr}} \mid I_{n-m}]$, the $(n, n - m)$ code $C^d$ with generator matrix $[I_{n-m} \mid A^{\mathrm{tr}}]$ and parity-check matrix $[A \mid I_m]$ is called the dual code of $C$. Show that the codes in each of Exercises 9 and 10 constitute a pair of dual codes.

12. Given $n \in \mathbf{Z}^+$, let the set $M(n, k) \subseteq \mathbf{Z}_2^n$ contain the maximum number of code words of length $n$, where the minimum distance between code words is $2k + 1$. Prove that
\[ \frac{2^n}{\sum_{i=0}^{2k} \binom{n}{i}} \le |M(n, k)| \le \frac{2^n}{\sum_{i=0}^{k} \binom{n}{i}}. \]
    (The upper bound on $|M(n, k)|$ is called the Hamming bound; the lower bound is referred to as the Gilbert bound.)

16.8
                Group Codes:
         Decoding with Coset Leaders
                              Now that we’ve examined some introductory material on coding theory, it is time to see
                              how the group structure enters the picture.

Definition 16.11   Let $E\colon \mathbf{Z}_2^m \to \mathbf{Z}_2^n$, for $n > m$, be an encoding function. The code $C = E(\mathbf{Z}_2^m)$ is called a group code if $C$ is a subgroup of $\mathbf{Z}_2^n$.

Recall the encoding function $E\colon \mathbf{Z}_2^2 \to \mathbf{Z}_2^6$ (of Example 16.24) where

\[ E(00) = 000000 \qquad E(10) = 101010 \qquad E(01) = 010101 \qquad E(11) = 111111. \]

Here $\mathbf{Z}_2^2$ and $\mathbf{Z}_2^6$ are groups under componentwise addition modulo 2; the subset $C = E(\mathbf{Z}_2^2) = \{000000, 101010, 010101, 111111\}$ is a subgroup of $\mathbf{Z}_2^6$, and an example of a group code. (Note that $C$ contains 000000, the zero element of $\mathbf{Z}_2^6$.)
                                In general when the code words form a group, we find that it is easier to compute the
                              minimum distance between code words.

THEOREM 16.14   In a group code, the minimum distance between distinct code words is the minimum of the weights of the nonzero elements of the code.
Proof: Let $a, b, c \in C$ where $a \ne b$, $d(a, b)$ is minimum, and $c$ is nonzero with minimum weight. By closure in the group $C$, $a + b$ is a code word, and it is nonzero because $a \ne b$. Since $d(a, b) = \mathrm{wt}(a + b)$, by the choice of $c$ we have $d(a, b) \ge \mathrm{wt}(c)$. Also, $\mathrm{wt}(c) = d(c, 0)$, where 0 is a code word because $C$ is a group. Then $d(c, 0) \ge d(a, b)$ by the choice of $a, b$, so $\mathrm{wt}(c) \ge d(a, b)$. Consequently, $d(a, b) = \mathrm{wt}(c)$.

If $C$ is a set of code words and $|C| = 1024$, we have to compute $\binom{1024}{2} = 523{,}776$ distances to find the minimum distance between code words. But if we can recognize that $C$ possesses a group structure, we need only compute the weights of the 1023 nonzero elements of $C$.
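For the eight code words of Example 16.25 the two ways of finding the minimum distance can be compared directly. The sketch below is a small Python check under that assumption; it tests closure under addition (which suffices for a nonempty finite subset to be a subgroup) and then compares the pairwise minimum distance with the minimum nonzero weight. All names are illustrative.

```python
from itertools import combinations
from math import comb

# Code words of Example 16.25.
C = ["000000", "100110", "010011", "001101",
     "110101", "101011", "011110", "111000"]

def add(a, b):
    """Componentwise sum modulo 2 of two bit strings."""
    return "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))

def wt(c):
    """Weight of a bit string: the number of 1's it contains."""
    return c.count("1")

def distance(a, b):
    """Hamming distance between two bit strings of equal length."""
    return sum(x != y for x, y in zip(a, b))

# Closure under addition makes the finite nonempty set C a subgroup.
is_group = all(add(a, b) in C for a in C for b in C)

min_distance = min(distance(a, b) for a, b in combinations(C, 2))  # 28 pairs
min_weight = min(wt(c) for c in C if wt(c) > 0)                    # 7 weights

print(is_group, min_distance, min_weight)
print(comb(1024, 2))  # number of pairs when |C| = 1024
```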
    Is there some way to guarantee that the code words form a group? By Theorem 16.5(d), the homomorphic image of a subgroup is a subgroup, so if $E\colon \mathbf{Z}_2^m \to \mathbf{Z}_2^n$ is a group homomorphism, then $C = E(\mathbf{Z}_2^m)$ will be a subgroup of $\mathbf{Z}_2^n$. Our next result will use this fact to show that the codes we obtain when using a generator matrix $G$ or a parity-check matrix $H$ are group codes. Furthermore, the proof of this result reconfirms the observation we made (at the end of the previous section) about the code that arises from a generator matrix $G$ or its associated parity-check matrix $H$.

THEOREM 16.15   Let $E\colon \mathbf{Z}_2^m \to \mathbf{Z}_2^n$ be an encoding function given by a generator matrix $G$ or the associated parity-check matrix $H$. Then $C = E(\mathbf{Z}_2^m)$ is a group code.
Proof: We establish these results by proving that the function $E$ arising from $G$ or $H$ is a group homomorphism.
   If $x, y \in \mathbf{Z}_2^m$, then $E(x + y) = (x + y)G = xG + yG = E(x) + E(y)$. Hence $E$ is a homomorphism and $C = E(\mathbf{Z}_2^m)$ is a group code [by virtue of part (d) of Theorem 16.5].
   For the case of $H$, if $x$ is a message, then $E(x) = x_1x_2 \cdots x_m x_{m+1} \cdots x_n$, where $x = x_1x_2 \cdots x_m \in \mathbf{Z}_2^m$ and $H \cdot (E(x))^{\mathrm{tr}} = 0$. In particular, $E(x)$ is uniquely determined by these two properties. If $y$ is also a message, then $x + y$ is likewise, and $E(x + y)$ has $(x_1 + y_1), (x_2 + y_2), \ldots, (x_m + y_m)$ as its first $m$ components, as does $E(x) + E(y)$. Further, $H \cdot (E(x) + E(y))^{\mathrm{tr}} = H \cdot (E(x)^{\mathrm{tr}} + E(y)^{\mathrm{tr}}) = H \cdot E(x)^{\mathrm{tr}} + H \cdot E(y)^{\mathrm{tr}} = 0 + 0 = 0$. Since $E(x + y)$ is the unique element of $\mathbf{Z}_2^n$ with $(x_1 + y_1), (x_2 + y_2), \ldots, (x_m + y_m)$ as its first $m$ components and with $H \cdot (E(x + y))^{\mathrm{tr}} = 0$, it follows that $E(x + y) = E(x) + E(y)$. So $E$ is a group homomorphism and, consequently, $C = \{c \in \mathbf{Z}_2^n \mid H \cdot c^{\mathrm{tr}} = 0\}$ is a group code.

Now we use the group structure of $C$, together with its cosets in $\mathbf{Z}_2^n$, to develop a scheme for decoding. Our example uses the code developed in Example 16.25, but the procedure applies for every group code.

EXAMPLE 16.26   We develop a table for decoding as follows.

      1) First list in a row the elements of the group code $C$, starting with the identity.

            000000   100110   010011   001101   110101   101011   011110   111000

      2) Next select an element $x$ of $\mathbf{Z}_2^6$ ($\mathbf{Z}_2^n$, in general) where $x$ does not appear anywhere in the table developed so far and has minimum weight. Then list the elements of the coset $x + C$, with $x + c$ directly below $c$ for each $c \in C$. For $x = 100000$ we have

            000000   100110   010011   001101   110101   101011   011110   111000
            100000   000110   110011   101101   010101   001011   111110   011000

      3) Repeat step (2) until the cosets provide a partition of $\mathbf{Z}_2^6$ ($\mathbf{Z}_2^n$, in general). This results in the decoding table shown in Table 16.8.
      4) Once the decoding table is constructed, for each received word $r$ we find the column containing $r$ and use the first three components of the code word $c$ at the top of the column to decode $r$.

Table 16.8 Decoding Table for the Code of Example 16.25

       000000   100110   010011   001101   110101   101011   011110   111000
       100000   000110   110011   101101   010101   001011   111110   011000
       010000   110110   000011   011101   100101   111011   001110   101000
       001000   101110   011011   000101   111101   100011   010110   110000
       000100   100010   010111   001001   110001   101111   011010   111100
       000010   100100   010001   001111   110111   101001   011100   111010
       000001   100111   010010   001100   110100   101010   011111   111001
       010100   110010   000111   011001   100001   111111   001010   101100
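Steps (1) through (4) can be sketched in Python as follows. This is an illustrative sketch, not the text's own procedure: ties among words of minimum weight are broken by the order in which `itertools.product` generates them, which is an arbitrary assumption, so the rows may appear in a different order and the last coset leader may be a different word of weight 2.

```python
from itertools import product

# Code words of Example 16.25, starting with the identity.
C = ["000000", "100110", "010011", "001101",
     "110101", "101011", "011110", "111000"]

def add(a, b):
    """Componentwise sum modulo 2 of two bit strings."""
    return "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))

def decoding_table(C):
    """Build the rows x + C, choosing each leader x of minimum weight."""
    n = len(C[0])
    # A stable sort by weight visits candidate leaders in order of weight.
    words = sorted(("".join(bits) for bits in product("01", repeat=n)),
                   key=lambda w: w.count("1"))
    seen, rows = set(), []
    for x in words:
        if x not in seen:  # x lies in no coset listed so far
            row = [add(x, c) for c in C]
            rows.append(row)
            seen.update(row)
    return rows

def decode(table, r):
    """Return the code word at the top of the column containing r."""
    for row in table:
        if r in row:
            return table[0][row.index(r)]

table = decoding_table(C)
for row in table:
    print("  ".join(row))
```

With this tie-breaking rule the first row is $C$ itself, the next six leaders are the words of weight 1, and the final leader is one of the three words of weight 2 in the remaining coset.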

From the table we find that the code words for the received words

\[ r_1 = 101001 \qquad r_2 = 111010 \qquad r_3 = 001001 \qquad r_4 = 111011 \]

are

\[ c_1 = 101011 \qquad c_2 = 111000 \qquad c_3 = 001101 \qquad c_4 = 101011, \]

respectively. From these results the respective messages are

\[ w_1 = 101 \qquad w_2 = 111 \qquad w_3 = 001 \qquad w_4 = 101. \]

The entries in the first column of Table 16.8 are called the coset leaders. For the first
seven rows, the coset leaders are the same in all tables, with some permutations of rows
possible. However, for the last row, either 100001 or 001010 could have been used in place
of 010100 because they also have minimum weight 2. So the table need not be unique.
[As a result, not all double errors can be corrected because there may not be a unique code
word at a minimum distance for each r in the last coset (the one with coset leader 010100).
For example, r = 001010 has three closest code words (at distance 2) — namely, 000000,
101011, and 011110.]
      How do the coset leaders really help us? It seems that the code words in the first row are what we used to decode $r_1$, $r_2$, $r_3$, and $r_4$ above.
    Consider the received words $r_1 = 101001$ and $r_2 = 111010$ in the sixth row, where the coset leader is $x = 000010$. Computing syndromes, we find that

\[
H \cdot (r_1)^{\mathrm{tr}} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = H \cdot (r_2)^{\mathrm{tr}} = H \cdot x^{\mathrm{tr}}.
\]

This is not just a coincidence.

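THEOREM 16.16   Let $C \subseteq \mathbf{Z}_2^n$ be a group code for a parity-check matrix $H$, and let $r_1, r_2 \in \mathbf{Z}_2^n$. For the table of cosets of $C$ in $\mathbf{Z}_2^n$, $r_1$ and $r_2$ are in the same coset of $C$ if and only if $H \cdot (r_1)^{\mathrm{tr}} = H \cdot (r_2)^{\mathrm{tr}}$.
Proof: If $r_1$ and $r_2$ are in the same coset, then $r_1 = x + c_1$ and $r_2 = x + c_2$, where $x$ is the coset leader, and $c_1$ and $c_2$ are the code words at the tops of the respective columns for $r_1$ and $r_2$. Then $H \cdot (r_1)^{\mathrm{tr}} = H \cdot (x + c_1)^{\mathrm{tr}} = H \cdot x^{\mathrm{tr}} + H \cdot c_1^{\mathrm{tr}} = H \cdot x^{\mathrm{tr}} + 0 = H \cdot x^{\mathrm{tr}}$ because $c_1$ is a code word. Likewise, $H \cdot (r_2)^{\mathrm{tr}} = H \cdot x^{\mathrm{tr}}$, so $r_1$, $r_2$ have the same syndrome. Conversely, $H \cdot (r_1)^{\mathrm{tr}} = H \cdot (r_2)^{\mathrm{tr}} \Rightarrow H \cdot (r_1 + r_2)^{\mathrm{tr}} = 0 \Rightarrow r_1 + r_2$ is a code word $c$. Hence $r_1 + r_2 = c$, so $r_1 = r_2 + c$ and $r_1 \in r_2 + C$. Since $r_2 \in r_2 + C$, we have $r_1$, $r_2$ in the same coset.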

In decoding received words, when Table 16.8 is used we must search through 64 elements to find a given received word. For $C \subseteq \mathbf{Z}_2^{12}$ there are 4096 strings, each with 12 bits. Such a searching process is tedious, so perhaps we should be thinking about having a computer do the searching. Presently it appears that this means storing the entire table: $6 \times 64 = 384$ bits of storage for Table 16.8; $12 \times 4096 = 49{,}152$ bits for $C \subseteq \mathbf{Z}_2^{12}$. We should like to improve this situation. Before things get better, however, they'll look worse as we enlarge Table 16.8, as shown in Table 16.9. This new table includes to the left of the coset leaders (the transposes of) the syndromes for each row.

Table 16.9 Decoding Table 16.8 with Syndromes

       000   000000   100110   010011   001101   110101   101011   011110   111000
       110   100000   000110   110011   101101   010101   001011   111110   011000
       011   010000   110110   000011   011101   100101   111011   001110   101000
       101   001000   101110   011011   000101   111101   100011   010110   110000
       100   000100   100010   010111   001001   110001   101111   011010   111100
       010   000010   100100   010001   001111   110111   101001   011100   111010
       001   000001   100111   010010   001100   110100   101010   011111   111001
       111   010100   110010   000111   011001   100001   111111   001010   101100

Now we can decode a received word $r$ by the following procedure.
    1) Compute the syndrome $H \cdot r^{\mathrm{tr}}$.
    2) Find the coset leader $x$ to the right of $H \cdot r^{\mathrm{tr}}$.
    3) Add $x$ to $r$ to get $c$. (The code word $c$ that we are seeking at the top of the column containing $r$ satisfies $c + x = r$, or $c = x + r$.)

Consequently, all that is needed from Table 16.9 are the first two columns, which will require $(3)(8) + (6)(8) = 72$ storage bits. With 18 more storage bits for $H$ we can store what we need for this decoding process, called decoding by coset leaders, in 90 storage bits, as opposed to the original estimate of 384 bits.
    Applying this procedure to $r = 110110$, we find the syndrome

\[
H \cdot r^{\mathrm{tr}} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}.
\]

Since 011 is to the left of the coset leader $x = 010000$, the code word $c = x + r = 010000 + 110110 = 100110$, from which we recapture the original message, 100.
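Decoding by coset leaders needs only $H$ and a table from syndromes to leaders. The Python sketch below builds such a table by visiting the words of $\mathbf{Z}_2^6$ in order of weight and keeping the first word seen for each syndrome; this tie-breaking rule and the names `leader_for` and `decode` are assumptions made for illustration.

```python
from itertools import product

# Parity-check matrix H of Example 16.25.
H = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

def syndrome(r):
    """Return H . r^tr over Z_2 as a tuple, for r given as a bit string."""
    bits = [int(b) for b in r]
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)

def add(a, b):
    """Componentwise sum modulo 2 of two bit strings."""
    return "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))

# For each syndrome keep the first word of minimum weight that produces it.
words = sorted(("".join(bits) for bits in product("01", repeat=6)),
               key=lambda w: w.count("1"))
leader_for = {}
for x in words:
    leader_for.setdefault(syndrome(x), x)

def decode(r):
    """Return (code word, message) for the received word r."""
    x = leader_for[syndrome(r)]  # steps 1 and 2
    c = add(x, r)                # step 3
    return c, c[:3]

print(decode("110110"))
```

Only the eight syndrome and leader pairs are stored, in line with the count of 72 bits given above; the full 64-entry table is never built.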

The code here is a group code where the minimum weight of the nonzero code words is 3, so we expected to be able to find a decoding scheme that corrected single errors. Here this is accomplished because the error patterns of weight 1 are all coset leaders. We cannot correct all double errors; only one error pattern of weight 2 is a coset leader. All error patterns of weight 1 or 2 would have to be coset leaders before our decoding scheme could correct both single and double errors in transmission.
    Unlike the situation in Example 16.25, where syndromes were also used for decoding, things here are a bit different. Once we have a complete table listing all of the cosets of $C$ in $\mathbf{Z}_2^n$, the process of decoding by coset leaders will give us an answer for all received words, not just for those that are code words or have syndromes that appear among the columns of the parity-check matrix $H$. However, we do realize that there is still a problem here because the last row of our table is not unique. Nonetheless, as our last result will affirm, this method provides a decoding scheme that is as good as any other.

THEOREM 16.17   When we are decoding by coset leaders, if $r \in \mathbf{Z}_2^n$ is a received word and $r$ is decoded as the code word $c^*$ (which we then decode to recapture the message), then $d(c^*, r) \le d(c, r)$ for all code words $c$.
Proof: Let $x$ be the coset leader for the coset containing $r$. Then $r = c^* + x$, or $r + c^* = x$, so $d(c^*, r) = \mathrm{wt}(r + c^*) = \mathrm{wt}(x)$. If $c$ is any code word, then $d(c, r) = \mathrm{wt}(c + r)$, and we have $c + r = c + (c^* + x) = (c + c^*) + x$. Since $C$ is a group code, it follows that $c + c^* \in C$ and so $c + r$ is in the coset $x + C$. Among the elements in the coset $x + C$, the coset leader $x$ is chosen to have minimum weight, so $\mathrm{wt}(c + r) \ge \mathrm{wt}(x)$. Consequently, $d(c^*, r) = \mathrm{wt}(x) \le \mathrm{wt}(c + r) = d(c, r)$.
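For the code of Example 16.25 the statement of Theorem 16.17 can be confirmed by brute force over all 64 received words. The sketch below works with the cosets themselves rather than with syndromes; the names and the tie-breaking among leaders of equal weight are illustrative assumptions.

```python
from itertools import product

# Code words of Example 16.25.
C = ["000000", "100110", "010011", "001101",
     "110101", "101011", "011110", "111000"]

def add(a, b):
    """Componentwise sum modulo 2 of two bit strings."""
    return "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))

def distance(a, b):
    """Hamming distance between two bit strings of equal length."""
    return sum(x != y for x, y in zip(a, b))

def coset(x):
    """The coset x + C, as a frozenset so it can serve as a dictionary key."""
    return frozenset(add(x, c) for c in C)

# Visit words by increasing weight; the first word met in a coset leads it.
words = sorted(("".join(bits) for bits in product("01", repeat=6)),
               key=lambda w: w.count("1"))
leader_of = {}
for x in words:
    leader_of.setdefault(coset(x), x)

def decode(r):
    """Decode r as c* = r + x, where x is the leader of the coset of r."""
    return add(r, leader_of[coset(r)])

# d(c*, r) should equal the smallest distance from r to any code word.
nearest_everywhere = all(
    distance(decode(r), r) == min(distance(c, r) for c in C) for r in words
)
print(nearest_everywhere)
```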

16.9
            Hamming Matrices
We found the parity-check matrix $H$ helpful in correcting single errors in transmission when (a) $H$ had no column of 0's and (b) no two columns of $H$ were the same. For the matrix

\[
H = \begin{bmatrix} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}
\]

we find that $H$ satisfies these two conditions and that for the number of rows ($r = 3$) in $H$ we have the maximum number of columns possible. If an additional column is added, $H$ will no longer be useful for correcting single errors.
    The generator matrix $G$ associated with $H$ is

\[
G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}.
\]
Consequently we have a (7, 4) group code. The encoding function $E\colon \mathbf{Z}_2^4 \to \mathbf{Z}_2^7$ encodes four-bit messages into seven-bit code words. We realize that because $H$ is determined by three parity-check equations, we have now maximized the number of bits we can have in the messages (under our present coding scheme). In addition, the columns of $H$, read from top to bottom, are the binary equivalents of the integers from 1 to 7.
    In general, if we start with $r$ parity-check equations, then the parity-check matrix $H$ can have as many as $2^r - 1$ columns and still be used to correct single errors. Under these circumstances $H = [B \mid I_r]$, where $B$ is an $r \times (2^r - 1 - r)$ matrix, and $G = [I_m \mid B^{\mathrm{tr}}]$ with $m = 2^r - 1 - r$. The parity-check matrix $H$ associated with a $(2^r - 1, 2^r - 1 - r)$ group code in this way is called a Hamming matrix, and the code is referred to as a Hamming code.
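The general construction can be sketched in Python as follows. The sketch assumes one particular ordering of the columns of $B$ (decreasing when read as binary numbers), which is only one of the possible column permutations; `hamming_matrices` and `is_orthogonal` are hypothetical helper names.

```python
from itertools import product

def hamming_matrices(r):
    """Return (G, H) with H = [B | I_r] and G = [I_m | B^tr], m = 2^r - 1 - r."""
    nonzero = [c for c in product([0, 1], repeat=r) if any(c)]
    # Columns of B: the nonzero r-tuples that are not columns of I_r.
    b_cols = sorted((c for c in nonzero if sum(c) >= 2), reverse=True)
    i_cols = [tuple(1 if i == j else 0 for i in range(r)) for j in range(r)]
    cols = b_cols + i_cols
    H = [[c[i] for c in cols] for i in range(r)]
    m = len(b_cols)
    # Row k of G is the k-th unit vector followed by column k of B.
    G = [[1 if j == k else 0 for j in range(m)] + list(b_cols[k])
         for k in range(m)]
    return G, H

def is_orthogonal(G, H):
    """Check that every row of G satisfies all parity-check equations of H."""
    return all(sum(g * h for g, h in zip(g_row, h_row)) % 2 == 0
               for g_row in G for h_row in H)

G4, H4 = hamming_matrices(4)
for row in H4:
    print("".join(map(str, row)))
print(len(G4), "message bits,", len(H4[0]), "code word bits")
```

For $r = 3$ this ordering gives a column permutation of the matrix $H$ shown at the start of the section rather than that matrix itself; the resulting code is equivalent.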

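EXAMPLE 16.27   If $r = 4$, then $2^r - 1 = 15$ and $2^r - 1 - r = 11$. The one (up to a permutation of the columns) possible Hamming matrix $H$ for $r = 4$ is

\[
H = \left[\begin{array}{ccccccccccc|cccc}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1
\end{array}\right].
\]

Once again, the columns of $H$ contain the binary equivalents of the integers from 1 to 15 ($= 2^4 - 1$).
    This matrix $H$ is the parity-check matrix of a Hamming (15, 11) code whose rate is 11/15.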

With regard to the rate of these Hamming codes, for all $r \ge 2$, the rate $m/n$ of such a code is given by $m/n = (2^r - 1 - r)/(2^r - 1) = 1 - [r/(2^r - 1)]$. As $r$ increases, $r/(2^r - 1)$ goes to 0 and the rate approaches 1.
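A short computation shows how quickly the rate grows. This Python sketch uses exact fractions; `hamming_rate` is an illustrative name, and the range of $r$ is an arbitrary choice.

```python
from fractions import Fraction

def hamming_rate(r):
    """Rate m/n of the Hamming code with r parity checks (r >= 2)."""
    n = 2**r - 1
    return Fraction(n - r, n)

rates = {r: hamming_rate(r) for r in range(2, 11)}
for r, rate in rates.items():
    print(r, rate, float(rate))
```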

We close our discussion on coding theory with one final observation. In Section 16.7 we presented $G$ (and $H$) in what is called the systematic form. Other arrangements of the rows and columns of these matrices are also possible, and these yield equivalent codes. (More on this can be found in the text by L. L. Dornhoff and F. E. Hohn [4].) We mention this here because it is common practice to list the columns in a Hamming matrix of $r$ rows so that the binary representations of the integers from 1 to $2^r - 1$ appear as the columns of $H$ are read from left to right. For the Hamming (7, 4) code, the matrix $H$ mentioned at the start of this section would take the (equivalent) form

\[
H_1 = \begin{bmatrix}
0 & 0 & 0 & 1 & 1 & 1 & 1\\
0 & 1 & 1 & 0 & 0 & 1 & 1\\
1 & 0 & 1 & 0 & 1 & 0 & 1
\end{bmatrix}
\]

Here the identity appears in the first, second, and fourth columns instead of in the last three. Consequently, we would use these components for the parity checks and find that if we send the message $w = w_1 w_2 w_3 w_4$, then the corresponding code word $E(w)$ is $c_1 c_2 w_1 c_3 w_2 w_3 w_4$, where

\[
\begin{aligned}
c_1 &= w_1 + w_2 + w_4\\
c_2 &= w_1 + w_3 + w_4\\
c_3 &= w_2 + w_3 + w_4,
\end{aligned}
\]

so that $H_1 \cdot (E(w))^{\mathrm{tr}} = 0$.
In particular, if we send the message $w = w_1 w_2 w_3 w_4 = 1010$, the corresponding code word would be $E(w) = c = c_1 c_2 w_1 c_3 w_2 w_3 w_4 = c_1 c_2 1 c_3 0 1 0$, where $c_1 = w_1 + w_2 + w_4 = 1 + 0 + 0 = 1$, $c_2 = w_1 + w_3 + w_4 = 1 + 1 + 0 = 0$, and $c_3 = w_2 + w_3 + w_4 = 0 + 1 + 0 = 1$. Then $c = 1011010$ and $H_1 \cdot (E(w))^{\mathrm{tr}} = H_1 \cdot (E(1010))^{\mathrm{tr}} = H_1 \cdot (1011010)^{\mathrm{tr}} = 0$. (Verify this!) So if $c = 1011010$ is sent but $r = 1001010$ is received, we have $H_1 \cdot r^{\mathrm{tr}} = H_1 \cdot (1001010)^{\mathrm{tr}} = (011)^{\mathrm{tr}}$. (Verify this as well!) Since 011 is the binary representation for 3, we know that the error is in position 3, and this time we did not have to examine the columns of $H_1$. So using a parity-check matrix of the form $H_1$ simplifies syndrome decoding. In general, for $c = c_1 c_2 w_1 c_3 w_2 w_3 w_4$, let $r = c + e$, where $e$ is an error pattern of weight 1, and suppose that the 1 in $e$ is in position $i$, where $1 \le i \le 7$. Then the syndrome $H_1 \cdot r^{\mathrm{tr}}$ provides the binary representation for $i$ and we can determine $c$ without examining the columns of $H_1$. From the third, fifth, sixth, and seventh components of $c$ we can then recapture the original message $w$.
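The encoding and syndrome-decoding steps for $H_1$ can be traced with a short Python sketch. It assumes at most one error per received word; the names `encode`, `syndrome`, and `decode` are hypothetical helpers introduced only for this illustration.

```python
# Parity-check matrix H_1: column i is the binary representation of i.
H1 = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def encode(w):
    """Map w = w1 w2 w3 w4 to the code word c1 c2 w1 c3 w2 w3 w4."""
    w1, w2, w3, w4 = w
    c1 = (w1 + w2 + w4) % 2
    c2 = (w1 + w3 + w4) % 2
    c3 = (w2 + w3 + w4) % 2
    return [c1, c2, w1, c3, w2, w3, w4]

def syndrome(r):
    """Compute H_1 * r^tr over Z_2."""
    return [sum(h * x for h, x in zip(row, r)) % 2 for row in H1]

def decode(r):
    """Correct a single error (if any) and return the message bits."""
    s = syndrome(r)
    position = 4 * s[0] + 2 * s[1] + s[2]   # syndrome read as a binary number
    corrected = list(r)
    if position:                            # a zero syndrome means no correction
        corrected[position - 1] ^= 1
    # The message sits in components 3, 5, 6, and 7.
    return [corrected[2], corrected[4], corrected[5], corrected[6]]

c = encode([1, 0, 1, 0])
print(c, syndrome(c))
print(syndrome([1, 0, 0, 1, 0, 1, 0]), decode([1, 0, 0, 1, 0, 1, 0]))
```

With the received word 1001010 the syndrome is 011, so the sketch flips position 3 and recovers the message 1010, matching the computation above.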

EXERCISES 16.9

1. Let $E\colon \mathbf{Z}_2^8 \to \mathbf{Z}_2^{12}$ be the encoding function for a code $C$. How many calculations are needed to find the minimum distance between code words? How many calculations are needed if $E$ is a group homomorphism?

2. a) Use Table 16.9 to decode the following received words.

     000011   100011   111110   100001
     001100   011110   001111   111100

   b) Do any of the results in part (a) change if a different set of coset leaders is used?

3. a) Construct a decoding table (with syndromes) for the group code given by the generator matrix

\[
G = \begin{bmatrix}
1 & 0 & 1 & 1 & 0\\
0 & 1 & 0 & 1 & 1
\end{bmatrix}
\]

   b) Use the table from part (a) to decode the following received words.

     11110   11101   11011   10100
     10011   10101   11111   01100

   c) Does this code correct single errors in transmission?

4. Let

\[
H = \begin{bmatrix}
1 & 1 & 0 & 1 & 1 & 0 & 0\\
1 & 0 & 1 & 1 & 0 & 1 & 0\\
0 & 1 & 1 & 1 & 0 & 0 & 1
\end{bmatrix}
\]

   be the parity-check matrix for a Hamming (7, 4) code.

   a) Encode the following messages:

     1000   1100   1011   1110   1001   1111.

   b) Decode the following received words:

     1100001   1110111   0010001   0011100.

   c) Construct a decoding table consisting of the syndromes and coset leaders for this code.

   d) Use the result in part (c) to decode the received words given in part (b).

5. a) What are the dimensions of the generator matrix for the Hamming (63, 57) code? What are the dimensions for the associated parity-check matrix $H$?

   b) What is the rate of this code?

6. Compare the rates of the Hamming (7, 4) code and the (3, 1) triple-repetition code.

7. a) Let $p = 0.01$ be the probability of incorrect transmission for a binary symmetric channel. If the message 1011 is sent via the Hamming (7, 4) code, what is the probability of correct decoding?

   b) Answer part (a) for a 20-bit message sent in five blocks of length 4.

16.10
           Counting and Equivalence:
                Burnside’s Theorem
                               In this section and the next two we shall develop a counting technique known as Polya’s
                               Method of Enumeration. Our development will not be very rigorous. Often we shall only
                               state the general results of the theory as seen in the solution of a specific problem. Our first
                               encounter with the type of problem to which this counting technique applies is presented
                               in the following example.

EXAMPLE 16.28

We have a set of sticks, all of the same length and color, and a second set of round plastic disks. Each disk contains two holes, as shown in Fig. 16.4, into which the sticks can be inserted in order to form different shapes, such as a square. (See Fig. 16.5.) If each disk is either red or white, how many distinct squares can we form?

Figure 16.4

Figure 16.5 (the 16 configurations $C_1, C_2, \ldots, C_{16}$, grouped into the classes $c\ell(1), c\ell(2), \ldots, c\ell(6)$)

If the square is considered stationary, then the four disks are located at four distinct locations; a red or white disk is used at each location. Thus there are $2^4 = 16$ different configurations, as shown in Fig. 16.5, where a dark circle indicates a red disk. The configurations have been split into six classes, $c\ell(1), c\ell(2), \ldots, c\ell(6)$, according to the number and relative location of the red disks.
                                   Now suppose that the square is not fixed but can be moved about in space. Unless the ver-
                               tices (disks) are marked somehow, certain configurations in Fig. 16.5 are indistinguishable
                               when we move them about.
                                   To place these notions in a more mathematical setting, we use the nonabelian group
                               of three-dimensional rigid motions of a square to define an equivalence relation on the
                               configurations in Fig. 16.5. Since this group will be used throughout this section and the
                               next two sections, we now give a detailed description of its elements.
    In Fig. 16.6 we have the group $G = \{\pi_0, \pi_1, \pi_2, \pi_3, r_1, r_2, r_3, r_4\}$ for the rigid motions of the square in part (a), where we have labeled the vertices with 1, 2, 3, and 4. Parts (b) through (i) of the figure show how each element of $G$ is applied. We have expressed each group element as a permutation of {1, 2, 3, 4} and in a new form called a product of disjoint cycles. For example, in part (b) we find $\pi_1 = (1234)$. The cycle (1234) indicates that if we start with the square in part (a), after applying $\pi_1$ we find that 1 has moved to the position originally occupied by 2, 2 to that of 3, 3 to that of 4, and 4 to that of 1. In general, if $xy$ appears in a cycle, then $x$ moves to the position originally occupied by $y$. Also, for a cycle where $x$ and $y$ appear as $(x \ldots y)$, $y$ moves to the position originally occupied by $x$ when the motion described by this cycle is applied. Note that (1234) = (2341) = (3412) = (4123). We say that each of these cycles has length 4, the number of elements in the cycle. In the case of $r_1$ in part (f) of the figure, starting with 1 we find that $r_1$ sends 1 to 4, so we have (14 ...) as the start of our first cycle in this decomposition of $r_1$. However, here $r_1$ sends 4 to 1, so we have completed a portion, namely (14), of the complete decomposition. We then select a vertex that has not yet appeared, for example vertex 2. Since $r_1$ sends 2 to 3 and 3, in turn, to 2, we get a second cycle (23). This exhausts all vertices and so $(14)(23) = r_1$, where the cycles (14) and (23) have no vertex in common. Here (14)(23) = (23)(14) = (23)(41) = (32)(41) all provide a representation of $r_1$ as a product of disjoint cycles, each of length 2. Last, for the group element $r_3 = (13)(2)(4)$, the cycle (2) indicates that 2 is fixed, or invariant, under the permutation $r_3$. When the number of vertices involved is known, the permutation $r_3$ may also be written as $r_3 = (13)$, where the missing elements are understood to be fixed. However, we shall write all of the cycles in our decompositions, for this will be useful later in our discussion.

Figure 16.6 (the rigid motions of the square; in the starting position the vertices 1, 2 are at the top and 4, 3 at the bottom)

  (a) Starting position of the square
  (b) Clockwise rotation through 90°: $\pi_1 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 2 & 3 & 4 & 1 \end{pmatrix} = (1234)$
  (c) Clockwise rotation through 180°: $\pi_2 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 3 & 4 & 1 & 2 \end{pmatrix} = (13)(24)$
  (d) Clockwise rotation through 270°: $\pi_3 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 4 & 1 & 2 & 3 \end{pmatrix} = (1432)$
  (e) Clockwise rotation through 360°: $\pi_0 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 1 & 2 & 3 & 4 \end{pmatrix} = (1)(2)(3)(4)$
  (f) Reflection in the horizontal: $r_1 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 4 & 3 & 2 & 1 \end{pmatrix} = (14)(23)$
  (g) Reflection in the vertical: $r_2 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 2 & 1 & 4 & 3 \end{pmatrix} = (12)(34)$
  (h) Reflection in the diagonal through vertices 2 and 4: $r_3 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 3 & 2 & 1 & 4 \end{pmatrix} = (13)(2)(4)$
  (i) Reflection in the diagonal through vertices 1 and 3: $r_4 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 1 & 4 & 3 & 2 \end{pmatrix} = (1)(24)(3)$
    Before continuing with the main discussion concerning the disks and sticks, let us examine some further results on disjoint cycles.

    In the group $S_6$ of all permutations of {1, 2, 3, 4, 5, 6} let

\[
\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6\\ 2 & 3 & 1 & 4 & 6 & 5 \end{pmatrix}.
\]

As a product of disjoint cycles,

\[
\pi = (123)(4)(56) = (56)(4)(123) = (4)(231)(65).
\]

    If $\sigma \in S_6$, with $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6\\ 2 & 4 & 5 & 1 & 6 & 3 \end{pmatrix}$, then

\[
\sigma = (124)(356) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6\\ 2 & 4 & 3 & 1 & 5 & 6 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6\\ 1 & 2 & 5 & 4 & 6 & 3 \end{pmatrix},
\]

so each cycle can be thought of as an element of $S_6$.

    Finally, if $\alpha = (124)(3)(56)$ and $\beta = (13)(245)(6)$ are elements of $S_6$, then

\[
\alpha\beta = (124)(3)(56)(13)(245)(6) = (143)(256),
\]

whereas

\[
\beta\alpha = (13)(245)(6)(124)(3)(56) = (132)(465).
\]
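The cycle computations above can be reproduced mechanically. The following Python sketch stores a permutation as a dictionary from each element to its image; the helper names `cycles`, `from_cycles`, and `compose` are hypothetical, and `compose(first, second)` follows the convention used here, in which the left factor of a product is applied first.

```python
def cycles(perm):
    """Decompose a permutation (dict: element -> image) into disjoint cycles,
    keeping the cycles of length 1."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:          # follow start -> perm[start] -> ...
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result

def from_cycles(cycle_list, n):
    """Build the permutation of {1, ..., n} given by disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycle_list:
        for i, x in enumerate(cycle):
            perm[x] = cycle[(i + 1) % len(cycle)]
    return perm

def compose(first, second):
    """Product in which `first` is applied first, then `second`."""
    return {x: second[first[x]] for x in first}

pi = {1: 2, 2: 3, 3: 1, 4: 4, 5: 6, 6: 5}
alpha = from_cycles([(1, 2, 4), (3,), (5, 6)], 6)
beta = from_cycles([(1, 3), (2, 4, 5), (6,)], 6)

print(cycles(pi))
print(cycles(compose(alpha, beta)))
print(cycles(compose(beta, alpha)))
```

The two products come out as (143)(256) and (132)(465), which illustrates that $S_6$ is not abelian.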

    Returning to the 16 configurations, or colorings, in Fig. 16.5, we now examine how each element in the group $G$, in Fig. 16.6, acts upon these configurations. For example, $\pi_1 = \begin{pmatrix} 1 & 2 & 3 & 4\\ 2 & 3 & 4 & 1 \end{pmatrix}$ permutes the numbers {1, 2, 3, 4} according to a 90° clockwise rotation for the square in Fig. 16.6(a), yielding the result in Fig. 16.6(b). How does such a rotation act on $S = \{C_1, C_2, \ldots, C_{16}\}$, our set of colorings? We use $\pi_1^*$ to distinguish between the 90° clockwise rotation for {1, 2, 3, 4} and the same rotation when applied to $S = \{C_1, C_2, \ldots, C_{16}\}$. We find that

\[
\pi_1^* = \left(\begin{array}{*{16}{c}}
C_1 & C_2 & C_3 & C_4 & C_5 & C_6 & C_7 & C_8 & C_9 & C_{10} & C_{11} & C_{12} & C_{13} & C_{14} & C_{15} & C_{16}\\
C_1 & C_3 & C_4 & C_5 & C_2 & C_7 & C_8 & C_9 & C_6 & C_{11} & C_{10} & C_{13} & C_{14} & C_{15} & C_{12} & C_{16}
\end{array}\right).
\]

As a product of disjoint cycles,

\[
\pi_1^* = (C_1)(C_2 C_3 C_4 C_5)(C_6 C_7 C_8 C_9)(C_{10} C_{11})(C_{12} C_{13} C_{14} C_{15})(C_{16}).
\]

We note that under the action of $\pi_1^*$, no configuration is changed into one that is in another class.

    As a second example, consider the reflection $r_3$ in Fig. 16.6(h). The action of this rigid motion on $S$ is given by

\[
\begin{aligned}
r_3^* &= \left(\begin{array}{*{16}{c}}
C_1 & C_2 & C_3 & C_4 & C_5 & C_6 & C_7 & C_8 & C_9 & C_{10} & C_{11} & C_{12} & C_{13} & C_{14} & C_{15} & C_{16}\\
C_1 & C_2 & C_5 & C_4 & C_3 & C_7 & C_6 & C_9 & C_8 & C_{10} & C_{11} & C_{14} & C_{13} & C_{12} & C_{15} & C_{16}
\end{array}\right)\\
&= (C_1)(C_2)(C_3 C_5)(C_4)(C_6 C_7)(C_8 C_9)(C_{10})(C_{11})(C_{12} C_{14})(C_{13})(C_{15})(C_{16}).
\end{aligned}
\]

Once again, no configuration is taken by $r_3^*$ into one that is outside the class that it was in originally.
    Using the idea of the group $G$ acting on the set $S$, we define a relation $\mathcal{R}$ on $S$ as follows. For colorings $C_i, C_j \in S$, where $1 \le i, j \le 16$, we write $C_i \,\mathcal{R}\, C_j$ if there is a permutation $\sigma \in G$ such that $\sigma^*(C_i) = C_j$. That is, as $\sigma^*$ acts on the 16 configurations in $S$, $C_i$ is transformed into $C_j$. This relation $\mathcal{R}$ is an equivalence relation, as we now verify.

  a) (Reflexive Property) For all $C_i \in S$, where $1 \le i \le 16$, it follows that $C_i \,\mathcal{R}\, C_i$ because $G$ contains the identity permutation. [$\pi_0^*(C_i) = C_i$ for all $1 \le i \le 16$.]

  b) (Symmetric Property) If $C_i \,\mathcal{R}\, C_j$ for $C_i, C_j \in S$, then $\sigma^*(C_i) = C_j$ for some $\sigma \in G$. $G$ is a group, so $\sigma^{-1} \in G$, and we find that $(\sigma^*)^{-1} = (\sigma^{-1})^*$. (Verify this for two choices of $\sigma \in G$.) Hence $C_i = (\sigma^{-1})^*(C_j)$, and $C_j \,\mathcal{R}\, C_i$.

  c) (Transitive Property) Let $C_i, C_j, C_k \in S$ with $C_i \,\mathcal{R}\, C_j$ and $C_j \,\mathcal{R}\, C_k$. Then $C_j = \sigma^*(C_i)$ and $C_k = \tau^*(C_j)$, for some $\sigma, \tau \in G$. By closure in $G$, $\sigma\tau \in G$, and we find that $(\sigma\tau)^* = \sigma^*\tau^*$, where $\sigma$ is applied first in $\sigma\tau$ and $\sigma^*$ first in $\sigma^*\tau^*$. (Verify this for two specific permutations $\sigma, \tau \in G$.) Then $C_k = (\sigma\tau)^*(C_i)$ and $\mathcal{R}$ is transitive. [The reader may have noticed that $C_k = \tau^*(C_j) = \tau^*(\sigma^*(C_i))$ and felt that we should have written $(\sigma\tau)^* = \tau^*\sigma^*$. Once again, there has been a change in the notation for the composite function as we first defined it in Chapter 5. Here we write $\sigma^*\tau^*$ for $(\sigma\tau)^*$, and $\sigma^*$ is applied first.]

    Since $\mathcal{R}$ is an equivalence relation on $S$, $\mathcal{R}$ partitions $S$ into equivalence classes, which are precisely the classes $c\ell(1), c\ell(2), \ldots, c\ell(6)$ of Fig. 16.5. Consequently, there are six nonequivalent configurations under the group action. So among the original 16 colorings only 6 are really distinct.

    What has happened in this example generalizes as follows. With $S$ a set of configurations, let $G$ be a group (of permutations) that acts on $S$. If the relation $\mathcal{R}$ is defined on $S$ by $x \,\mathcal{R}\, y$ if $\pi^*(x) = y$, for some $\pi \in G$, then $\mathcal{R}$ is an equivalence relation.

With only red and white disks to connect the sticks, the answer to this example could
                  have been determined from the results in Fig. 16.5. However, we developed quite a bit of
                  mathematical overkill to answer the question. Referring to S as the set of 2-colorings of
                  the vertices of a square, we start to wonder about the role of 2 and seek the number of
                  nonequivalent configurations if the disks come in three or more colors.
    In addition, we might notice that the function $f(r, w) = r^4 + r^3 w + 2r^2 w^2 + r w^3 + w^4$ is the generating function (of two variables) for the number of nonequivalent configurations from $S$. Here the coefficient of $r^i w^{4-i}$, for $0 \le i \le 4$, yields the number of distinct 2-colorings that have $i$ red disks and $(4 - i)$ white ones. The coefficient of $r^2 w^2$ is 2 because of the two equivalence classes $c\ell(3)$ and $c\ell(4)$. Finally, $f(1, 1) = 6$, the number of equivalence classes. This generating function $f(r, w)$ is called the pattern inventory for the configurations. We shall examine it in more detail in the next two sections.
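The equivalence classes and the coefficients of the pattern inventory can be recovered by brute force. In the Python sketch below a coloring is a 4-tuple of colors at positions 1 through 4, and each motion of Fig. 16.6 is stored as the tuple of images of 1, 2, 3, 4. The names `act` and `orbits` are hypothetical, and the tuple encoding is an assumption made only for this illustration.

```python
from itertools import product
from collections import Counter

# The eight rigid motions of the square, as the images of (1, 2, 3, 4).
G = {
    "pi0": (1, 2, 3, 4), "pi1": (2, 3, 4, 1),
    "pi2": (3, 4, 1, 2), "pi3": (4, 1, 2, 3),
    "r1": (4, 3, 2, 1), "r2": (2, 1, 4, 3),
    "r3": (3, 2, 1, 4), "r4": (1, 4, 3, 2),
}

def act(perm, coloring):
    """Move the disk at position x to position perm(x)."""
    new = [None] * 4
    for x in range(4):
        new[perm[x] - 1] = coloring[x]
    return tuple(new)

def orbits(colors):
    """Partition all colorings of the four vertices into equivalence classes."""
    remaining = set(product(colors, repeat=4))
    classes = []
    while remaining:
        c = remaining.pop()
        orbit = {act(p, c) for p in G.values()}   # G is a group, so this is the whole class
        remaining -= orbit
        classes.append(orbit)
    return classes

classes = orbits("rw")
# Number of classes having i red disks, for i = 0, ..., 4.
inventory = Counter(next(iter(o)).count("r") for o in classes)
print(len(classes), sorted(inventory.items()))
```

The six classes found this way, with two of them containing exactly two red disks, agree with the coefficients 1, 1, 2, 1, 1 of $f(r, w)$.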

For now we record an extended version of our present results in the following theorem.
                  (A proof of this result is given on pages 136-137 of C. L. Liu [17].)

THEOREM 16.18     Burnside’s Theorem. Let $S$ be a set of configurations on which a finite group $G$ of permutations acts. The number of equivalence classes into which $S$ is partitioned by the action of $G$ is then given by

\[
\frac{1}{|G|} \sum_{\pi \in G} \psi(\pi^*),
\]

where $\psi(\pi^*)$ is the number of configurations in $S$ fixed under $\pi^*$.

To better accept the validity of this theorem, we first examine two examples where we
                  already know the answers.

EXAMPLE 16.29

In Example 16.28 we find that $\psi(\pi_1^*) = 2$ because only $C_1$ and $C_{16}$ are fixed, or invariant, under $\pi_1^*$. For $r_3 \in G$, however, $\psi(r_3^*) = 8$ because $C_1, C_2, C_4, C_{10}, C_{11}, C_{13}, C_{15}$, and $C_{16}$ remain fixed under this group action. In like manner $\psi(\pi_2^*) = 4$, $\psi(\pi_3^*) = 2$, $\psi(\pi_0^*) = 16$, $\psi(r_1^*) = \psi(r_2^*) = 4$, and $\psi(r_4^*) = 8$. With $|G| = 8$, Burnside’s Theorem implies that the number of equivalence classes, or nonequivalent configurations, is

\[
(1/8)(16 + 2 + 4 + 2 + 4 + 4 + 8 + 8) = (1/8)(48) = 6,
\]

the original answer.
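The values of $\psi$ in this example can be confirmed by direct counting. The Python sketch below lists the motions in the order $\pi_0, \pi_1, \pi_2, \pi_3, r_1, r_2, r_3, r_4$ and counts, for each, the colorings it leaves unchanged. The tuple encoding and the helper names `act` and `fixed_count` are assumptions made for illustration.

```python
from itertools import product

# Images of (1, 2, 3, 4) under pi0, pi1, pi2, pi3, r1, r2, r3, r4.
G = [
    (1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3),
    (4, 3, 2, 1), (2, 1, 4, 3), (3, 2, 1, 4), (1, 4, 3, 2),
]

def act(perm, coloring):
    """Move the disk at position x to position perm(x)."""
    new = [None] * 4
    for x in range(4):
        new[perm[x] - 1] = coloring[x]
    return tuple(new)

def fixed_count(perm, colors):
    """Number of colorings of the four vertices left unchanged by perm."""
    return sum(1 for c in product(colors, repeat=4) if act(perm, c) == c)

psi = [fixed_count(p, "rw") for p in G]
print(psi, sum(psi), sum(psi) // len(G))
```

The list of fixed-point counts sums to 48, and dividing by $|G| = 8$ gives the six classes found earlier.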

EXAMPLE 16.30

In how many ways can six people be arranged around a circular table if two arrangements are considered equivalent when one can be obtained from the other by means of a clockwise rotation through $i \cdot 60°$, for $0 \le i \le 5$?

    Here the six distinct people are to be placed in six chairs located at a table, as shown in Fig. 16.7. Our permutation group $G$ consists of the clockwise rotations $\pi_i$ through $i \cdot 60°$, where $0 \le i \le 5$. Here reflections are not meaningful. The situation is two-dimensional, for we can rotate the circle (representing the table) only in the plane; the circle never lifts off the plane. The total number of possible configurations is 6!. We find that $\psi(\pi_0^*) = 6!$ and that $\psi(\pi_i^*) = 0$, for $1 \le i \le 5$. (It’s impossible to move different people and simultaneously have them stay in a fixed location.)

Figure 16.7 (six chairs around a circular table)

    Consequently, the total number of nonequivalent seating arrangements is

\[
\left(\frac{1}{6}\right) \sum_{\pi \in G} \psi(\pi^*) = \left(\frac{1}{6}\right)(6! + 0 + 0 + 0 + 0 + 0) = 5!,
\]

as we found in Example 1.16 of Chapter 1.
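A brute-force check of this count is sketched below in Python. A seating is modeled as a tuple listing the people in chair order, and a rotation through $i \cdot 60°$ as a cyclic shift by $i$ places. The names `rotate`, `rotation_classes`, and `fixed_counts` are hypothetical, and the shift direction is an arbitrary modeling choice that does not affect the counts.

```python
from itertools import permutations
from math import factorial

def rotate(seating, i):
    """Cyclically shift a seating by i chairs."""
    return seating[i:] + seating[:i]

def rotation_classes(n):
    """Count seatings of n distinct people up to rotation."""
    representatives = set()
    for seating in permutations(range(n)):
        # Canonical representative: rotate until person 0 comes first.
        representatives.add(rotate(seating, seating.index(0)))
    return len(representatives)

def fixed_counts(n):
    """psi for each of the n rotations: seatings unchanged by that rotation."""
    return [sum(1 for s in permutations(range(n)) if rotate(s, i) == s)
            for i in range(n)]

print(fixed_counts(6), rotation_classes(6), factorial(5))
```

Only the identity rotation fixes any seating, so both the direct count and Burnside’s formula give $6!/6 = 5!$.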

We now examine a situation where the power of this theorem is made apparent.

EXAMPLE 16.31

In how many ways can the vertices of a square be 3-colored, if the square can be moved about in three dimensions?

    Now we have the sticks of Example 16.28, along with red, white, and blue disks. Considering the group in Fig. 16.6, we find the following:

    $\psi(\pi_0^*) = 3^4$, because the identity fixes all 81 configurations in the set $S$ of possible configurations.

    $\psi(\pi_1^*) = \psi(\pi_3^*) = 3$, for each of $\pi_1^*$, $\pi_3^*$ leaves invariant only those configurations with all vertices the same color.

    $\psi(\pi_2^*) = 9$, for $\pi_2^*$ can fix only those configurations where the opposite (diagonally) vertices have the same color. Consider a square like the one shown in Fig. 16.8. There are three choices for placing a colored disk at vertex 1 and then one choice for matching it at vertex 3. Likewise, there are three choices for colors at vertex 2 and then one for vertex 4. Consequently, there are nine configurations invariant under $\pi_2^*$.

    $\psi(r_1^*) = \psi(r_2^*) = 9$. In the case of $r_1^*$, for the square shown in Fig. 16.8 we have three choices for coloring each of the vertices 1 and 2, and then we must match the color of vertex 4 with the color of vertex 1, and the color of vertex 3 with that of vertex 2.

    Finally, $\psi(r_3^*) = \psi(r_4^*) = 27$. For $r_3^*$, we have nine choices for coloring the two vertices at 2 and 4, and three choices for vertex 1. Then there is only one choice for vertex 3 because we must match the color of vertex 1.

Figure 16.8 (a square with vertices labeled 1, 2, 3, 4)

By Burnside’s Theorem, the number of nonequivalent configurations is

\[
(1/8)(3^4 + 3 + 3^2 + 3 + 3^2 + 3^2 + 3^3 + 3^3) = 21.
\]
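The reasoning in this example amounts to observing that a coloring is fixed by a motion exactly when every cycle of that motion is colored with a single color, so a motion with $c$ cycles fixes $k^c$ of the $k$-colorings. The Python sketch below uses that observation; the names `cycle_count` and `colorings` are hypothetical, and the tuple encoding of the motions is an assumption for illustration.

```python
# Images of (1, 2, 3, 4) under pi0, pi1, pi2, pi3, r1, r2, r3, r4.
G = [
    (1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3),
    (4, 3, 2, 1), (2, 1, 4, 3), (3, 2, 1, 4), (1, 4, 3, 2),
]

def cycle_count(perm):
    """Number of disjoint cycles of perm, counting cycles of length 1."""
    seen, count = set(), 0
    for start in range(1, len(perm) + 1):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:      # walk around the cycle containing start
                seen.add(x)
                x = perm[x - 1]
    return count

def colorings(k):
    """Burnside count of nonequivalent k-colorings of the square's vertices."""
    total = sum(k ** cycle_count(p) for p in G)
    return total // len(G)

print([k ** cycle_count(p) for p in G for k in (3,)])
print(colorings(2), colorings(3), colorings(4))
```

For $k = 3$ the fixed-point counts are 81, 3, 9, 3, 9, 9, 27, 27, matching the values of $\psi$ computed above.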

EXERCISES 16.10

1. Consider the configurations shown in Fig. 16.5.
    a) Determine 73*, 773°, r*, and r>.
    b) Verify that (7')* = (jt)! and (ry ')* = OF)!
    c) Verify that (yry)* = aftr and (rarq)* = ayrf.

2. Express each of the following elements of $S_7$ as a product of disjoint cycles.

\[
\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7\\ 2 & 4 & 6 & 7 & 1 & 5 & 3 \end{pmatrix}
\qquad
\beta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7\\ 3 & 6 & 5 & 2 & 1 & 7 & 4 \end{pmatrix}
\]

\[
\gamma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7\\ 2 & 3 & 1 & 7 & 5 & 4 & 6 \end{pmatrix}
\qquad
\delta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7\\ 4 & 2 & 7 & 1 & 3 & 6 & 5 \end{pmatrix}
\]

3. a) Determine the order of each of the elements in Exercise 2.
    b) State a general result about the order of an element in $S_n$ in terms of the lengths of the cycles in its decomposition as a product of disjoint cycles.

4. a) Determine the number of distinct ways one can color the vertices of an equilateral triangle using the colors red and white, if the triangle is free to move in three dimensions.
    b) Answer part (a) if the color blue is also available.

5. Answer the questions in Exercise 4 for a regular pentagon.

6. a) How many distinct ways are there to paint the edges of a square with three different colors?
    b) Answer part (a) for the edges of a regular pentagon.

7. We make a child’s bracelet by symmetrically placing four beads about a circular wire. The colors of the beads are red, white, blue, and green, and there are at least four beads of each color. (a) How many distinct bracelets can we make in this way, if the bracelets can be rotated but not reflected? (b) Answer part (a) if the bracelets can be rotated and reflected.

8. A baton is painted with three cylindrical bands of color (not necessarily distinct), with each band of the same length.
    a) How many distinct paintings can be made if there are three colors of paint available? How many for four colors?
    b) Answer part (a) for batons with four cylindrical bands.
    c) Answer part (a) for batons with $n$ cylindrical bands.
    d) Answer parts (a) and (b) if adjacent cylindrical bands are to have different colors.

9. In how many ways can we 2-color the vertices of the configurations shown in Fig. 16.9 if they are free to move in (a) two dimensions? (b) three dimensions?

Figure 16.9

10. A pyramid has a square base and four faces that are equilateral triangles. If we can move the pyramid about (in three dimensions), how many nonequivalent ways are there to paint its five faces if we have paint of four different colors? How many if the color of the base must be different from the color(s) of the triangular faces?

11. a) In how many ways can we paint the cells of a $3 \times 3$ chessboard using red and blue paint? (The back of the chessboard is black.)
    b) In how many ways can we construct a $3 \times 3$ chessboard by joining (with paste) the edges of nine $1 \times 1$ plastic squares that are transparent and tinted red or blue? (There are nine squares of each color available.)

12. Answer Exercise 11 for a $4 \times 4$ chessboard. [Replace each “nine” in part (b) with “sixteen.”]

13. In how many ways can we paint the seven (identical) horses on a carousel using black, brown, and white paint?

14. a) Let $S$ be a set of configurations and $G$ a group of permutations that acts on $S$. If $x \in S$, prove that $\{\pi \in G \mid \pi^*(x) = x\}$ is a subgroup of $G$ (called the stabilizer of $x$).
    b) Determine the respective stabilizer subgroups in part (a) for each of the configurations $C_7$ and $C_5$ in Fig. 16.5.

16.11
                   The Cycle Index
In applying Burnside’s Theorem we have been faced with computing ψ(π*) for each
π ∈ G, where G is a permutation group acting on a set S of configurations. As the number
of available colors increases and the configurations get more complex, such computations
can get a bit involved. In addition, it seems that if we can determine the number of
2-colorings for a set S of configurations, we should be able to use some of the work in this
case to determine the number of 3-colorings, 4-colorings, and so on. We shall now find
786    Chapter 16 Groups, Coding Theory, and Polya’s Method of Enumeration

some assistance as we return to the solution of Example 16.28. This time more attention
will be paid to the representation of each permutation π ∈ G as a product of disjoint cycles.
Our results are summarized in Table 16.10.

Table 16.10

Rigid Motions π       Configurations in S that Are                     Cycle Structure       Inventory of Configurations that Are
(Elements of G)       Invariant under π*                               Representation of π   Invariant under π*

π_0 = (1)(2)(3)(4)    2^4: All configurations in S                     x_1^4                 (r + w)^4 = r^4 + 4r^3w + 6r^2w^2 + 4rw^3 + w^4
π_1 = (1234)          2: C_1, C_16                                     x_4                   r^4 + w^4 = r^4 + w^4
π_2 = (13)(24)        2^2: C_1, C_10, C_11, C_16                       x_2^2                 (r^2 + w^2)^2 = r^4 + 2r^2w^2 + w^4
π_3 = (1432)          2: C_1, C_16                                     x_4                   r^4 + w^4 = r^4 + w^4
r_1 = (14)(23)        2^2: C_1, C_7, C_9, C_16                         x_2^2                 (r^2 + w^2)^2 = r^4 + 2r^2w^2 + w^4
r_2 = (12)(34)        2^2: C_1, C_6, C_8, C_16                         x_2^2                 (r^2 + w^2)^2 = r^4 + 2r^2w^2 + w^4
r_3 = (13)(2)(4)      2^3: C_1, C_2, C_4, C_10, C_11, C_13, C_15, C_16   x_2 x_1^2             (r^2 + w^2)(r + w)^2 = r^4 + 2r^3w + 2r^2w^2 + 2rw^3 + w^4
r_4 = (1)(24)(3)      2^3: C_1, C_3, C_5, C_10, C_11, C_12, C_14, C_16   x_2 x_1^2             (r^2 + w^2)(r + w)^2 = r^4 + 2r^3w + 2r^2w^2 + 2rw^3 + w^4

P_G(x_1, x_2, x_3, x_4) = (1/8)(x_1^4 + 2x_4 + 3x_2^2 + 2x_2 x_1^2)
Complete Inventory = 8r^4 + 8r^3w + 16r^2w^2 + 8rw^3 + 8w^4

For π_0, the identity of G, we write π_0 = (1)(2)(3)(4), a product of four disjoint cycles.
We shall represent this cycle structure algebraically by x_1^4, where x_1 indicates a cycle of
length 1. The term x_1^4 is called the cycle structure representation of π_0. Here we interpret
“disjoint” as “independent,” in the sense that whatever color is used to paint the vertices in
one cycle has no bearing on the choice of color for the vertices in another cycle. As long
as all the vertices in a given cycle have the same color, we shall find configurations that
are invariant under π_0*. (Admittedly, this seems like mathematical overkill again, inasmuch
as π_0* fixes all 2-colorings of the square.) In addition, since we can paint the vertices in
each cycle either red or white, we have 2^4 configurations, and we find that (r + w)^4 =
r^4 + 4r^3w + 6r^2w^2 + 4rw^3 + w^4 generates these 16 configurations. For example, from
the term 6r^2w^2 we find that there are six configurations with two red and two white vertices,
as found in classes cl(3) and cl(4) of Fig. 16.5.
Turning to π_1, we find π_1 = (1234), a cycle of length 4. This cycle structure is represented
by x_4, and here there are only two invariant configurations. The fact that the cycle structure
for π_1 has only one cycle tells us that for a configuration to be invariant under π_1*, every
vertex in this cycle must be painted the same color. With two colors to choose from, there
are only two possible configurations, C_1 and C_16. In this case the term r^4 + w^4 generates
these configurations.
Continuing with r_1, we have r_1 = (14)(23), a product of two disjoint cycles of length
2; the term x_2^2 represents this cycle structure. For a configuration to be invariant under r_1*,
the vertices at 2 and 3 must be the same color; that is, we have two choices for coloring the
vertices in (23). We also have two choices for coloring the vertices in (14). Consequently, we
get 2^2 invariant configurations: C_1 (r^4), C_7 (r^2w^2), C_9 (r^2w^2), and C_16 (w^4).
[(r^2 + w^2)^2 = r^4 + 2r^2w^2 + w^4.]
Finally, in the case of r_3 = (13)(2)(4), we find that x_2 x_1^2 indicates its decomposition into
one cycle of length 2 and two of length 1. The vertices at 1 and 3 must be painted the same
color if the configuration is to be invariant under r_3*. With three cycles and two choices
of color for each cycle, we find 2^3 invariant configurations. They are C_1 (r^4), C_2 (r^3w),
C_4 (r^3w), C_10 (r^2w^2), C_11 (r^2w^2), C_13 (rw^3), C_15 (rw^3), and C_16 (w^4). These configurations
are generated by (r^2 + w^2)(r + w)^2, for when we consider the cycle (13) we have two
choices: both vertices red (r^2) or both vertices white (w^2). This gives us r^2 + w^2. For
each single vertex in the two cycles of length 1, r + w provides the choices for each cycle,
(r + w)^2 the choices for the two. By the independence of choice of colors as we go from
one cycle to another, (r^2 + w^2)(r + w)^2 generates the 2^3 configurations that are invariant
under r_3*.
Similar arguments provide the information in Table 16.10 for the permutations π_2, π_3,
r_2, and r_4.

At this point we see that the number of configurations that are invariant
under π*, for π ∈ G, depends on the cycle structure of π. Within each cycle the same color
must be used, but that color can be selected from the two or more choices made available.
For r_1, we had two cycles (of length 2) and 2^2 configurations. If three colors had been
available, the number of invariant configurations would have been 3^2. For m colors, the
number is m^2. Adding these terms for all the cycle structures that arise gives Σ_{π ∈ G} ψ(π*).
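The observation that a permutation with k disjoint cycles fixes exactly m^k of the m-colorings can be checked by exhaustion. The Python sketch below is an illustration only and is not part of the text: the function names are made up for this purpose, vertices are numbered 0 to 3 instead of 1 to 4, and a permutation is stored as a tuple whose entry i is the image of vertex i.

```python
from itertools import product

def count_cycles(perm):
    """Number of disjoint cycles of perm, where perm[i] is the image of i (0-indexed)."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def fixed_colorings(perm, m):
    """Brute force: count the m-colorings c with c[perm[i]] == c[i] for every vertex i."""
    n = len(perm)
    return sum(
        all(c[perm[i]] == c[i] for i in range(n))
        for c in product(range(m), repeat=n)
    )

r1 = (3, 2, 1, 0)   # (14)(23), two cycles
r3 = (2, 1, 0, 3)   # (13)(2)(4), three cycles
for m in (2, 3, 4):
    assert fixed_colorings(r1, m) == m ** count_cycles(r1)
    assert fixed_colorings(r3, m) == m ** count_cycles(r3)
print(fixed_colorings(r1, 2), fixed_colorings(r1, 3), fixed_colorings(r3, 2))  # 4 9 8
```

The printed values agree with the counts 2^2, 3^2, and 2^3 discussed above.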
We now wish to place more emphasis on cycle structures, so we define the cycle index,
P_G, for the group G (of permutations) as

P_G(x_1, x_2, x_3, x_4) = (1/|G|) Σ_{π ∈ G} (cycle structure representation of π).

In this example,

P_G(x_1, x_2, x_3, x_4) = (1/8)(x_1^4 + 2x_4 + 3x_2^2 + 2x_2 x_1^2).

When each occurrence of x_1, x_2, x_3, x_4 is replaced by 2, we find that the number of
nonequivalent 2-colorings is equal to

P_G(2, 2, 2, 2) = (1/8)(2^4 + 2(2) + 3(2^2) + 2(2)(2^2)) = 6.
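The tally of cycle structures and the evaluation at 2 can be reproduced mechanically. The following Python sketch is only an illustration under assumed conventions (0-indexed vertices, helper names invented here): it stores each cycle structure as a sorted tuple of cycle lengths, so that (1, 1, 2) stands for x_2 x_1^2.

```python
from collections import Counter
from fractions import Fraction

# The eight rigid motions of the square from Table 16.10, 0-indexed:
# entry i of each tuple is the image of vertex i.
GROUP = {
    "pi0": (0, 1, 2, 3),  # (1)(2)(3)(4)
    "pi1": (1, 2, 3, 0),  # (1234)
    "pi2": (2, 3, 0, 1),  # (13)(24)
    "pi3": (3, 0, 1, 2),  # (1432)
    "r1":  (3, 2, 1, 0),  # (14)(23)
    "r2":  (1, 0, 3, 2),  # (12)(34)
    "r3":  (2, 1, 0, 3),  # (13)(2)(4)
    "r4":  (0, 3, 2, 1),  # (1)(24)(3)
}

def cycle_type(perm):
    """Sorted tuple of cycle lengths, e.g. (1, 1, 2) stands for x_2 x_1^2."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))

def cycle_index(group):
    """Map each cycle type to the number of group elements having it."""
    return Counter(cycle_type(p) for p in group)

def evaluate(index, m):
    """P_G(m, m, ..., m): every cycle contributes one factor m."""
    total = sum(count * m ** len(ctype) for ctype, count in index.items())
    return Fraction(total, sum(index.values()))

index = cycle_index(GROUP.values())
assert index == {(1, 1, 1, 1): 1, (4,): 2, (2, 2): 3, (1, 1, 2): 2}
print(evaluate(index, 2))  # 6
```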
                     We summarize our present findings in the following result.

THEOREM 16.19     Let S be a set of configurations that are acted upon by a permutation group G. [G is a
                  subgroup of S_n, the group of all permutations of {1, 2, 3, ..., n}, and the cycle index
                  P_G(x_1, x_2, x_3, ..., x_n) of G is

                  (1/|G|) Σ_{π ∈ G} (cycle structure representation of π).]

                  The number of nonequivalent m-colorings of S is then P_G(m, m, m, ..., m).
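One way to gain confidence in this result is to compare P_G(m, m, ..., m) with a direct enumeration of equivalence classes. The Python sketch below does this for the square of Example 16.28. It is an illustration only: the names are invented here, vertices are 0-indexed, and each class is identified by the smallest of its images.

```python
from itertools import product

# Rigid motions of the square (0-indexed): pi0, pi1, pi2, pi3, r1, r2, r3, r4.
SQUARE = [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2),
          (3, 2, 1, 0), (1, 0, 3, 2), (2, 1, 0, 3), (0, 3, 2, 1)]

def num_cycles(perm):
    """Number of disjoint cycles of perm."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def polya_count(group, m):
    """P_G(m, ..., m): replace every x_i by m and average over the group."""
    total = sum(m ** num_cycles(p) for p in group)
    return total // len(group)

def orbit_count(group, m):
    """Count equivalence classes directly, one canonical coloring per class."""
    n = len(group[0])
    canonical = set()
    for c in product(range(m), repeat=n):
        images = []
        for p in group:
            image = [None] * n
            for i in range(n):
                image[p[i]] = c[i]   # the color at vertex i moves to vertex p[i]
            images.append(tuple(image))
        canonical.add(min(images))
    return len(canonical)

for m in range(1, 5):
    assert polya_count(SQUARE, m) == orbit_count(SQUARE, m)
print([polya_count(SQUARE, m) for m in (2, 3, 4)])  # [6, 21, 55]
```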

We close this section with an example that uses this theorem.

EXAMPLE 16.32     In how many distinct ways can we 4-color the vertices of a regular hexagon that is free to
                  move in space?

For a regular hexagon there are twelve rigid motions: (a) the six clockwise rotations
                               through 0°, 60°, 120°, 180°, 240°, and 300°; (b) the three reflections in diagonals through
                               opposite vertices; and (c) the three reflections about lines passing through the midpoints of
                               opposite edges.

 (1) (1)(2)(3)(4)(5)(6)   x_1^6                 (7) (1)(26)(35)(4)   x_1^2 x_2^2
 (2) (123456)             x_6                   (8) (13)(46)(2)(5)   x_1^2 x_2^2
 (3) (135)(246)           x_3^2                 (9) (15)(24)(3)(6)   x_1^2 x_2^2
 (4) (14)(25)(36)         x_2^3                (10) (12)(36)(45)     x_2^3
 (5) (153)(264)           x_3^2                (11) (14)(23)(56)     x_2^3
 (6) (165432)             x_6                  (12) (16)(25)(34)     x_2^3

Figure 16.10

In Fig. 16.10 we have listed each group element as a product of disjoint cycles, together
with its cycle structure representation. Here

P_G(x_1, x_2, x_3, x_4, x_5, x_6) = (1/12)(x_1^6 + 2x_6 + 2x_3^2 + 4x_2^3 + 3x_1^2 x_2^2),

and there are

P_G(4, 4, 4, 4, 4, 4) = (1/12)(4^6 + 2(4) + 2(4^2) + 4(4^3) + 3(4^2)(4^2)) = 430

nonequivalent 4-colorings of a regular hexagon. (Note: Even though neither x_4 nor x_5 occurs
in a cycle structure representation, we may list these variables among the arguments of P_G.)
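The twelve rigid motions need not be listed by hand. As a hedged illustration (not part of the text), the Python sketch below generates them from one rotation and one reflection taken from Fig. 16.10, tallies the cycle structures, and evaluates the cycle index at 4. The helper names and the 0-indexed vertex numbering are assumptions made for this sketch.

```python
from collections import Counter

def compose(p, q):
    """Return the permutation 'apply q first, then p' (0-indexed tuples)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(generators):
    """Close a list of permutations under composition to get the group they generate."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def cycle_type(perm):
    """Sorted tuple of cycle lengths, e.g. (1, 1, 2, 2) stands for x_1^2 x_2^2."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))

ROTATION = (1, 2, 3, 4, 5, 0)     # (123456)
REFLECTION = (0, 5, 4, 3, 2, 1)   # (1)(26)(35)(4)
hexagon = generate([ROTATION, REFLECTION])
tally = Counter(cycle_type(p) for p in hexagon)

def colorings(m):
    """P_G(m, ..., m) for the hexagon group."""
    total = sum(count * m ** len(ctype) for ctype, count in tally.items())
    return total // len(hexagon)

print(len(hexagon), colorings(4))  # 12 430
```

The tally produced this way matches the coefficients 1, 2, 2, 4, 3 in the cycle index above.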

EXERCISES 16.11

1. In how many ways can we 5-color the vertices of a square that is free to move in (a) two dimensions? (b) three dimensions?
2. Answer Exercise 1 for a regular pentagon.
3. Find the number of nonequivalent 4-colorings of the vertices in the configurations shown in Fig. 16.11 when they are free to move in (a) two dimensions; (b) three dimensions.

Figure 16.11

4. a) In how many ways can we 3-color the vertices of a regular hexagon that is free to move in space?
    b) Give a combinatorial argument to show that for all m ∈ Z^+, (m^6 + 2m + 2m^2 + 4m^3 + 3m^4) is divisible by 12.
5. a) In how many ways can we 5-color the vertices of a regular hexagon that is free to move in two dimensions?
    b) Answer part (a) if the hexagon is free to move in three dimensions.
    c) Find two 5-colorings that are equivalent for case (b) but distinct for case (a).
6. In how many distinct ways can we 3-color the edges in the configurations shown in Fig. 16.11 if they are free to move in (a) two dimensions; (b) three dimensions?
7. a) In how many distinct ways can we 3-color the edges of a square that is free to move in three dimensions?
    b) In how many distinct ways can we 3-color both the vertices and the edges of such a square?
    c) For a square that can move in three dimensions, let k, m, and n denote the number of distinct ways in which we can 3-color its vertices (alone), its edges (alone), and both its vertices and edges, respectively. Does n = km? (Give a geometric explanation.)

16.12
          The Pattern Inventory:
     Polya’s Method of Enumeration
                    In this final section we return to Example 16.28 and its continued analysis in Section 16.11.
                    At this time we introduce the pattern inventory and how it is derived from the cycle index.
For π_0 ∈ G, every configuration in S is invariant. The cycle structure (representation)
for π_0 is given by x_1^4, where for each cycle of length 1 we have a choice of coloring the
vertex in that cycle red (r) or white (w). Using + to represent exclusive or, we write r + w
to denote the two choices for that vertex (cycle of length 1). With four such cycles, (r + w)^4
generates the patterns of the 16 configurations.
In the case of π_1 = (1234), x_4 denotes the cycle structure, and here all four vertices must
be the same color for the configuration to remain fixed under π_1*. Consequently, we have
all four vertices red or all four vertices white, and we express this algebraically by r^4 + w^4.
                        At this point we notice that for each of the permutations we have considered, the number
                    of factors in the expression used to generate the patterns fixed under a certain permutation
                    equals the number of factors in the cycle structure (representation) of that permutation. Is
                    this just a coincidence?
Continue now with r_1 = (14)(23), whose cycle structure is x_2^2. For the cycle (14) we
must color both of the vertices 1 and 4 either red or white. These choices are represented by
r^2 + w^2. Since there are two such cycles of length 2, we find that (r^2 + w^2)^2 will generate
the patterns of the configurations in S fixed under r_1*. Once again the number of factors in
the cycle structure equals the number of factors in the corresponding term used to generate
the patterns.
Last, for r_3 = (13)(2)(4), the cycle structure is x_2 x_1^2 = x_1^2 x_2. For each of the cycles (2)
and (4), r + w represents the choices for each of these vertices, so that (r + w)^2 accounts
for all four colorings of the pair. The cycle (13) indicates that vertices 1 and 3 must have
the same color; r^2 + w^2 accounts for the two possibilities. Therefore, (r + w)^2(r^2 + w^2)
generates the patterns of the configurations in S fixed under r_3*, and we find three factors in
both the cycle structure and the product (r + w)^2(r^2 + w^2). But even more comes to light
here.
Looking at the terms in the cycle structures, we see that, for 1 ≤ i ≤ n, the factor x_i in
the cycle structure corresponds with the term r^i + w^i in the expression used to generate the
patterns.
Continuing with the cycle structures for π_2, π_3, r_2, and r_4, we find that the pattern
inventory can be obtained by replacing each x_i in P_G(x_1, x_2, x_3, x_4) with r^i + w^i, for
1 ≤ i ≤ 4. Consequently,

P_G(r + w, r^2 + w^2, r^3 + w^3, r^4 + w^4) = r^4 + r^3w + 2r^2w^2 + rw^3 + w^4.

(This result is (1/8)-th of the complete inventory listed in Table 16.10.)
If we had three colors (red, white, and blue), the replacement for x_i would be r^i + w^i +
b^i, where 1 ≤ i ≤ 4.
                        We generalize these observations in the following theorem.

THEOREM 16.20       Polya’s Method of Enumeration. Let S be a set of configurations that are acted upon by a
                    permutation group G, where G is a subgroup of S_n and G has cycle index P_G(x_1, x_2, ..., x_n).
                    Then the pattern inventory of nonequivalent m-colorings of S is given by

                    P_G(Σ_{i=1}^{m} c_i, Σ_{i=1}^{m} c_i^2, ..., Σ_{i=1}^{m} c_i^n),

                    where c_1, c_2, ..., c_m denote the m colors that are available.
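The substitution described in the theorem is easy to mechanize. The Python sketch below is an illustration, not the book's procedure: it stores a polynomial in the colors as a mapping from exponent tuples to coefficients, and the function names are assumptions introduced here. It is applied to the eight cycle structures of Table 16.10.

```python
from collections import Counter
from fractions import Fraction

def poly_mul(f, g):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = Counter()
    for ea, ca in f.items():
        for eb, cb in g.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ca * cb
    return out

def power_sum(k, num_colors):
    """c_1^k + c_2^k + ... + c_m^k, the replacement for x_k."""
    return Counter({tuple(k if j == i else 0 for j in range(num_colors)): 1
                    for i in range(num_colors)})

def pattern_inventory(cycle_types, num_colors):
    """cycle_types holds one tuple of cycle lengths per group element."""
    total = Counter()
    for ctype in cycle_types:
        term = Counter({(0,) * num_colors: 1})
        for length in ctype:
            term = poly_mul(term, power_sum(length, num_colors))
        total.update(term)          # Counter.update adds coefficients
    return {e: Fraction(c, len(cycle_types)) for e, c in total.items()}

# Cycle structures of pi0, pi1, pi2, pi3, r1, r2, r3, r4 from Table 16.10.
SQUARE_TYPES = [(1, 1, 1, 1), (4,), (2, 2), (4,), (2, 2), (2, 2), (1, 1, 2), (1, 1, 2)]

two_colors = pattern_inventory(SQUARE_TYPES, 2)
# Keys are (exponent of r, exponent of w).
assert two_colors == {(4, 0): 1, (3, 1): 1, (2, 2): 2, (1, 3): 1, (0, 4): 1}
print(sum(pattern_inventory(SQUARE_TYPES, 3).values()))  # 21
```

Summing all coefficients sets every color to 1, which recovers the number of nonequivalent colorings (here 21 for three colors).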

One important point should be reiterated here before applying Theorem 16.20 — namely,
                               the pattern inventory is another example of a generating function. Having made that point,
                               we now apply this theorem in the following examples.

EXAMPLE 16.33     A child’s bracelet is formed by placing three beads (red, white, and blue) on a circular
piece of wire. Bracelets are considered equivalent if one can be obtained from the other by
a (planar) rotation. Find the pattern inventory for these bracelets.
Here G is the group of rotations of an equilateral triangle, so G = {(1)(2)(3), (123),
(132)}, where 1, 2, 3 denote the vertices of the triangle. Then P_G(x_1, x_2, x_3) =
(1/3)(x_1^3 + 2x_3), and the pattern inventory is given by

(1/3)[(r + w + b)^3 + 2(r^3 + w^3 + b^3)]
   = (1/3)[3r^3 + 3r^2w + 3r^2b + 3rw^2 + 6rwb + 3rb^2 + 3w^3 + 3w^2b + 3wb^2 + 3b^3]
   = r^3 + r^2w + r^2b + rw^2 + 2rwb + rb^2 + w^3 + w^2b + wb^2 + b^3.

We interpret this result as follows:

1) For each summand, other than 2rwb, the coefficient is 1 because there is only one
   (distinct) bracelet of that type. That is, there is one bracelet with three red beads (for
   r^3), one with two red beads and one white bead (for r^2w), and so on for the other
   seven summands with coefficient 1.
2) The summand 2rwb has coefficient 2 because there are two nonequivalent bracelets
   with one red, one white, and one blue bead, as shown in Fig. 16.12.

If the bracelets can also be reflected, then G becomes {(1)(2)(3), (123), (132), (1)(23),
(2)(13), (3)(12)}, and the pattern inventory here is the same as the one above, with one
exception. Here we have rwb, instead of 2rwb, because the nonequivalent (for rotations)
patterns in Fig. 16.12 become equivalent when reflections are allowed.

Figure 16.12
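The two inventories in this example can be confirmed by listing the bracelets outright. The following Python sketch is an illustration only, with invented names and positions numbered 0 to 2: it picks one representative per equivalence class and groups the representatives by the colors they use.

```python
from itertools import product
from collections import Counter

ROTATIONS = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]      # (1)(2)(3), (123), (132)
REFLECTIONS = [(0, 2, 1), (2, 1, 0), (1, 0, 2)]    # (1)(23), (2)(13), (3)(12)

def act(perm, coloring):
    """Move the bead at position i to position perm[i]."""
    image = [None] * len(perm)
    for i, color in enumerate(coloring):
        image[perm[i]] = color
    return tuple(image)

def inventory(group, colors="rwb"):
    """Count equivalence classes, keyed by the sorted string of bead colors."""
    representatives = {min(act(p, c) for p in group)
                       for c in product(colors, repeat=len(group[0]))}
    return Counter("".join(sorted(rep)) for rep in representatives)

rot = inventory(ROTATIONS)
dih = inventory(ROTATIONS + REFLECTIONS)
print(rot["brw"], dih["brw"])                 # 2 1
print(sum(rot.values()), sum(dih.values()))   # 11 10
```

Only the class with one bead of each color (key "brw") changes when reflections are added, in agreement with the discussion of Fig. 16.12.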

EXAMPLE 16.34     Consider the 3-colorings of the configurations in Example 16.28. If the three colors are red,
white, and blue, how many nonequivalent configurations have exactly two red vertices?
Given that P_G(x_1, x_2, x_3, x_4) = (1/8)(x_1^4 + 2x_4 + 3x_2^2 + 2x_2 x_1^2), the answer is the sum
of the coefficients of r^2w^2, r^2b^2, and r^2wb in

(1/8)[(r + w + b)^4 + 2(r^4 + w^4 + b^4) + 3(r^2 + w^2 + b^2)^2 + 2(r^2 + w^2 + b^2)(r + w + b)^2].

In (r + w + b)^4, we find the term 6r^2w^2 + 6r^2b^2 + 12r^2wb. For 3(r^2 + w^2 + b^2)^2,
we are interested in the term 6r^2w^2 + 6r^2b^2, whereas 4r^2w^2 + 4r^2b^2 + 4r^2bw arises in
2(r^2 + w^2 + b^2)(r + w + b)^2.
Then (1/8)[6r^2w^2 + 6r^2b^2 + 12r^2wb + 6r^2w^2 + 6r^2b^2 + 4r^2w^2 + 4r^2b^2 + 4r^2bw] =
2r^2w^2 + 2r^2b^2 + 2r^2bw, the inventory of the six nonequivalent configurations that contain
exactly two red vertices.
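The same total can be reached without expanding any polynomial, by applying Burnside's Theorem to the set of 3-colorings that have exactly two red vertices (a set that every rigid motion maps to itself). The Python sketch below is an alternative check rather than the method of the example. Its names and the 0-indexed vertices are assumptions made here.

```python
from itertools import product
from fractions import Fraction

# Rigid motions of the square (0-indexed): pi0, pi1, pi2, pi3, r1, r2, r3, r4.
SQUARE = [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2),
          (3, 2, 1, 0), (1, 0, 3, 2), (2, 1, 0, 3), (0, 3, 2, 1)]

def fixed_with_two_red(perm, colors="rwb"):
    """Count colorings with exactly two red vertices that perm leaves unchanged."""
    n = len(perm)
    return sum(
        1
        for c in product(colors, repeat=n)
        if c.count("r") == 2 and all(c[perm[i]] == c[i] for i in range(n))
    )

per_element = [fixed_with_two_red(p) for p in SQUARE]
print(per_element)                               # [24, 0, 4, 0, 4, 4, 6, 6]
print(Fraction(sum(per_element), len(SQUARE)))   # 6
```

The contributions 24, 12 (from the three motions of type x_2^2), and 12 (from the two of type x_2 x_1^2) are the sums of the coefficients picked out of the three relevant summands above.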

Our next example deals with the pattern inventory for the 2-colorings of the vertices of
                               a cube. (The colors are red and white.)

EXAMPLE 16.35     For the cube in Fig. 16.13, we find that its group G of rigid motions consists of the following:

1) The identity transformation with cycle structure x_1^8.
2) Rotations through 90°, 180°, and 270° about an axis through the centers of two
   opposite faces: From Fig. 16.13(a) we have

      90° rotation:    (1234)(5678)        Cycle structure: x_4^2
      180° rotation:   (13)(24)(57)(68)    Cycle structure: x_2^4
      270° rotation:   (1432)(5876)        Cycle structure: x_4^2

   Since there are two other pairs of opposite faces, these nine rotations account for
   the term 3x_2^4 + 6x_4^2 in the cycle index.
3) Rotations through 180° about an axis through the midpoints of two opposite edges:
   As in Fig. 16.13(b), we have the permutation (17)(28)(34)(56), whose cycle structure
   is given by x_2^4. With six pairs of opposite edges, these rotations contribute the term
   6x_2^4 to the cycle index.
4) Rotations through 120° and 240° about an axis through two diagonally opposite
   vertices: From part (c) of the figure we have

      120° rotation:   (168)(274)(3)(5)    Cycle structure: x_1^2 x_3^2
      240° rotation:   (186)(247)(3)(5)    Cycle structure: x_1^2 x_3^2

   Here there are four such pairs of vertices, and these give rise to the term 8x_1^2 x_3^2 in the
   cycle index.

Figure 16.13   [Cube with vertices labeled 1 through 8, showing the rotation axes: (a) 90°, 180°, 270°; (b) 180°; (c) 120°, 240°.]

Therefore, P_G(x_1, x_2, ..., x_8) = (1/24)(x_1^8 + 9x_2^4 + 6x_4^2 + 8x_1^2 x_3^2), and the pattern
inventory for these configurations is given by the generating function

f(r, w) = (1/24)[(r + w)^8 + 9(r^2 + w^2)^4 + 6(r^4 + w^4)^2 + 8(r + w)^2(r^3 + w^3)^2]
        = r^8 + r^7w + 3r^6w^2 + 3r^5w^3 + 7r^4w^4 + 3r^3w^5 + 3r^2w^6 + rw^7 + w^8.

Replacing r and w by 1, we find 23 nonequivalent configurations here.
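As a cross-check on both the cycle index and the expansion, the 24 rotations can be generated from two of the permutations given above, and the inventory computed as a polynomial in one variable that counts red vertices. The Python sketch below is an illustration under assumed conventions (0-indexed vertices, invented helper names), not part of the example.

```python
from collections import Counter

def compose(p, q):
    """Return the permutation 'apply q first, then p' (0-indexed tuples)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(generators):
    """Close a list of permutations under composition to get the group they generate."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def cycle_type(perm):
    """Sorted tuple of cycle lengths of perm."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))

def poly_mul(f, g):
    """Multiply coefficient lists; index k holds the coefficient of r^k."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

FACE_TURN = (1, 2, 3, 0, 5, 6, 7, 4)     # (1234)(5678)
VERTEX_TURN = (5, 6, 2, 1, 4, 7, 3, 0)   # (168)(274)(3)(5)
group = generate([FACE_TURN, VERTEX_TURN])
types = Counter(cycle_type(p) for p in group)

total = [0] * 9
for ctype, count in types.items():
    term = [1]
    for length in ctype:
        factor = [0] * (length + 1)
        factor[0] = factor[length] = 1    # w^length + r^length, with w set to 1
        term = poly_mul(term, factor)
    total = [t + count * c for t, c in zip(total, term)]
inventory = [t // len(group) for t in total]

print(len(group), inventory, sum(inventory))
# 24 [1, 1, 3, 3, 7, 3, 3, 1, 1] 23
```

Entry k of `inventory` is the number of nonequivalent colorings with k red vertices, that is, the coefficient of r^k w^(8-k) in f(r, w).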

Since Polya’s Method of Enumeration was first developed in order to count isomers of
                  organic compounds, we close this section with an application that deals with a certain class

of organic compounds. This is based on an example by C. L. Liu. (See pp. 152-154 of
                            reference [17].)

EXAMPLE 16.36     Here we are concerned with organic molecules of the form shown in Fig. 16.14, where
C is a carbon atom and X denotes any of the following components: Br (bromine), H
(hydrogen), CH3 (methyl), or C2H5 (ethyl). For example, if each X is replaced by H, the
compound CH4 (methane) results. Figure 16.14 should not be allowed to mislead us. The
structure of these organic compounds is three-dimensional. Consequently, we turn to the
regular tetrahedron in order to model this structure. We would place the carbon atom at the
center of the tetrahedron and then place our selections for X at vertices 1, 2, 3, and 4 as
shown in Fig. 16.15.

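Figure 16.14   [Planar diagram of a carbon atom C bonded to four components X.]
Figure 16.15   [Regular tetrahedron with vertices labeled 1 through 4: (a) rotations through 120°, 240°; (b) rotation through 180°.]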

The group G acting on these configurations is given as follows:

1) The identity transformation (1)(2)(3)(4) with cycle structure x_1^4.
2) Rotations through 120° or 240° about an axis through a vertex and the center of the
   opposite face: As Fig. 16.15(a) shows, we have

      120° rotation:   (1)(243) with cycle structure x_1 x_3
      240° rotation:   (1)(234) with cycle structure x_1 x_3

   By symmetry there are three other pairs of vertices and opposite faces, so these rigid
   motions account for the term 8x_1 x_3 in P_G(x_1, x_2, x_3, x_4).
3) Rotations of 180° about an axis through the midpoints of two opposite edges: The
   case shown in part (b) of the figure is given by the permutation (14)(23) whose
   cycle structure is x_2^2. With three pairs of opposite edges, we get the term 3x_2^2 in
   P_G(x_1, x_2, x_3, x_4).

Hence P_G(x_1, x_2, x_3, x_4) = (1/12)[x_1^4 + 8x_1 x_3 + 3x_2^2] and P_G(4, 4, 4, 4) =
(1/12)[4^4 + 8(4^2) + 3(4^2)] = 36, so there are 36 distinct organic compounds that can be formed
in this way.
Last, if we wish to know how many of these compounds have exactly two bromine atoms, we let w, x, y, and z represent the “colors” Br, H, CH$_3$, and C$_2$H$_5$, respectively, and find the sum of the coefficients of $w^2x^2$, $w^2y^2$, $w^2z^2$, $w^2xy$, $w^2xz$, and $w^2yz$ in the pattern inventory

$$(1/12)[(w + x + y + z)^4 + 8(w + x + y + z)(w^3 + x^3 + y^3 + z^3) + 3(w^2 + x^2 + y^2 + z^2)^2].$$

For $(w + x + y + z)^4$ the relevant terms are $6w^2x^2 + 6w^2y^2 + 6w^2z^2 + 12w^2xy + 12w^2xz + 12w^2yz$. The middle summand of the pattern inventory does not give rise to any of the desired configurations, whereas in $3(w^2 + x^2 + y^2 + z^2)^2$ we find $6w^2x^2 + 6w^2y^2 + 6w^2z^2$.

Consequently that part of the pattern inventory for the compounds containing exactly two bromine atoms is

$$(1/12)[12w^2x^2 + 12w^2y^2 + 12w^2z^2 + 12w^2xy + 12w^2xz + 12w^2yz]$$

and there are six such organic compounds.
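Both counts in this example can be confirmed by brute force. The following sketch assumes that the 12 rotations of the tetrahedron act on the four vertices as the even permutations (this matches the cycle structures listed above: the identity, eight 3-cycles, and three products of two 2-cycles). The names `ROTATIONS` and `orbit_representative` are introduced here only for illustration.

```python
from itertools import permutations, product

def is_even(perm):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0

# Assumption: the rotation group acts on vertices 0..3 as the even permutations.
ROTATIONS = [p for p in permutations(range(4)) if is_even(p)]

COLORS = ("Br", "H", "CH3", "C2H5")

def orbit_representative(coloring):
    """Smallest rearrangement of the coloring under all rotations."""
    return min(tuple(coloring[g[i]] for i in range(4)) for g in ROTATIONS)

# One representative per equivalence class of the 4^4 assignments.
compounds = {orbit_representative(c) for c in product(COLORS, repeat=4)}
two_bromine = [c for c in compounds if c.count("Br") == 2]

print(len(compounds), len(two_bromine))
```

The enumeration visits all 256 assignments, so it is practical only for small cases; the pattern inventory gives the same information without listing configurations.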

1. a) Find the pattern inventory for the 2-colorings of the edges of a square that is free to move in (i) two dimensions; (ii) three dimensions. (Let the colors be red and white.)
   b) Answer part (a) for 3-colorings, where the colors are red, white, and blue.
2. If a regular pentagon is free to move in space and we can color its vertices with red, white, and blue paint, how many nonequivalent configurations have exactly three red vertices? How many have two red, one white, and two blue vertices?
3. Suppose that in Example 16.35 we 2-color the faces of the cube, which is free to move in space.
   a) How many distinct 2-colorings are there for this situation?
   b) If the available colors are red and white, determine the pattern inventory.
   c) How many nonequivalent colorings have three red and three white faces?
4. For the organic compounds in Example 16.36, how many have at least one bromine atom? How many have exactly three hydrogen atoms?
5. Find the pattern inventories for the 2-colorings of the vertices in the configurations in Fig. 16.11, when they are free to move in space. (Let the colors be green and gold.)
6. a) In how many ways can the seven (identical) horses on a carousel be painted with black, brown, and white paint in such a way that there are three black, two brown, and two white horses?
   b) In how many ways would there be equal numbers of black and brown horses?
   c) Give a combinatorial argument to verify that for all $n \in \mathbf{Z}^+$, $n^7 + 6n$ is divisible by 7.
7. a) In how many ways can we paint the eight squares of a 2 × 4 chessboard, using the colors red and white? (The back of the chessboard is black cardboard.)
   b) Find the pattern inventory for the colorings in part (a).
   c) How many of the colorings in part (a) have four red and four white squares? How many have six red and two white squares?
8. a) In how many ways can we 2-color the eight regions of the pinwheel shown in Fig. 16.16, using the colors black and gold, if the back of each region remains grey?
   b) Answer part (a) for the possible 3-colorings, using black, gold, and blue paints to color the regions.
   c) For the colorings in part (b), how many have four black, two gold, and two blue regions?

[Figure 16.16]

9. Let $m, n \in \mathbf{Z}^+$ with $n \ge 3$. How many distinct summands appear in the pattern inventory for the m-colorings of the vertices of a regular polygon of n sides?

16.13
      Summary and Historical Review
Although the notion of a group of transformations evolved gradually in the study of geometry, the major thrust in the development of the group concept came from the study of polynomial equations.

Methods for solving quadratic equations were known to the ancient Greeks. Then in the sixteenth century, advances were made toward solving cubic and quartic polynomial equations where the coefficients were rational numbers. Continuing with polynomials of fifth and higher degree, both Leonhard Euler (1707-1783) and Joseph-Louis Lagrange (1736-1813) attempted to solve the general quintic. Lagrange realized there had to be a connection between the degree n of a polynomial equation and the permutation group $S_n$. However, it was Niels Henrik Abel (1802-1829) who finally proved that it was not possible to find a formula for solving the general quintic using only addition, subtraction, multiplication, division, and root extraction. During this same period, the existence of a necessary and sufficient condition for when a polynomial of degree $n \ge 5$ with rational coefficients can be solved by radicals was investigated and solved by the illustrious French mathematician Evariste Galois (1811-1832). Since the work of Galois utilizes the structures of both groups and fields, we shall say more about him in the summary of Chapter 17.

Niels Henrik Abel (1802-1829)

Examining pages 278-280 of J. Stillwell [28], one finds that the group concept, and in fact the actual word “group,” first appears in Galois’ work Mémoire sur les conditions de résolubilité des équations par radicaux, published in 1831. Associativity, the group identity, and inverses were consequences of Galois’ assumptions, for he only dealt with a group of permutations of a finite set and his definition of a group required only the closure property. It was Arthur Cayley (1821-1895) (in 1854, in his paper On the Theory of Groups, as Depending on the Symbolic Equation $\theta^n = 1$) who first found the need to state the associative property for group elements. The first actual mention of inverses in the definition of a group occurs in the 1883 article Gruppentheoretischen Studien II by Walther Franz Anton von Dyck (1856-1934).

The concept of the coset, which we introduced in Section 16.3, was also developed by
Evariste Galois (in 1832). The actual term was coined (in 1910) by George Abram Miller
(1863-1951).
    Following the accomplishments of Galois, group theory affected many areas of mathe-
matics. During the late nineteenth century, for example, the German mathematician Felix
Klein (1849-1929), in what has come to be known as the Erlanger Programm, attempted
to codify all existing geometries according to the group of transformations under which the
properties of the geometry were invariant.
    Many other mathematicians, such as Augustin-Louis Cauchy (1789-1857), Arthur Cayley (1821-1895), Ludwig Sylow (1832-1918), Richard Dedekind (1831-1916), and Leopold Kronecker (1823-1891), contributed to the further development of certain types
of groups. However, it was not until 1900 that lists of defining conditions were given for
the general abstract group.
    During the twentieth century a great deal of research took place in the attempt to analyze
the structure of finite groups. For finite abelian groups, it is known that any such group is
isomorphic to a direct product of cyclic groups of prime power order. However, the case
of the finite nonabelian groups has turned out to be considerably more complex. Starting
with the work of Galois, one finds particular attention paid to a special type of subgroup
called a normal subgroup. For any group G, a subgroup H (of G) is called normal if,
for all $g \in G$ and all $h \in H$, we have $ghg^{-1} \in H$. In an abelian group every subgroup is
normal, but this is not the case for nonabelian groups. In every group G, both {e} and G are
normal subgroups, but if G has no other normal subgroups it is called simple. During the
past six decades mathematicians have sought and determined all the finite simple groups
and examined their role in the structure of all finite groups. Among the prime movers in the
classification of the finite simple groups are Professors Walter Feit, John Thompson, Daniel
Gorenstein, Michael Aschbacher, and Robert Griess, Jr. For more on the history and impact
of this monumental work we refer the reader to the articles by J. A. Gallian [5], A. Gardiner
[7], M. Gardner [9], R. Silvestri [27], and, especially, the one by D. Gorenstein [13].
    There are many texts one can turn to for further study in the theory of groups. At the
introductory level, the texts by J. A. Gallian [6] and V. H. Larney [16] provide further
coverage beyond the introduction given in this chapter. The text by I. N. Herstein [15] is an
excellent source and includes material on Galois theory.
    More on the RSA public-key cryptosystem of Section 16.4 can be found in the references
by T. H. Barr [2], P. Garrett [10], and W. Trappe and L. C. Washington [31]. An early
description of the system is given in the article by M. Gardner [8], where a message is
encrypted using, as the modulus n, the product of a 64-digit prime and a 65-digit prime.
The article by G. Taubes [30] relates the effort set forth by Arjen Lenstra, Paul Leyland,
Michael Graff, and Derek Atkins, along with 600 volunteers, in factoring n.
    The beginnings of algebraic coding theory can be traced to 1941, when Claude Elwood
Shannon began his investigations of problems in communications. These problems were
prompted by the needs of the war effort. His research resulted in many new ideas and
principles that were later published in 1948 in the journal article [26]. As a result of this
work, Shannon is acknowledged as the founder of information theory. After this publication,
results by M. J. E. Golay [11] and R. W. Hamming [14] soon followed, giving further impetus
to research in this area. The 1478 references listed in the bibliography at the end of Volume
II of the texts by F. J. MacWilliams and N. J. A. Sloane [18] should convey some idea of
the activity in this area between 1950 and 1975.

Our coverage of coding theory followed the development in Chapter 5 of the text
                      by L. L. Dornhoff and F. E. Hohn [4]. The texts by E. F. Assmus, Jr., and J. D. Key
                      [1], S. W. Golomb, R. A. Scholtz, and R. E. Peile [12], V. Pless [20], and S. Roman [24]
                      provide a nice coverage of topics at a fairly intermediate level. More advanced work in
                      coding can be found in the books by F. J. MacWilliams and N. J. A. Sloane [18], S. Roman
                       [25], and A. P. Street and W. D. Wallis [29]. An interesting application on the use of the
                      pigeonhole principle in coding theory is given in Chapter XI of [29].
                          In Sections 10, 11, and 12 of the chapter, we came upon an enumeration technique whose
                      development is attributed to the Hungarian mathematician George Polya (1887-1985). His
                      article [21] provided the fundamental techniques for counting equivalence classes of chem-
                      ical isomers, graphs, and trees. (To some extent, the ideas in this work were anticipated by
                      J. H. Redfield [23].) Since then these techniques have been found invaluable for counting
                      problems in such areas as the electronic realizations of Boolean functions. Polya’s fun-
                      damental theorem was first generalized in the article by N. G. DeBruijn [3], and other
                      extensions of these ideas can be found in the literature. The article by R. C. Read [22]
                      relates the profound influence that Polya’s Theorem has had on developments in combina-
                      torial analysis. (The issue of the journal that contains this article also includes several other
                       articles dealing with the life and work of George Polya.)
                          Our coverage of this topic follows the presentation given in the article by A. Tucker
                      [32]. A more rigorous presentation of this method can be found in Chapter 5 of the text by
                      C. L. Liu [17].
                          In dealing with Burnside’s Theorem we have another instance of an inaccurate attribution.
                      As we learn in the article by P. M. Neumann [19], the result appears in a paper by Georg
                       Frobenius (1848-1917) that was published in 1887, as well as in some of Cauchy’s work
                       from 1845.

REFERENCES
                            1. Assmus, E. F., Jr., and Key, J. D. Designs and Their Codes. New York: Cambridge University
                              Press, 1992.
                           2. Barr, Thomas H. Invitation to Cryptology. Upper Saddle River, N. J.: Prentice-Hall, 2002.
                           3. DeBruijn, Nicolaas Govert. “Polya’s Theory of Counting.” Chapter 5 in Applied Combinatorial
                              Mathematics, ed. by Edwin F. Beckenbach. New York: Wiley, 1964.
                           4. Dornhoff, Larry L., and Hohn, Franz E. Applied Modern Algebra. New York: Macmillan, 1978.
                           5. Gallian, Joseph A. “The Search for Finite Simple Groups.” Mathematics Magazine 49, 1976,
                              pp. 163-179.
                           6. Gallian, Joseph A. Contemporary Abstract Algebra, 5th ed. Boston, Mass.: Houghton Mifflin,
                              2002.
                           7. Gardiner, Anthony. “Groups of Monsters.” New Scientist, April 5, 1979, p. 34.
                            8. Gardner, Martin. “A New Kind of Cipher That Would Take Millions of Years to Break.”
                               Scientific American (August 1977): pp. 120-124.
                           9. Gardner, Martin. “The Capture of the Monster: A Mathematical Group with a Ridiculous
                               Number of Elements.” Scientific American 242 (6), 1980, pp. 20-32.
                          10. Garrett, Paul. Making, Breaking Codes: An Introduction to Cryptology. Upper Saddle River,
                              N. J.: Prentice-Hall, 2001.
                          11. Golay, Marcel J. E. “Notes on Digital Coding.” Proceedings of the IRE 37, 1949, p. 657.
                          12. Golomb, Solomon W., Scholtz, Robert A., and Peile, Robert E. Basic Concepts in Information
                              Theory and Coding. New York: Plenum, 1994.
                          13. Gorenstein, Daniel. “The Enormous Theorem.” Scientific American 253 (6), 1985, pp. 104-115.
                          14. Hamming, Richard Wesley. “Error Detecting and Error Correcting Codes.” Bell System
                              Technical Journal 29, 1950, pp. 147-160.

                                  15. Herstein, Israel Nathan. Topics in Algebra, 2nd ed. Lexington, Mass.: Xerox College Publishing, 1975.
                                  16. Larney, Violet H. Abstract Algebra: A First Course. Boston: Prindle, Weber & Schmidt, 1975.
                                  17. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
                                  18. MacWilliams, F. Jessie, and Sloane, Neil J. A. The Theory of Error-Correcting Codes, Volumes
                                        I and II. Amsterdam: North-Holland, 1977.
                                  19. Neumann, Peter M. “A Lemma That Is Not Burnside’s.” The Mathematical Scientist, Vol. 4,
                                        1979, pp. 133-141.
                                  20. Pless, Vera. Introduction to the Theory of Error-Correcting Codes, 2nd ed. New York: Wiley,
                                         1989.
                                  21. Polya, George. “Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und Chemische
                                        Verbindungen.” Acta Mathematica 68, 1937, pp. 145-254.
                                  22. Read, R. C. “Polya’s Theorem and Its Progeny.” Mathematics Magazine 60, 1987, pp. 275-282.
                                  23. Redfield, J. Howard. “The Theory of Group Reduced Distributions.” American Journal of
                                        Mathematics 49, 1927, pp. 433-455.
                                  24. Roman, Steven. Introduction to Coding and Information Theory. New York: Springer-Verlag,
                                        1997.
                                  25. Roman, Steven. Coding and Information Theory. New York: Springer-Verlag, 1992.
                                  26. Shannon, Claude E. “The Mathematical Theory of Communication.” Bell System Technical
                                        Journal 27, 1948, pp. 379-423, 623-656. Reprinted in C. E. Shannon and W. Weaver, The
                                        Mathematical Theory of Communication (Urbana: University of Illinois Press, 1949).
                                  27. Silvestri, Richard. “Simple Groups of Finite Order.” Archive for the History of Exact Sciences
                                        20, 1979, pp. 313-356.
                                  28. Stillwell, John. Mathematics and Its History. New York: Springer-Verlag, 1989.
                                  29. Street, Anne Penfold, and Wallis, W. D. Combinatorial Theory: An Introduction. Winnipeg,
                                        Canada: The Charles Babbage Research Center, 1977.
                                  30. Taubes, G. “Small Army of Code-breakers Conquers a 129-digit Giant.” Science 264, 1994,
                                        pp. 776-777.
                                  31. Trappe, Wade, and Washington, Lawrence C. Introduction to Cryptography with Coding
                                        Theory. Upper Saddle River, N. J.: Prentice-Hall, 2002.
                                  32. Tucker, Alan. “Polya’s Enumeration Formula by Example.” Mathematics Magazine 47, 1974,
                                        pp. 248-256.

SUPPLEMENTARY EXERCISES

1. Let $f: G \to H$ be a group homomorphism with $e_H$ the identity in H. Prove that
   a) $K = \{x \in G \mid f(x) = e_H\}$ is a subgroup of G. (K is called the kernel of the homomorphism.)
   b) if $g \in G$ and $x \in K$, then $gxg^{-1} \in K$.
2. If G, H, and K are groups and $G = H \times K$, prove that G contains subgroups that are isomorphic to H and K.
3. Let G be a group where $a^2 = e$ for all $a \in G$. Prove that G is abelian.
4. If G is a group of even order, prove that there is an element $a \in G$ with $a \ne e$ and $a = a^{-1}$.
5. Let $f: G \to H$ be a group homomorphism onto H. If G is a cyclic group, prove that H is also cyclic.
6. a) Consider the group $(\mathbf{Z}_2 \times \mathbf{Z}_2, \oplus)$ where, for $a, b, c, d \in \mathbf{Z}_2$, $(a, b) \oplus (c, d) = (a + c, b + d)$; the sums a + c and b + d are computed using addition modulo 2. What is the value of $(1, 0) \oplus (0, 1) \oplus (1, 1)$ in this group?
   b) Now consider the group $(\mathbf{Z}_2 \times \mathbf{Z}_2 \times \mathbf{Z}_2, \oplus)$ where $(a, b, c) \oplus (d, e, f) = (a + d, b + e, c + f)$. (Here the sums a + d, b + e, c + f are computed using addition modulo 2.) What do we obtain when we add the seven nonzero (or nonidentity) elements of this group?
   c) State and prove a generalization that includes the results in parts (a) and (b).
7. Let $(G, \circ)$ be a group where

   $$x \circ a \circ y = b \circ a \circ c \Rightarrow x \circ y = b \circ c,$$

   for all $a, b, c, x, y \in G$. Prove that $(G, \circ)$ is an abelian group.
8. For $k, n \in \mathbf{Z}^+$ with $n \ge k \ge 1$, let Q(n, k) count the number of permutations $\pi \in S_n$ where any representation of $\pi$, as a product of disjoint cycles, contains no cycle of length greater than k. Verify that

   $$Q(n + 1, k) = \sum_{i=0}^{k-1} \binom{n}{i} (i!)\, Q(n - i, k).$$

9. For $k, n \in \mathbf{Z}^+$ where $n \ge 2$ and $1 \le k \le n$, let P(n, k) denote the number of permutations $\pi \in S_n$ that have k cycles. [For example, (1)(23) is counted in P(3, 2), (12)(34) is counted in P(4, 2), and (1)(23)(4) is counted in P(4, 3).]
   a) Verify that P(n + 1, k) = P(n, k − 1) + nP(n, k).
   b) Determine $\sum_{k=1}^{n} P(n, k)$.
10. For $n \ge 1$, if $\sigma, \tau \in S_n$, define the distance $d(\sigma, \tau)$ between $\sigma$ and $\tau$ by

    $$d(\sigma, \tau) = \max\{|\sigma(i) - \tau(i)| : 1 \le i \le n\}.$$

   a) Prove that the following properties hold for d.
      i) $d(\sigma, \tau) \ge 0$ for all $\sigma, \tau \in S_n$
      ii) $d(\sigma, \tau) = 0$ if and only if $\sigma = \tau$
      iii) $d(\sigma, \tau) = d(\tau, \sigma)$ for all $\sigma, \tau \in S_n$
      iv) $d(\rho, \tau) \le d(\rho, \sigma) + d(\sigma, \tau)$, for all $\rho, \sigma, \tau \in S_n$
   b) Let $\epsilon$ denote the identity element of $S_n$ (that is, $\epsilon(i) = i$ for all $1 \le i \le n$). If $\pi \in S_n$ and $d(\pi, \epsilon) \le 1$, what can we say about $\pi(n)$?
   c) For $n \ge 1$ let $a_n$ count the number of permutations $\pi$ in $S_n$, where $d(\pi, \epsilon) \le 1$. Find and solve a recurrence relation for $a_n$.
11. Wilson’s Theorem [in part (d) of Exercise 19 of Section 16.1] tells us that $(p - 1)! \equiv -1 \pmod{p}$, for p a prime.
   a) Is the converse of this theorem true or false; that is, if $n \in \mathbf{Z}^+$ and $n \ge 2$, does $(n - 1)! \equiv -1 \pmod{n} \Rightarrow n$ is prime?
   b) For p an odd prime, prove that $2(p - 3)! \equiv -1 \pmod{p}$.
12. In how many ways can Nicole paint the eight regions of the square shown in Fig. 16.17 if
   a) five colors are available?
   b) she actually uses exactly four of the five available colors?

[Figure 16.17]
17
Finite Fields and Combinatorial Designs

It is time now to recall the ring structure of Chapter 14 as we examine rings of polynomials and their role in the construction of finite fields. We know that for every prime p, $(\mathbf{Z}_p, +, \cdot)$ is a finite field, but here we shall find other finite fields. Just as the order of a finite Boolean algebra is restricted to powers of 2, for finite fields the possible orders are $p^n$, where p is a prime and $n \in \mathbf{Z}^+$. Applications of these finite fields will include a discussion of such combinatorial designs as Latin squares. Finally, we shall investigate the structure of a finite geometry and discover how these geometries and combinatorial designs are interrelated.

17.1
             Polynomial Rings
We recall that a ring $(R, +, \cdot)$ consists of a nonempty set R, where (R, +) is an abelian group, $(R, \cdot)$ is closed under the associative operation $\cdot$, and the two operations are related by the distributive laws: a(b + c) = ab + ac and (b + c)a = ba + ca, for all $a, b, c \in R$. (We write ab for $a \cdot b$.)

In order to introduce the formal concept of a polynomial with coefficients in R we let x denote an indeterminate; that is, a formal symbol that is not an element of the ring R. We then use this symbol x to define the following.

Definition 17.1      Given a ring $(R, +, \cdot)$, an expression of the form $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x^1 + a_0 x^0$, where $a_i \in R$ for all $0 \le i \le n$, is called a polynomial in the indeterminate x with coefficients from R.

If $a_n$ is not the zero element of R, then $a_n$ is called the leading coefficient of f(x) and we say that f(x) has degree n. Hence the degree of a polynomial is the highest power of x that occurs in a summand of the polynomial. The term $a_0 x^0$ is called the constant, or constant term, of f(x).

If $g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x^1 + b_0 x^0$ is also a polynomial in x over R, then f(x) = g(x) if m = n and $a_i = b_i$ for all $0 \le i \le n$.

Finally, we use the notation R[x] to represent the set of all polynomials in the indeterminate x with coefficients from R.


EXAMPLE 17.1
a) Over the ring $R = (\mathbf{Z}_6, +, \cdot)$, the expression $5x^2 + 3x^1 - 2x^0$ is a polynomial of degree 2, with leading coefficient 5 and constant term $-2x^0$. As before, here we are using a to denote [a] in $\mathbf{Z}_6$. This polynomial may also be written as $5x^2 + 3x^1 + 4x^0$ since [4] = [−2] in $\mathbf{Z}_6$.
b) If z is the zero element of ring R, then the zero polynomial $zx^0 = z$ is also the zero element of R[x] and is said to have no degree and no leading coefficient. A polynomial over R that is the zero element or is of degree 0 is called a constant polynomial. For example, the polynomial $5x^0$ over $\mathbf{Z}_7$ has degree 0 and leading coefficient 5 and is a constant polynomial.

For a ring of coefficients $(R, +, \cdot)$, let

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x^1 + a_0 x^0$$
$$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x^1 + b_0 x^0,$$

where $a_i, b_j \in R$ for all $0 \le i \le n$, $0 \le j \le m$. We introduce (closed binary) operations of addition and multiplication for these polynomials in order to obtain a new ring.

Assume that $n \ge m$. We define

$$f(x) + g(x) = \sum_{i=0}^{n} (a_i + b_i) x^i, \qquad (1)$$

where $b_i = z$ for i > m, and

$$f(x)g(x) = (a_n b_m) x^{n+m} + (a_n b_{m-1} + a_{n-1} b_m) x^{n+m-1} + \cdots + (a_1 b_0 + a_0 b_1) x^1 + (a_0 b_0) x^0. \qquad (2)$$

In the definition of f(x) + g(x), the coefficient $(a_i + b_i)$, for each $0 \le i \le n$, is obtained from the addition of elements in R. For f(x)g(x), the coefficient of $x^t$ is $\sum_{k=0}^{t} a_{t-k} b_k$, where all additions and multiplications occur within R, and $0 \le t \le n + m$. Here is one such example to demonstrate the types of calculations that are involved.

Let $f(x) = 4x^3 + 2x^2 + 3x^1 + 1x^0$ and $g(x) = 3x^2 + x^1 + 2x^0$ be polynomials from $\mathbf{Z}_5[x]$. Here

$$a_3 = 4, \quad a_2 = 2, \quad a_1 = 3, \quad a_0 = 1,$$

and

$$b_2 = 3, \quad b_1 = 1, \quad b_0 = 2.$$

For all $n \ge 4$ we find that $a_n = 0$. When $m \ge 3$ we have $b_m = 0$. Using the definitions in Eqs. (1) and (2), where the addition and multiplication of the coefficients are now performed modulo 5, we obtain

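$$f(x) + g(x) = (4 + 0)x^3 + (2 + 3)x^2 + (3 + 1)x^1 + (1 + 2)x^0 = 4x^3 + 0x^2 + 4x^1 + 3x^0 = 4x^3 + 4x^1 + 3x^0$$

and

$$f(x)g(x) = \left(\sum_{k=0}^{5} a_{5-k} b_k\right) x^5 + \left(\sum_{k=0}^{4} a_{4-k} b_k\right) x^4 + \left(\sum_{k=0}^{3} a_{3-k} b_k\right) x^3 + \left(\sum_{k=0}^{2} a_{2-k} b_k\right) x^2 + \left(\sum_{k=0}^{1} a_{1-k} b_k\right) x^1 + (a_0 b_0) x^0$$
$$= (0 \cdot 2 + 0 \cdot 1 + 4 \cdot 3 + 2 \cdot 0 + 3 \cdot 0 + 1 \cdot 0)x^5 + (0 \cdot 2 + 4 \cdot 1 + 2 \cdot 3 + 3 \cdot 0 + 1 \cdot 0)x^4 + (4 \cdot 2 + 2 \cdot 1 + 3 \cdot 3 + 1 \cdot 0)x^3 + (2 \cdot 2 + 3 \cdot 1 + 1 \cdot 3)x^2 + (3 \cdot 2 + 1 \cdot 1)x^1 + (1 \cdot 2)x^0$$
$$= 2x^5 + 0x^4 + 4x^3 + 0x^2 + 2x^1 + 2x^0 = 2x^5 + 4x^3 + 2x^1 + 2x^0.$$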
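The calculations in this example can be reproduced with a short program. The sketch below is an illustration only: it assumes the coefficient ring is $\mathbf{Z}_n$ and stores a polynomial as a list in which entry i holds the coefficient of $x^i$. The names `poly_add`, `poly_mul`, and `trim` are introduced here and do not come from the text.

```python
def trim(coeffs):
    """Drop zero coefficients of the highest powers; [] is the zero polynomial."""
    while coeffs and coeffs[-1] == 0:
        coeffs = coeffs[:-1]
    return coeffs

def poly_add(f, g, n):
    """Eq. (1): add coefficients of like powers, working modulo n."""
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))   # pad the shorter list with zeros
    g = g + [0] * (size - len(g))
    return trim([(a + b) % n for a, b in zip(f, g)])

def poly_mul(f, g, n):
    """Eq. (2): the coefficient of x^t is the sum of a_i * b_j with i + j = t."""
    result = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] = (result[i + j] + a * b) % n
    return trim(result)

f = [1, 3, 2, 4]   # 4x^3 + 2x^2 + 3x + 1
g = [2, 1, 3]      # 3x^2 + x + 2

print(poly_add(f, g, 5))   # coefficients of f(x) + g(x), lowest power first
print(poly_mul(f, g, 5))   # coefficients of f(x)g(x), lowest power first
```

Read from the highest power down, the two printed lists correspond to $4x^3 + 4x^1 + 3x^0$ and $2x^5 + 4x^3 + 2x^1 + 2x^0$, in agreement with the hand computation.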

The closed binary operations defined in Eqs. (1) and (2) were designed to give us the following result.

THEOREM 17.1      If R is a ring, then under the operations of addition and multiplication given in Eqs. (1) and (2), $(R[x], +, \cdot)$ is a ring, called the polynomial ring, or ring of polynomials, over R.
Proof: The ring properties for R[x] hinge upon those of R. Consequently, we shall prove the associative law of multiplication here, as an example, and shall then leave the proofs of the other properties to the reader. Let $h(x) = \sum_{k=0}^{p} c_k x^k$, with f(x), g(x) as defined earlier. A typical summand in (f(x)g(x))h(x) has the form $Ax^t$, where $0 \le t \le (m + n) + p$ and A is the sum of all products of the form $(a_i b_j)c_k$, with $0 \le i \le n$, $0 \le j \le m$, $0 \le k \le p$, and i + j + k = t. In f(x)(g(x)h(x)) the coefficient of $x^t$ is the sum of all products of the form $a_i(b_j c_k)$, again with $0 \le i \le n$, $0 \le j \le m$, $0 \le k \le p$, and i + j + k = t. Since R is associative under multiplication, $(a_i b_j)c_k = a_i(b_j c_k)$ for each of these terms, and so the coefficient of $x^t$ in (f(x)g(x))h(x) is the same as it is in f(x)(g(x)h(x)). Hence (f(x)g(x))h(x) = f(x)(g(x)h(x)).

COROLLARY 17.1    Let R[x] be a polynomial ring.

a) If R is commutative, then R[x] is commutative.
                    b) If R is a ring with unity, then R[x] is a ring with unity.
                    c) R[x] is an integral domain if and only if R is an integral domain.
                  Proof: The proof of this corollary is left for the reader.

From this point on, we shall write x instead of x^1. If R has unity u, we define x^0 = u, and for all r ∈ R we write rx^0 as r.

EXAMPLE 17.2      Let f(x), g(x) ∈ Z_8[x] with f(x) = 4x^2 + 1 and g(x) = 2x + 3. Then f(x) has degree 2 and g(x) has degree 1. From our past experiences with polynomials, we expect the degree of f(x)g(x) to be 3, the sum of the degrees of f(x) and g(x). Here, however, f(x)g(x) = (4x^2 + 1)(2x + 3) = 8x^3 + 12x^2 + 2x + 3 = 4x^2 + 2x + 3 because [8] = [0] in Z_8. So degree f(x)g(x) = 2 < 3 = degree f(x) + degree g(x).

The cause of the phenomenon in Example 17.2 is the existence of proper divisors of zero in the ring Z_8. This observation leads us to the following theorem.

THEOREM 17.2      Let (R, +, ·) be a commutative ring with unity u. Then R is an integral domain if and only if for all f(x), g(x) ∈ R[x], if neither f(x) nor g(x) is the zero polynomial, then

degree f(x)g(x) = degree f(x) + degree g(x).

Proof: Let f(x) = Σ_{i=0}^{n} a_i x^i, g(x) = Σ_{j=0}^{m} b_j x^j, with a_n ≠ z, b_m ≠ z. If R is an integral domain, then a_n b_m ≠ z, so degree f(x)g(x) = n + m = degree f(x) + degree g(x). Conversely, if R is not an integral domain, let a, b ∈ R with a ≠ z, b ≠ z, but ab = z. The polynomials f(x) = ax + u, g(x) = bx + u each have degree 1, but f(x)g(x) = abx^2 + (a + b)x + u = (a + b)x + u, so degree f(x)g(x) ≤ 1 < 2 = degree f(x) + degree g(x).

Before we can proceed we need to recall an idea that was introduced in Section 14.2, in Exercise 21. If R is a ring with unity u and r ∈ R, we define r^0 = u, r^1 = r, and r^{n+1} = r^n r for all n ∈ Z^+. [From these definitions one can show, for example, that for all m, n ∈ Z^+, (r^m)(r^n) = r^{m+n} and (r^m)^n = r^{mn}.] So now we continue as follows.
Let R be a ring with unity u and let f(x) = a_n x^n + ⋯ + a_1 x + a_0 ∈ R[x]. If r ∈ R, then f(r) = a_n r^n + ⋯ + a_1 r + a_0 ∈ R. We are especially interested in those values of r for which f(r) = z, and this interest leads us to the following concept.

Definition 17.2   Let R be a ring with unity u and let f(x) ∈ R[x], with degree f(x) ≥ 1. If r ∈ R and f(r) = z, then r is called a root of the polynomial f(x).

EXAMPLE 17.3
a) If f(x) = x^2 − 2 ∈ R[x], then f(x) has √2 and −√2 as roots because (√2)^2 − 2 = 0 = (−√2)^2 − 2. In addition, we can write f(x) = (x − √2)(x + √2), with x − √2, x + √2 ∈ R[x]. However, if we regard f(x) as an element of Q[x], then f(x) has no roots because √2 and −√2 are irrational numbers. Consequently, the existence of roots for a polynomial is dependent on the underlying ring of coefficients.
b) For f(x) = x^2 + 3x + 2 ∈ Z_6[x], we find that

f(0) = (0)^2 + 3(0) + 2 = 2           f(3) = (3)^2 + 3(3) + 2 = 20 = 2
f(1) = (1)^2 + 3(1) + 2 = 6 = 0       f(4) = (4)^2 + 3(4) + 2 = 30 = 0
f(2) = (2)^2 + 3(2) + 2 = 12 = 0      f(5) = (5)^2 + 3(5) + 2 = 42 = 0

Consequently, f(x) has four roots: 1, 2, 4, and 5. This is more than we expected. In our prior experiences, a polynomial of degree 2 had at most two roots.
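Because Z_n is finite, the roots of a polynomial in Z_n[x] can be found by testing every element. The following Python sketch does this for part (b); the function name roots_mod_n and the choice to list coefficients from the leading term down are assumptions made for illustration.

```python
def roots_mod_n(coeffs, n):
    """Return every r in Z_n with f(r) = 0; coeffs lists f from the highest degree down."""
    found = []
    for r in range(n):
        value = 0
        for c in coeffs:              # Horner's rule, reducing mod n at each step
            value = (value * r + c) % n
        if value == 0:
            found.append(r)
    return found


roots = roots_mod_n([1, 3, 2], 6)     # f(x) = x^2 + 3x + 2 in Z_6[x]
print(roots)
```

Running the same search with a prime modulus such as 7 returns only two roots for this polynomial, in keeping with the observation that the surplus of roots comes from the coefficient ring.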

In this chapter we shall be primarily concerned with polynomial rings F[x], where F is a field (and F[x] is an integral domain). Consequently, we shall not dwell any further on situations where degree f(x)g(x) < degree f(x) + degree g(x). In addition, unless it is stated otherwise, we shall denote the zero element of a field by 0 and use 1 to denote its unity.
As a result of Example 17.3(b), we shall now develop the concepts needed to find out when a polynomial of degree n has at most n roots.

Definition 17.3   Let F be a field. For f(x), g(x) ∈ F[x], where f(x) is not the zero polynomial, we call f(x) a divisor (or factor) of g(x) if there exists h(x) ∈ F[x] with f(x)h(x) = g(x). In this situation we also say that f(x) divides g(x) and that g(x) is a multiple of f(x).

This leads to the division algorithm for polynomials. Before proving the general result,
               however, we shall examine two particular examples.

EXAMPLE 17.4      Early in algebra we were taught how to perform the long division of polynomials with real coefficients. Given two polynomials f(x), g(x) with degree f(x) ≤ degree g(x), we organized our work in the form

            q_1(x) + q_2(x) + ⋯ + q_t(x)      (= q(x))
     f(x) ) g(x)
            f(x)q_1(x)
            g(x) − f(x)q_1(x)
              ⋮
            r(x)

where we continued to divide until we found either

r(x) = 0     or     degree r(x) < degree f(x).

It then followed that g(x) = q(x)f(x) + r(x).
For example, if f(x) = x − 3 and g(x) = 7x^3 − 2x^2 + 5x − 2, then f(x), g(x) ∈ Q[x] (or R[x], or C[x]), and we find

                 7x^2 + 19x  +  62      (= q(x))
     x − 3 ) 7x^3 −  2x^2 +  5x −   2
             7x^3 − 21x^2
                    19x^2 +  5x −   2
                    19x^2 − 57x
                            62x −   2
                            62x − 186
                                  184   (= r(x))

Checking these results, we have

q(x)f(x) + r(x) = (7x^2 + 19x + 62)(x − 3) + 184 = 7x^3 − 2x^2 + 5x − 2 = g(x).

EXAMPLE 17.5      The technique illustrated in Example 17.4 also applies when the coefficients of our polynomials are taken from a finite field.
If f(x) = 3x^2 + 4x + 2 and g(x) = 6x^4 + 4x^3 + 5x^2 + 3x + 1 are polynomials in Z_7[x], then the process of long division provides the following calculations:

                            2x^2 +   x  + 6          (= q(x))
     3x^2 + 4x + 2 ) 6x^4 + 4x^3 + 5x^2 + 3x + 1
                     6x^4 +  x^3 + 4x^2
                            3x^3 +  x^2 + 3x + 1
                            3x^3 + 4x^2 + 2x
                                   4x^2 +  x + 1
                                   4x^2 + 3x + 5
                                          5x + 3     (= r(x))

Performing all arithmetic in Z_7, we find (as in Example 17.4) that

q(x)f(x) + r(x) = (2x^2 + x + 6)(3x^2 + 4x + 2) + (5x + 3)
                = 6x^4 + 4x^3 + 5x^2 + 3x + 1 = g(x).
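The long division of Example 17.5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the name poly_divmod is ours, coefficients are listed from the leading term down, the modulus p is assumed prime, and the three-argument pow with exponent −1 (available in Python 3.8 and later) supplies the inverse of the leading coefficient.

```python
def poly_divmod(g, f, p):
    """Divide g by f in Z_p[x] (p prime); coefficient lists run from the highest degree down.

    Returns (q, r) with g = q*f + r, where r == [] (the zero polynomial)
    or degree r < degree f.
    """
    g = [c % p for c in g]
    f = [c % p for c in f]
    if not f or f[0] == 0:
        raise ValueError("divisor must have a nonzero leading coefficient")
    lead_inv = pow(f[0], -1, p)           # inverse of the leading coefficient of f
    q = []
    r = g[:]
    while len(r) >= len(f):
        factor = (r[0] * lead_inv) % p    # next coefficient of the quotient
        q.append(factor)
        for i, c in enumerate(f):         # subtract factor * f, aligned at the leading term
            r[i] = (r[i] - factor * c) % p
        r.pop(0)                          # the leading term has been cancelled
    while r and r[0] == 0:                # drop leading zeros of the remainder
        r.pop(0)
    return q, r


# g(x) = 6x^4 + 4x^3 + 5x^2 + 3x + 1 divided by f(x) = 3x^2 + 4x + 2 in Z_7[x]
q, r = poly_divmod([6, 4, 5, 3, 1], [3, 4, 2], 7)
print(q, r)
```

Each pass through the loop corresponds to one row pair of the long-division layout: a quotient coefficient is chosen so that the current leading term cancels, and the shifted multiple of f(x) is subtracted.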

We turn now to the general situation.

THEOREM 17.3      Division Algorithm. Let f(x), g(x) ∈ F[x] with f(x) not the zero polynomial. There exist unique polynomials q(x), r(x) ∈ F[x] such that g(x) = q(x)f(x) + r(x), where r(x) = 0 or degree r(x) < degree f(x).
Proof: Let S = {g(x) − t(x)f(x) | t(x) ∈ F[x]}.
If 0 ∈ S, then 0 = g(x) − t(x)f(x) for some t(x) ∈ F[x]. Then with q(x) = t(x) and r(x) = 0, we have g(x) = q(x)f(x) + r(x).
If 0 ∉ S, consider the degrees of the elements of S, and let r(x) = g(x) − q(x)f(x) be an element in S of minimum degree. Since r(x) ≠ 0, the result follows if degree r(x) < degree f(x). If not, let

r(x) = a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_2 x^2 + a_1 x + a_0,     a_n ≠ 0,
f(x) = b_m x^m + b_{m−1} x^{m−1} + ⋯ + b_2 x^2 + b_1 x + b_0,     b_m ≠ 0,

with n ≥ m. Define

h(x) = r(x) − [a_n b_m^{−1} x^{n−m}] f(x)
     = (a_n − a_n b_m^{−1} b_m)x^n + (a_{n−1} − a_n b_m^{−1} b_{m−1})x^{n−1}
       + ⋯ + (a_{n−m} − a_n b_m^{−1} b_0)x^{n−m} + a_{n−m−1} x^{n−m−1} + ⋯ + a_1 x + a_0.

Then h(x) has degree less than n, the degree of r(x). More important, h(x) = [g(x) − q(x)f(x)] − [a_n b_m^{−1} x^{n−m}] f(x) = g(x) − [q(x) + a_n b_m^{−1} x^{n−m}] f(x), so h(x) ∈ S and this contradicts the choice of r(x) as having minimum degree. Consequently, degree r(x) < degree f(x) and we have the existence part of the theorem.
For uniqueness, let g(x) = q_1(x)f(x) + r_1(x) = q_2(x)f(x) + r_2(x) where r_1(x) = 0 or degree r_1(x) < degree f(x), and r_2(x) = 0 or degree r_2(x) < degree f(x). Then [q_2(x) − q_1(x)]f(x) = r_1(x) − r_2(x), and if q_2(x) − q_1(x) ≠ 0, then degree ([q_2(x) − q_1(x)]f(x)) ≥ degree f(x), whereas r_1(x) − r_2(x) = 0 or degree [r_1(x) − r_2(x)] ≤ max{degree r_1(x), degree r_2(x)} < degree f(x). Consequently, q_1(x) = q_2(x), and r_1(x) = r_2(x).

The division algorithm provides the following results on roots and factors.

THEOREM 17.4      The Remainder Theorem. For f(x) ∈ F[x] and a ∈ F, the remainder in the division of f(x) by x − a is f(a).
Proof: From the division algorithm, f(x) = q(x)(x − a) + r(x), with r(x) = 0 or degree r(x) < degree (x − a) = 1. Hence r(x) = r is an element of F. Substituting a for x, we find f(a) = q(a)(a − a) + r = 0 + r = r.

THEOREM 17.5      The Factor Theorem. If f(x) ∈ F[x] and a ∈ F, then x − a is a factor of f(x) if and only if a is a root of f(x).
Proof: If x − a is a factor of f(x), then f(x) = q(x)(x − a). With f(a) = q(a)(a − a) = 0, it follows that a is a root of f(x). Conversely, suppose that a is a root of f(x). By the division algorithm, f(x) = q(x)(x − a) + r, where r ∈ F. Since f(a) = 0 we have r = 0, so f(x) = q(x)(x − a), and x − a is a factor of f(x).

EXAMPLE 17.6
a) Let f(x) = x^7 − 6x^5 + 4x^4 − x^2 + 3x − 7 ∈ Q[x]. From the remainder theorem it follows that when f(x) is divided by x − 2, the remainder is

f(2) = 2^7 − 6(2^5) + 4(2^4) − 2^2 + 3(2) − 7 = −5.

If we were to divide f(x) by x + 1, then the remainder would be f(−1) = −2.
b) If g(x) = x^5 + 3x^4 + x^3 + x^2 + 2x + 2 ∈ Z_5[x] is divided by x − 1, then the remainder here is g(1) = 1 + 3 + 1 + 1 + 2 + 2 = 0 (in Z_5). Consequently, x − 1 divides g(x), and by the factor theorem,

g(x) = q(x)(x − 1)     (where degree q(x) = 4).
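By the remainder theorem, each remainder in Example 17.6 is just a value of the polynomial, so it can be computed without carrying out the division. The Python sketch below uses Horner's rule; the function name evaluate and its optional modulus parameter are assumptions introduced for illustration.

```python
def evaluate(coeffs, a, modulus=None):
    """Evaluate f(a) by Horner's rule; coeffs run from the highest degree down.

    When modulus is given, the arithmetic is carried out in Z_modulus.
    """
    value = 0
    for c in coeffs:
        value = value * a + c
        if modulus is not None:
            value %= modulus
    return value


# f(x) = x^7 - 6x^5 + 4x^4 - x^2 + 3x - 7 in Q[x]
f = [1, 0, -6, 4, 0, -1, 3, -7]
# g(x) = x^5 + 3x^4 + x^3 + x^2 + 2x + 2 in Z_5[x]
g = [1, 3, 1, 1, 2, 2]

rem_f_at_2 = evaluate(f, 2)           # remainder on division by x - 2
rem_f_at_minus_1 = evaluate(f, -1)    # remainder on division by x + 1
rem_g_at_1 = evaluate(g, 1, 5)        # remainder on division by x - 1 in Z_5[x]
print(rem_f_at_2, rem_f_at_minus_1, rem_g_at_1)
```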

Using the results of Theorems 17.4 and 17.5, we now establish the last major idea for
                      this section.

THEOREM 17.6      If f(x) ∈ F[x] has degree n ≥ 1, then f(x) has at most n roots in F.
Proof: The proof is by mathematical induction on the degree of f(x). If f(x) has degree 1, then f(x) = ax + b, for a, b ∈ F, a ≠ 0. With f(−a^{−1}b) = 0, f(x) has at least one root in F. If c_1 and c_2 are both roots, then f(c_1) = ac_1 + b = 0 = ac_2 + b = f(c_2). By cancellation in a ring, ac_1 + b = ac_2 + b ⇒ ac_1 = ac_2. Since F is a field and a ≠ 0, we have ac_1 = ac_2 ⇒ c_1 = c_2, so f(x) has only one root in F.
Now assume the result of the theorem is true for all polynomials of degree k (≥ 1) in F[x]. Consider a polynomial f(x) of degree k + 1. If f(x) has no roots in F, the theorem follows. Otherwise, let r ∈ F with f(r) = 0. By the factor theorem, f(x) = (x − r)g(x) where g(x) has degree k. Consequently, by the induction hypothesis, g(x) has at most k roots in F, and f(x), in turn, has at most k + 1 roots in F.

EXAMPLE 17.7
a) Let f(x) = x^2 − 6x + 9 ∈ R[x]. Then f(x) has at most two roots in R, namely, the roots 3, 3. So here we say that 3 is a root of multiplicity 2. In addition f(x) = (x − 3)(x − 3), a factorization into two first-degree, or linear, factors.
b) For g(x) = x^2 + 4 ∈ R[x], g(x) has no real roots, but Theorem 17.6 is not contradicted. (Why?) In C[x], g(x) has the roots 2i, −2i and can be factored as g(x) = (x − 2i)(x + 2i).
c) If h(x) = x^2 + 2x + 6 ∈ Z_7[x], then h(2) = 0, h(3) = 0 and these are the only roots. Also, h(x) = (x − 2)(x − 3) = x^2 − 5x + 6 = x^2 + 2x + 6, because [−5] = [2] in Z_7.
d) As we saw in Example 17.3(b), the polynomial x^2 + 3x + 2 has four roots. This is not a contradiction to Theorem 17.6 because Z_6 is not a field. Also, x^2 + 3x + 2 = (x + 1)(x + 2) = (x + 4)(x + 5), two distinct factorizations.

We close with one final remark, without proof, on the idea of factorization in F[x]. If f(x) ∈ F[x] has degree n, and r_1, r_2, ..., r_n are the roots of f(x) in F (where it is possible for a root to be repeated, that is, r_i = r_j for some 1 ≤ i < j ≤ n), then f(x) = a_n(x − r_1)(x − r_2) ⋯ (x − r_n), where a_n is the leading coefficient of f(x). This representation of f(x) is unique up to the order of the first-degree factors.

EXERCISES 17.1

1. Let f(x), g(x) ∈ Z_7[x] where f(x) = 2x^4 + 2x^3 + 3x^2 + x + 4 and g(x) = 3x^3 + 5x^2 + 6x + 1. Determine f(x) + g(x), f(x) − g(x), and f(x)g(x).

2. Determine all of the polynomials of degree 2 in Z_2[x].

3. How many polynomials are there of degree 2 in Z_11[x]? How many have degree 3? degree 4? degree n, for n ∈ N?

4. a) Find two nonzero polynomials f(x), g(x) in Z_12[x] where f(x)g(x) = 0.
   b) Find polynomials h(x), k(x) ∈ Z_12[x] such that degree h(x) = 5, degree k(x) = 2, and degree h(x)k(x) = 3.

5. Complete the proofs of Theorem 17.1 and Corollary 17.1.

6. For each of the following pairs f(x), g(x), find q(x), r(x) so that g(x) = q(x)f(x) + r(x), where r(x) = 0 or degree r(x) < degree f(x).
   a) f(x), g(x) ∈ Q[x], f(x) = x^4 − 5x^3 + 7x, g(x) = x^5 − 2x^2 + 5x − 3
   b) f(x), g(x) ∈ Z_2[x], f(x) = x^2 + 1, g(x) = x^4 + x^3 + x^2 + x + 1
   c) f(x), g(x) ∈ Z_5[x], f(x) = x^2 + 3x + 1, g(x) = x^4 + 2x^3 + x + 4

7. a) If f(x) = x^4 − 16, find its roots and factorization in Q[x].
   b) Answer part (a) for f(x) ∈ R[x].
   c) Answer part (a) for f(x) ∈ C[x].
   d) Answer parts (a), (b), and (c) for f(x) = x^4 − 25.

8. a) Find all roots of f(x) = x^2 + 4x if f(x) ∈ Z_12[x].
   b) Find four distinct linear polynomials g(x), h(x), s(x), t(x) ∈ Z_12[x] so that f(x) = g(x)h(x) = s(x)t(x).
   c) Do the results in part (b) contradict the statements made in the paragraph following Example 17.7?

9. In each of the following, find the remainder when f(x) is divided by g(x).
   a) f(x), g(x) ∈ Q[x], f(x) = x^8 + 7x^5 − 4x^4 + 3x^3 + 5x^2 − 4, g(x) = x − 3
   b) f(x), g(x) ∈ Z_2[x], f(x) = x^100 + x^90 + x^80 + x^50 + 1, g(x) = x − 1
   c) f(x), g(x) ∈ Z_11[x], f(x) = 3x^5 − 8x^4 + x^3 − x^2 + 4x − 7, g(x) = x + 9

10. For each of the following polynomials f(x) ∈ Z_7[x], determine all of the roots in Z_7 and then write f(x) as a product of first-degree polynomials.
   a) f(x) = x^3 + 5x^2 + 2x + 6
   b) f(x) = x^7 − x

11. How many units are there in the ring Z_5[x]? How many in Z_7[x]? How many in Z_p[x], p a prime?

12. Given a field F, let f(x) ∈ F[x] where f(x) = a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_2 x^2 + a_1 x + a_0. Prove that x − 1 is a factor of f(x) if and only if a_n + a_{n−1} + ⋯ + a_2 + a_1 + a_0 = 0.

13. Let R, S be rings, and let g: R → S be a ring homomorphism. Prove that the function G: R[x] → S[x] defined by

   G(Σ_{i=0}^{n} r_i x^i) = Σ_{i=0}^{n} g(r_i) x^i

is a ring homomorphism.

14. If R is an integral domain, prove that if f(x) is a unit in R[x], then f(x) is a constant and is a unit in R.

15. Verify that f(x) = 2x + 1 is a unit in Z_4[x]. Does this contradict the result of Exercise 14?

16. For n ∈ Z^+, n ≥ 2, let f(x) ∈ Z_n[x]. Prove that if a, b ∈ Z and a ≡ b (mod n), then f(a) ≡ f(b) (mod n).

17. If F is a field, let S ⊆ F[x] where f(x) = a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_2 x^2 + a_1 x + a_0 ∈ S if and only if a_n + a_{n−1} + ⋯ + a_2 + a_1 + a_0 = 0. Prove that S is an ideal of F[x].

18. Let (R, +, ·) be a ring. If I is an ideal of R, prove that I[x], the set of all polynomials in the indeterminate x with coefficients in I, is an ideal in R[x].

17.2
      Irreducible Polynomials: Finite Fields
We now wish to construct finite fields other than those of the type (Z_p, +, ·), where p is a prime. The construction will use the following special polynomials.

Definition 17.4   Let f(x) ∈ F[x], with F a field and degree f(x) ≥ 2. We call f(x) reducible (over F) if there exist g(x), h(x) ∈ F[x], where f(x) = g(x)h(x) and each of g(x), h(x) has degree ≥ 1. If f(x) is not reducible it is called irreducible, or prime.

Theorem 17.7 contains some useful observations about irreducible polynomials.

THEOREM 17.7      For polynomials in F[x],

a) every nonzero polynomial of degree ≤ 1 is irreducible.
b) if f(x) ∈ F[x] with degree f(x) = 2 or 3, then f(x) is reducible if and only if f(x) has a root in the field F.

Proof: The proof is left for the reader.

EXAMPLE 17.8
a) The polynomial x^2 + 1 is irreducible in Q[x] and R[x], but in C[x] we find x^2 + 1 = (x + i)(x − i).
b) Let f(x) = x^4 + 2x^2 + 1 ∈ R[x]. Although f(x) has no real roots, it is reducible because (x^2 + 1)^2 = x^4 + 2x^2 + 1. Hence part (b) of Theorem 17.7 is not applicable for polynomials of degree > 3.
c) In Z_2[x], f(x) = x^3 + x^2 + x + 1 is reducible because f(1) = 0. But g(x) = x^2 + x + 1 is irreducible because g(0) = g(1) = 1.
d) Let h(x) = x^4 + x^3 + x^2 + x + 1 ∈ Z_2[x]. Is h(x) reducible in Z_2[x]? Since h(0) = h(1) = 1, h(x) has no first-degree factors, but perhaps we can find a, b, c, d ∈ Z_2 such that (x^2 + ax + b)(x^2 + cx + d) = x^4 + x^3 + x^2 + x + 1.
By expanding (x^2 + ax + b)(x^2 + cx + d) and comparing coefficients of like powers of x, we find a + c = 1, ac + b + d = 1, ad + bc = 1, and bd = 1. With bd = 1, it follows that b = 1 and d = 1, so ac + b + d = 1 ⇒ ac = 1 ⇒ a = c = 1 ⇒ a + c = 0. This contradicts a + c = 1. Consequently, h(x) is irreducible in Z_2[x].
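Since Z_2 has only two elements, the argument in part (d) can also be confirmed by exhaustive search over the sixteen choices of a, b, c, d. The Python sketch below is an illustration only; the helper name mul_mod2 and the convention of listing coefficients from x^0 upward are our own assumptions.

```python
from itertools import product


def mul_mod2(f, g):
    """Multiply two polynomials over Z_2; coefficient lists run from the lowest degree up."""
    result = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] ^= a & b        # in Z_2, addition is XOR and multiplication is AND
    return result


h = [1, 1, 1, 1, 1]                       # h(x) = x^4 + x^3 + x^2 + x + 1

# h(0) and h(1): a value of 0 would signal a first-degree factor.
values = [sum(c * r**k for k, c in enumerate(h)) % 2 for r in (0, 1)]

# Try every pair of monic quadratics x^2 + ax + b and x^2 + cx + d over Z_2.
factorizations = [
    (a, b, c, d)
    for a, b, c, d in product((0, 1), repeat=4)
    if mul_mod2([b, a, 1], [d, c, 1]) == h
]
print(values, factorizations)
```

An empty list of factorizations, together with the two nonzero values, agrees with the conclusion that h(x) is irreducible in Z_2[x].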

All of the polynomials in Example 17.8 share a common property, which we shall now
                     define.

Definition 17.5   A polynomial f(x) € F[x] is called monic if its leading coefficient is 1, the unity of F.

Some of our next results (up to and including the discussion in Example 17.11) awaken
                     memories of Chapters 4 and 14.

Definition 17.6   If f(x), g(x) ∈ F[x], then h(x) ∈ F[x] is a greatest common divisor of f(x) and g(x)

a) if h(x) divides each of f(x) and g(x), and
b) if k(x) ∈ F[x] and k(x) divides both f(x), g(x), then k(x) divides h(x).

We now state the following results on the existence and uniqueness of what we shall
                              call the greatest common divisor, which we shall abbreviate as gcd. Furthermore, there is a
                              method for finding this gcd that is called the Euclidean algorithm for polynomials. A proof
                              for the first result is outlined in the Section Exercises.

THEOREM 17.8      Let f(x), g(x) ∈ F[x], with at least one of f(x), g(x) not the zero polynomial. Then each polynomial of minimum degree that can be written as a linear combination of f(x) and g(x) (that is, in the form s(x)f(x) + t(x)g(x), for s(x), t(x) ∈ F[x]) will be a greatest common divisor of f(x), g(x). If we require a gcd to be monic, then it will be unique.

THEOREM 17.9      Euclidean Algorithm for Polynomials. Let f(x), g(x) ∈ F[x] with degree f(x) ≤ degree g(x) and f(x) ≠ 0. Applying the division algorithm, we write

g(x) = q(x)f(x) + r(x),                        degree r(x) < degree f(x)
f(x) = q_1(x)r(x) + r_1(x),                    degree r_1(x) < degree r(x)
r(x) = q_2(x)r_1(x) + r_2(x),                  degree r_2(x) < degree r_1(x)
  ⋮
r_{k−2}(x) = q_k(x)r_{k−1}(x) + r_k(x),        degree r_k(x) < degree r_{k−1}(x)
r_{k−1}(x) = q_{k+1}(x)r_k(x) + r_{k+1}(x),    r_{k+1}(x) = 0.

Then r_k(x), the last nonzero remainder, is a greatest common divisor of f(x), g(x), and is a constant multiple of the monic greatest common divisor of f(x), g(x). [Multiplying r_k(x) by the inverse of its leading coefficient allows us to obtain the unique monic polynomial we call the greatest common divisor.]
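The chain of divisions in Theorem 17.9 can be sketched in Python for F = Z_p, p a prime. Everything named here (trim, poly_mod, poly_gcd, and the sample polynomials) is a hypothetical choice made for illustration; coefficients are listed from the leading term down, and pow with exponent −1 requires Python 3.8 or later.

```python
def trim(poly, p):
    """Reduce coefficients mod p and drop leading zeros (highest degree first)."""
    poly = [c % p for c in poly]
    while poly and poly[0] == 0:
        poly.pop(0)
    return poly


def poly_mod(g, f, p):
    """Remainder of g on division by f in Z_p[x]; f must be nonzero and p prime."""
    r = trim(g, p)
    f = trim(f, p)
    lead_inv = pow(f[0], -1, p)
    while len(r) >= len(f):
        factor = (r[0] * lead_inv) % p
        for i, c in enumerate(f):         # cancel the leading term of r
            r[i] = (r[i] - factor * c) % p
        r = trim(r, p)
    return r


def poly_gcd(f, g, p):
    """Monic greatest common divisor of f and g (not both zero) in Z_p[x]."""
    a, b = trim(g, p), trim(f, p)
    while b:                              # stop once the remainder is zero
        a, b = b, poly_mod(a, b, p)
    lead_inv = pow(a[0], -1, p)           # scale the last nonzero remainder to be monic
    return [(c * lead_inv) % p for c in a]


# Sample inputs in Z_7[x]: x^2 + 2x + 6 = (x - 2)(x - 3) and x^2 + 4x + 2 = (x - 2)(x - 1)
d1 = poly_gcd([1, 2, 6], [1, 4, 2], 7)    # expected common factor: x - 2 = x + 5
# x^2 + 2x + 6 and x^2 + 3x + 2 = (x + 1)(x + 2) share no root in Z_7
d2 = poly_gcd([1, 2, 6], [1, 3, 2], 7)
print(d1, d2)
```

The final scaling step mirrors the bracketed remark in the theorem: the last nonzero remainder is multiplied by the inverse of its leading coefficient to obtain the monic gcd.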

Definition 17.7   If f(x), g(x) ∈ F[x] and their gcd is 1, then f(x) and g(x) are called relatively prime.

The last results we need to construct our new finite fields provide the analog of a con-
                              struction we developed in Section 14.3.

THEOREM 17.10     Let s(x) ∈ F[x], s(x) ≠ 0. Define relation ℛ on F[x] by f(x) ℛ g(x) if f(x) − g(x) = t(x)s(x), for some t(x) ∈ F[x]; that is, s(x) divides f(x) − g(x). Then ℛ is an equivalence relation on F[x].
Proof: The verification of the reflexive, symmetric, and transitive properties of ℛ is left for the reader.

When the situation in Theorem 17.10 occurs, we say that f(x) is congruent to g(x) modulo s(x) and write f(x) ≡ g(x) (mod s(x)). The relation ℛ is referred to as congruence modulo s(x).
Let us examine the equivalence classes for one such relation.

EXAMPLE 17.9      Let s(x) = x^2 + x + 1 ∈ Z_2[x]. Then

a) [0] = [x^2 + x + 1] = {0, x^2 + x + 1, x^3 + x^2 + x, (x + 1)(x^2 + x + 1), ...}
       = {t(x)(x^2 + x + 1) | t(x) ∈ Z_2[x]}
b) [1] = {1, x^2 + x, x(x^2 + x + 1) + 1, (x + 1)(x^2 + x + 1) + 1, ...}
       = {t(x)(x^2 + x + 1) + 1 | t(x) ∈ Z_2[x]}
c) [x] = {x, x^2 + 1, x(x^2 + x + 1) + x, (x + 1)(x^2 + x + 1) + x, ...}
       = {t(x)(x^2 + x + 1) + x | t(x) ∈ Z_2[x]}
d) [x + 1] = {x + 1, x^2, x(x^2 + x + 1) + (x + 1), (x + 1)(x^2 + x + 1) + (x + 1), ...}
           = {t(x)(x^2 + x + 1) + (x + 1) | t(x) ∈ Z_2[x]}

Are these all of the equivalence classes? If f(x) ∈ Z_2[x], then by the division algorithm f(x) = q(x)s(x) + r(x), where r(x) = 0 or degree r(x) < degree s(x). Since f(x) − r(x) = q(x)s(x), it follows that f(x) ≡ r(x) (mod s(x)), so f(x) ∈ [r(x)]. Consequently, to determine all the equivalence classes, we consider the possibilities for r(x). Here r(x) = 0 or degree r(x) < 2, so r(x) = ax + b, where a, b ∈ Z_2. With only two choices for each of a, b, there are four possible choices for r(x): 0, 1, x, x + 1.

We now place a ring structure on the equivalence classes of Example 17.9. Recalling how this was accomplished in Chapter 14 for Z_n, we define addition by [f(x)] + [g(x)] = [f(x) + g(x)]. Since degree (f(x) + g(x)) ≤ max{degree f(x), degree g(x)}, we can find the equivalence class for [f(x) + g(x)] without too much trouble. Here, for example, [x] + [x + 1] = [x + (x + 1)] = [2x + 1] = [1] because 2 = 0 in Z_2.
In defining the multiplication of these equivalence classes, we run into a little more difficulty. For instance, what is [x][x] in Example 17.9? If, in general, we define [f(x)][g(x)] = [f(x)g(x)], it is possible that degree f(x)g(x) ≥ degree s(x), so we may not readily find [f(x)g(x)] in the list of equivalence classes. However, if degree f(x)g(x) ≥ degree s(x), then using the division algorithm, we can write f(x)g(x) = q(x)s(x) + r(x), where r(x) = 0 or degree r(x) < degree s(x). With f(x)g(x) = q(x)s(x) + r(x), it follows that f(x)g(x) ≡ r(x) (mod s(x)), and we define [f(x)g(x)] = [r(x)], where [r(x)] does occur in the list of equivalence classes.
From these observations we construct Tables 17.1 and 17.2 for the addition and multiplication, respectively, of {[0], [1], [x], [x + 1]}. (In these tables we write a for [a].)

Table 17.1

  +   |  0     1     x     x+1
------+------------------------
  0   |  0     1     x     x+1
  1   |  1     0     x+1   x
  x   |  x     x+1   0     1
  x+1 |  x+1   x     1     0

Table 17.2

  ·   |  0     1     x     x+1
------+------------------------
  0   |  0     0     0     0
  1   |  0     1     x     x+1
  x   |  0     x     x+1   1
  x+1 |  0     x+1   1     x

From the multiplication table (Table 17.2), we find that these equivalence classes form not only a ring but also a field, where [1]^{−1} = [1], [x]^{−1} = [x + 1], and [x + 1]^{−1} = [x]. This field of order 4 is denoted by Z_2[x]/(x^2 + x + 1), and we observe that it contains (an isomorphic copy of) the subfield Z_2. [In general, a subring (R, +, ·) of a field (F, +, ·) is called a subfield when (R, +, ·) is a field.] In addition, for the nonzero elements of this field we find that [x]^1 = [x], [x]^2 = [x + 1], [x]^3 = [1], so we have a cyclic group of order 3. But the nonzero elements of any field form a group under multiplication, and any group of order 3 is cyclic, so why bother with this observation? In general, the nonzero elements of any finite field form a cyclic group under multiplication. (A proof for this can be found in Chapter 12 of reference [10].)
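The arithmetic of this four-element field can be reproduced with a few lines of Python. The sketch below is an illustration under our own conventions, none of which come from the text: the pair (a, b) stands for the class [ax + b], and products are reduced by using x^2 ≡ x + 1 (mod x^2 + x + 1) over Z_2.

```python
from itertools import product

ELEMENTS = [(0, 0), (0, 1), (1, 0), (1, 1)]       # (a, b) stands for the class [ax + b]
NAMES = {(0, 0): "0", (0, 1): "1", (1, 0): "x", (1, 1): "x+1"}


def add(u, v):
    """[ax + b] + [cx + d] = [(a + c)x + (b + d)], with coefficients in Z_2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)


def mul(u, v):
    """Multiply two classes and reduce, using x^2 = x + 1 modulo x^2 + x + 1 over Z_2."""
    a, b = u
    c, d = v
    # (ax + b)(cx + d) = ac x^2 + (ad + bc)x + bd, and ac x^2 becomes ac x + ac
    return ((a * c + a * d + b * c) % 2, (a * c + b * d) % 2)


add_table = {(u, v): add(u, v) for u, v in product(ELEMENTS, repeat=2)}
mul_table = {(u, v): mul(u, v) for u, v in product(ELEMENTS, repeat=2)}

# Pair each nonzero class with its multiplicative inverse.
inverses = {
    NAMES[u]: NAMES[v]
    for u, v in product(ELEMENTS[1:], repeat=2)
    if mul(u, v) == (0, 1)
}

# Successive powers [x]^1, [x]^2, [x]^3.
powers_of_x = []
current = (0, 1)
for _ in range(3):
    current = mul(current, (1, 0))
    powers_of_x.append(NAMES[current])

print(inverses, powers_of_x)
```

The dictionaries add_table and mul_table hold the same entries as Tables 17.1 and 17.2, and the list of powers shows [x] generating all three nonzero classes.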

The preceding construction is summarized in the following theorem. An outline of the
                              proof is given in the Section Exercises.

THEOREM 17.11                 Let s(x) be a nonzero polynomial in F[x].

a) The equivalence classes of F[x] for the relation of congruence modulo s(x) form a
   commutative ring with unity under the closed binary operations

       [f(x)] + [g(x)] = [f(x) + g(x)],     [f(x)][g(x)] = [f(x)g(x)] = [r(x)],

   where r(x) is the remainder obtained upon dividing f(x)g(x) by s(x). This ring is
   denoted by F[x]/(s(x)).
b) If s(x) is irreducible in F[x], then F[x]/(s(x)) is a field.
c) If |F| = q and degree s(x) = n, then F[x]/(s(x)) contains q^n elements.
   Before we continue we wish to emphasize that for s(x) irreducible in F[x] the elements
in the field F[x]/(s(x)) are not simply polynomials (in x). But how can this be, considering
the presence of the symbol x in each of the elements [x] and [x + 1] in the field
Z_2[x]/(x^2 + x + 1) of Example 17.9? In order to make our point more apparent we consider
an infinite example that is somewhat familiar to us.

EXAMPLE 17.10

Here we let F = (R, +, ·), the field of real numbers, and we consider the irreducible
polynomial s(x) = x^2 + 1 in R[x]. From part (b) of Theorem 17.11 we learn that
R[x]/(s(x)) = R[x]/(x^2 + 1) is a field.
   For all f(x) ∈ R[x] it follows by the division algorithm that

       f(x) = q(x)(x^2 + 1) + r(x),     where r(x) = 0 or 0 ≤ deg r(x) ≤ 1.

Therefore,

       R[x]/(x^2 + 1) = {[a + bx] | a, b ∈ R},

where it can be shown that [a + bx] = [a] + [bx] = [a] + [b][x].
   Among the (infinitely many) elements of R[x]/(x^2 + 1) are the following:
    1) [1] = {1 + t(x)(x^2 + 1) | t(x) ∈ R[x]}, where we find the elements x^2 + 2 and
       3x^3 + 3x + 1 (from R[x]);
    2) [r] = {r + t(x)(x^2 + 1) | t(x) ∈ R[x]}, where r is any (fixed) real number;
    3) [-1] = {-1 + t(x)(x^2 + 1) | t(x) ∈ R[x]}, where we find the polynomial
       -1 + (1)(x^2 + 1) = x^2, so [x][x] = [x^2] = [-1]; and
    4) [√2 x - 3] = {(√2 x - 3) + t(x)(x^2 + 1) | t(x) ∈ R[x]}.
Now let us consider the field (C, +, ·) of complex numbers and the correspondence

       h: R[x]/(x^2 + 1) → C,

where h([a + bx]) = a + bi.
   For all [a + bx], [c + dx] ∈ R[x]/(x^2 + 1), we have [a + bx] = [c + dx] ⟺
(a + bx) - (c + dx) = t(x)(x^2 + 1), for some t(x) ∈ R[x] ⟺ (a - c) + (b - d)x =
t(x)(x^2 + 1). If t(x) is not the zero polynomial, then we have (a - c) + (b - d)x, a
polynomial of degree less than 2, equal to t(x)(x^2 + 1), a polynomial of degree at least 2.
Consequently, t(x) = 0, so a + bx = c + dx and a = c, b = d. This guarantees that the

correspondence given by h is actually a function. In fact, h is an isomorphism of fields.
(See Exercise 24 in the exercises at the end of this section.) To establish that h preserves
the operation of multiplication, for example, we observe that

       h([a + bx][c + dx]) = h([ac + adx + bcx + bdx^2])
                           = h([ac + (ad + bc)x] + [bd][x^2])
                           = h([ac + (ad + bc)x] + [bd][-1])
                           = h([(ac - bd) + (ad + bc)x])
                           = (ac - bd) + (ad + bc)i = (a + bi)(c + di)
                           = h([a + bx]) h([c + dx]).

Since R[x]/(x^2 + 1) is isomorphic to C, the correspondence h([x]) = i makes us think
of [x] as a number in R[x]/(x^2 + 1) and not as a polynomial in x (in R[x]). The number
[x] represents an equivalence class of polynomials in R[x], and this number [x] behaves
like the complex number i in the field (C, +, ·). We should also note that for each real
number r, h([r]) = r, and {[r] | r ∈ R} is a subfield of R[x]/(x^2 + 1), which is isomorphic
to the subfield R of C.
   Finally, if we identify the field R[x]/(x^2 + 1) with the field (C, +, ·), we can summarize
what has happened above as follows: We started with the irreducible polynomial s(x) =
x^2 + 1 in R[x], which had no root in the field (R, +, ·). We then enlarged (R, +, ·) to
(C, +, ·) and in C we found the root i (and the root -i) for s(x), which can now be
factored as (x + i)(x - i) in C[x].
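
The multiplicative part of the isomorphism h can be spot-checked numerically. The sketch below is only an illustration under our own conventions: the class [a + bx] is stored as the pair (a, b), the function names mul_mod and h are hypothetical, and small integer coefficients are used so that the floating-point comparison is exact.

```python
def mul_mod(p, q):
    """[a + bx][c + dx] in R[x]/(x^2 + 1), with classes stored as pairs (a, b)."""
    a, b = p
    c, d = q
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and [x^2] = [-1]
    return (a * c - b * d, a * d + b * c)


def h(p):
    """The correspondence h([a + bx]) = a + bi."""
    a, b = p
    return complex(a, b)


u, v = (2, 3), (-1, 4)                    # the classes [2 + 3x] and [-1 + 4x]
print(mul_mod(u, v))                      # class of the product
print(h(mul_mod(u, v)) == h(u) * h(v))    # h preserves this product
print(mul_mod((0, 1), (0, 1)))            # [x][x], the class of -1
```

A check on one pair of classes is of course no proof; it merely mirrors, for particular values of a, b, c, d, the chain of equalities displayed above.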

Since our major concern in the chapter is with finite fields, we now examine another
                example of a finite field that arises by virtue of Theorem 17.11.

EXAMPLE 17.11

In Z_3[x] the polynomial s(x) = x^2 + x + 2 is irreducible because s(0) = 2, s(1) = 1, and
s(2) = 2. Consequently, Z_3[x]/(s(x)) is a field containing all equivalence classes of the
form [ax + b], where a, b ∈ Z_3. These arise from the possible remainders when a
polynomial f(x) ∈ Z_3[x] is divided by s(x). The nine equivalence classes are [0], [1], [2], [x],
[x + 1], [x + 2], [2x], [2x + 1], and [2x + 2].
                    Instead of constructing a complete multiplication table, we examine four sample multi-
                plications and then make two observations.

a) [2x][x] = [2x^2] = [2x^2 + 0] = [2x^2 + (x^2 + x + 2)] = [3x^2 + x + 2] = [x + 2]
   because 3 = 0 in Z_3.
b) [x + 1][x + 2] = [x^2 + 3x + 2] = [x^2 + 2] = [x^2 + 2 + 2(x^2 + x + 2)] = [2x].
c) [2x + 2]^2 = [4x^2 + 8x + 4] = [x^2 + 2x + 1] = [(-x - 2) + 2x + 1] since x^2 ≡
   (-x - 2) (mod s(x)). Consequently, [2x + 2]^2 = [x - 1] = [x + 2].
d) Often we write the equivalence classes without brackets and concentrate on the
   coefficients of the powers of x. For example, 11 is written for [x + 1] and 21 represents
   [2x + 1]. Consequently, (21) · (12) = [2x + 1][x + 2] = [2x^2 + 5x + 2] =
   [2x^2 + 2x + 2] = [2(-x - 2) + 2x + 2] = [-4 + 2] = [-2] = [1], so (21)^-1 = (12).
e) We also observe that

       [x]^1 = [x]          [x]^3 = [2x + 2]     [x]^5 = [2x]        [x]^7 = [x + 1]
       [x]^2 = [2x + 1]     [x]^4 = [2]          [x]^6 = [x + 2]     [x]^8 = [1]

   Therefore the nonzero elements of Z_3[x]/(s(x)) form a cyclic group under
   multiplication.
f) Finally, when we consider the equivalence classes [0], [1], [2], we realize that they
   provide us with a subfield of Z_3[x]/(s(x)), a subfield we identify with the field
   (Z_3, +, ·).
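
The sample products and the list of powers in this example can be reproduced with a few lines of Python. This is a sketch under our own assumptions: the class [ax + b] is stored as the pair (a, b), matching the bracket-free notation "ab" of part (d), and the names mul9 and powers_of are hypothetical. The only fact used is x^2 ≡ -x - 2 = 2x + 1 (mod s(x)) in Z_3[x].

```python
def mul9(u, v):
    """Product of [ax + b] and [cx + d] in Z_3[x]/(x^2 + x + 2)."""
    a, b = u
    c, d = v
    ac = a * c
    # (ax + b)(cx + d) = ac*x^2 + (ad + bc)x + bd, with x^2 replaced by 2x + 1
    return ((a * d + b * c + 2 * ac) % 3, (b * d + ac) % 3)


def powers_of(g):
    """List g, g^2, g^3, ... up to and including the first power equal to [1]."""
    if g == (0, 0):
        raise ValueError("the zero class has no multiplicative order")
    out, cur = [], g
    while True:
        out.append(cur)
        if cur == (0, 1):
            return out
        cur = mul9(cur, g)


print(mul9((2, 0), (1, 0)))    # part (a): [2x][x]
print(mul9((1, 1), (1, 2)))    # part (b): [x + 1][x + 2]
print(mul9((2, 2), (2, 2)))    # part (c): [2x + 2]^2
print(mul9((2, 1), (1, 2)))    # part (d): (21)(12)
print(powers_of((1, 0)))       # part (e): the successive powers of [x]
```

Because the list returned for [x] has eight distinct entries, every nonzero class appears as a power of [x], which is the cyclic-group observation of part (e).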

In Example 17.9 (and in the discussion that follows it) and in Example 17.11, we
constructed finite fields of orders 4 (= 2^2) and 9 (= 3^2), respectively. Now we shall close this
section as we investigate other possibilities for the order of a finite field. To accomplish this
we need the following idea.

Definition 17.8         Let (R, +, ·) be a ring. If there is a least positive integer n such that nr = z (the zero of
                              R) for all r ∈ R, then we say that R has characteristic n and write char(R) = n. When no
                              such integer exists, R is said to have characteristic 0.

EXAMPLE 17.12

a) The ring (Z_3, +, ·) has characteristic 3; (Z_4, +, ·) has characteristic 4; in general,
   (Z_n, +, ·) has characteristic n.
b) The rings (Z, +, ·) and (Q, +, ·) both have characteristic 0.
c) A ring can be infinite and still have positive characteristic. For example, Z_3[x] is an
   infinite ring but it has characteristic 3.
d) The ring in Example 17.9 has characteristic 2. In Example 17.11 the characteristic of
   the ring is 3. Unlike the examples in part (a), the order of a finite ring can be different
   from its characteristic.
      Examples 17.9 and 17.11, however, are more than just rings. They are fields with
   prime characteristic. Could this property be true for all finite fields?
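
For a finite ring, Definition 17.8 can be applied by brute force: keep adding each element to its running multiple until every multiple is the zero element. The Python sketch below is an illustration only; the helper names characteristic, z_mod, and pairs_mod are our own, and the approach does not apply to infinite rings such as Z_3[x] or Z.

```python
def characteristic(elements, add, zero):
    """Least n >= 1 with n*r = zero for all r; intended for finite rings only."""
    elements = list(elements)
    n = 1
    multiples = list(elements)            # multiples[i] holds n * elements[i]
    while any(m != zero for m in multiples):
        multiples = [add(m, r) for m, r in zip(multiples, elements)]
        n += 1
    return n


def z_mod(n):
    """Elements, addition, and zero of the ring Z_n."""
    return range(n), (lambda a, b: (a + b) % n), 0


def pairs_mod(p):
    """Additive structure of the classes [c0 + c1*x] with c0, c1 in Z_p."""
    elems = [(c0, c1) for c0 in range(p) for c1 in range(p)]
    add = lambda a, b: ((a[0] + b[0]) % p, (a[1] + b[1]) % p)
    return elems, add, (0, 0)


print(characteristic(*z_mod(3)), characteristic(*z_mod(4)))   # part (a)
print(characteristic(*pairs_mod(2)))    # the field of order 4 in Example 17.9
print(characteristic(*pairs_mod(3)))    # the field of order 9 in Example 17.11
```

Only the addition of the ring is needed here, since nr is a repeated sum; this is why the fields of orders 4 and 9 are described by their coefficient pairs alone.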

THEOREM 17.12                 Let (F, +, ·) be a field. If char(F) > 0, then char(F) must be prime.
Proof: In this proof we write the unity of F as u so that it is distinct from the positive integer 1.
Let char(F) = n > 0. If n is not prime, we write n = mk, where m, k ∈ Z^+ and 1 < m < n,
1 < k < n. By the definition of characteristic, nu = z, the zero of F. Hence (mk)u = z. But

       (mk)u = (u + u + ··· + u) = (u + u + ··· + u)(u + u + ··· + u) = (mu)(ku).
                 mk summands          m summands        k summands

With F a field, (mu)(ku) = z ⟹ (mu) = z or (ku) = z. Assume without loss of generality
that ku = z. Then for each r ∈ F, kr = k(ur) = (ku)r = zr = z, contradicting the choice
of n as the characteristic of F, because 0 < k < n. Consequently, char(F) is prime.

(The proof of Theorem 17.12 actually requires that F only be an integral domain.)

If F is a finite field and m = |F|, then ma = z for all a ∈ F because (F, +) is an additive
group of order m. (See Exercise 8 of Section 16.3.) Consequently, F has positive
characteristic and by Theorem 17.12 this characteristic is prime. This leads us to the following
theorem.

THEOREM 17.13                    A finite field F has order p^t, where p is a prime and t ∈ Z^+.
Proof: Since F is a finite field, let char(F) = p, a prime, and let u denote the unity and z the
zero element. Then S_0 = {u, 2u, 3u, ..., pu = z} is a set of p distinct elements in F. If
not, mu = nu for 1 ≤ m < n ≤ p and (n - m)u = z, with 0 < n - m < p. So for all x ∈ F
we now find that (n - m)x = (n - m)(ux) = [(n - m)u]x = zx = z, and this contradicts
char(F) = p. If F = S_0, then |F| = p^1 and the result follows. If not, let a ∈ F - S_0. Then
S_1 = {ma + nu | 0 < m, n ≤ p} is a subset of F with |S_1| ≤ p^2. If |S_1| < p^2, then
m_1 a + n_1 u = m_2 a + n_2 u, with 0 < m_1, m_2, n_1, n_2 ≤ p and at least one of
m_1 - m_2, n_2 - n_1 ≠ 0.
   Should m_1 - m_2 = 0, then (m_1 - m_2)a = z = (n_2 - n_1)u, with 0 < |n_2 - n_1| < p.
Consequently, for all x ∈ F, |n_2 - n_1|x = |n_2 - n_1|(ux) = (|n_2 - n_1|u)x = zx = z with
0 < |n_2 - n_1| < p = char(F), another contradiction. If n_1 - n_2 = 0, then (m_1 - m_2)a = z
with 0 < |m_1 - m_2| < p. Since F is a field and a ≠ z we know that a^-1 ∈ F, so
|m_1 - m_2|u = |m_1 - m_2|aa^-1 = za^-1 = z with 0 < |m_1 - m_2| < p, yet another
contradiction. Hence neither m_1 - m_2 nor n_1 - n_2 is 0. Therefore,
(m_1 - m_2)a = (n_2 - n_1)u ≠ z. Choose k ∈ Z^+ such that 0 < k < p and
k(m_1 - m_2) ≡ 1 (mod p). Then a = k(m_1 - m_2)a = k(n_2 - n_1)u, and a ∈ S_0, one more
contradiction. Hence |S_1| = p^2, and if F = S_1 the theorem is proved.
   If not, continue this process with an element b ∈ F - S_1. Then S_2 =
{ℓb + ma + nu | 0 < ℓ, m, n ≤ p} will have order p^3. (Prove this.) Since F is finite, we
reach a point where F = S_{t-1} for some t ∈ Z^+, and |F| = |S_{t-1}| = p^t.

As a result of this theorem there can be no finite fields with orders such as 6, 10, 12, 14,
15, .... In addition, for each prime p and each t ∈ Z^+, there is really only one field of order
p^t. Any two finite fields of the same order are isomorphic. These fields were discovered
by the French mathematician Evariste Galois (1811-1832) in his work on the nonexistence
of formulas for solving general polynomial equations of degree ≥ 5 over Q. As a result, a
finite field of order p^t is denoted by GF(p^t), where the letters GF stand for Galois field.
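
Theorem 17.13 turns the question "can there be a field with n elements?" into a test for prime powers. As a small illustration (a sketch with a hypothetical helper name, prime_power, using plain trial division), the possible orders up to 16 can be listed as follows.

```python
def prime_power(n):
    """Return (p, t) with n == p**t for a prime p and t >= 1, otherwise None."""
    if n < 2:
        return None
    p = 2
    while p * p <= n and n % p:
        p += 1                    # stops at the smallest prime divisor, if any
    if n % p:                     # no divisor up to sqrt(n): n itself is prime
        return (n, 1)
    t = 0
    while n % p == 0:             # strip every factor p from n
        n //= p
        t += 1
    return (p, t) if n == 1 else None


orders = [n for n in range(2, 17) if prime_power(n) is not None]
print(orders)
print([n for n in range(2, 17) if n not in orders])
```

The second list printed contains exactly the orders 6, 10, 12, 14, 15 that the theorem rules out in this range.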

EXERCISES 17.2

1. Determine whether or not each of the following polynomials is irreducible over the
given fields. If it is reducible, provide a factorization into irreducible factors.
    a) x^2 + 3x - 1 over Q, R, C
    b) x^4 - 2 over Q, R, C
    c) x^2 + x + 1 over Z_3, Z_5, Z_7
    d) x^4 + x^3 + 1 over Z_2
    e) x^3 + 3x^2 - x + 1 over Z_5

2. Give an example of a polynomial f(x) ∈ R[x] where f(x) has degree 6, is reducible,
but has no real roots.

3. Determine all polynomials f(x) ∈ Z_2[x] such that 1 ≤ degree f(x) ≤ 3 and f(x) is
irreducible (over Z_2).

4. Let f(x) = (2x^2 + 1)(5x^3 - 5x + 3)(4x - 3) ∈ Z_7[x]. Write f(x) as the product of a
unit and three monic polynomials.

5. How many monic polynomials in Z_7[x] have degree 5?

6. Prove Theorem 17.7.

7. An outline for a proof of Theorem 17.8 follows.
    a) Let S = {s(x)f(x) + t(x)g(x) | s(x), t(x) ∈ F[x]}. Select an element m(x) of
    minimum degree in S. (Recall that the zero polynomial has no degree, so it is not
    selected.) Can we guarantee that m(x) is monic?
    b) Show that if h(x) ∈ F[x] and h(x) divides both f(x) and g(x), then h(x) divides
    m(x).
    c) Show that m(x) divides f(x). If not, use the division algorithm and write
    f(x) = q(x)m(x) + r(x), where r(x) ≠ 0 and degree r(x) < degree m(x). Then show
    that r(x) ∈ S and obtain a contradiction.
    d) Repeat the argument in part (c) to show that m(x) divides g(x).

8. Prove Theorems 17.9 and 17.10.

9. Use the Euclidean algorithm for polynomials to find the gcd of each pair of
polynomials, over the designated field F. Then write the gcd as s(x)f(x) + t(x)g(x), where
s(x), t(x) ∈ F[x].
    a) f(x) = x^2 + x - 2, g(x) = x^5 - x^4 + x^3 + x^2 - x - 1 in Q[x]
    b) f(x) = x^4 + x^3 + 1, g(x) = x^2 + x + 1 in Z_2[x]
    c) f(x) = x^4 + 2x^2 + 2x + 2, g(x) = 2x^3 + 2x^2 + x + 1 in Z_3[x]

10. If F is any field, let f(x), g(x) ∈ F[x]. If f(x), g(x) are relatively prime, prove that
there is no element a ∈ F with f(a) = 0 and g(a) = 0.

11. Let f(x), g(x) ∈ R[x] with f(x) = x^3 + 2x^2 + ax - b, g(x) = x^3 + x^2 - bx + a.
Determine values for a, b so that the gcd of f(x), g(x) is a polynomial of degree 2.

12. For Example 17.9, determine which equivalence class contains each of the following:
    a) x^4 + x^3 + x + 1
    b) x^3 + x^2 + 1
    c) x^4 + x^3 + x^2 + 1

13. An outline for the proof of Theorem 17.11 follows.
    a) Prove that the operations defined in part (a) of Theorem 17.11 are well-defined by
    showing that if f(x) ≡ f_1(x) (mod s(x)) and g(x) ≡ g_1(x) (mod s(x)), then
    f(x) + g(x) ≡ f_1(x) + g_1(x) (mod s(x)) and f(x)g(x) ≡ f_1(x)g_1(x) (mod s(x)).
    b) Verify the ring properties for the equivalence classes in F[x]/(s(x)).
    c) Let f(x) ∈ F[x], with f(x) ≠ 0 and degree f(x) < degree s(x). If s(x) is irreducible
    in F[x], why does it follow that 1 is the gcd of f(x) and s(x)?
    d) Use part (c) to prove that if s(x) is irreducible in F[x], then F[x]/(s(x)) is a field.
    e) If |F| = q and degree s(x) = n, determine the order of F[x]/(s(x)).

14. a) Show that s(x) = x^2 + 1 is reducible in Z_2[x].
    b) Find the equivalence classes for the ring Z_2[x]/(s(x)).
    c) Is Z_2[x]/(s(x)) an integral domain?

15. For the field in Example 17.11, find each of the following:
    a) [x + 2][2x + 2] + [x + 1]
    b) [2x + 1]^2 [x + 2]
    c) (22)^-1 = [2x + 2]^-1

16. Let s(x) = x^4 + x^3 + 1 ∈ Z_2[x].
    a) Prove that s(x) is irreducible.
    b) What is the order of the field Z_2[x]/(s(x))?
    c) Find [x^2 + x + 1]^-1 in Z_2[x]/(s(x)). (Hint: Find a, b, c, d ∈ Z_2 so that
    [x^2 + x + 1][ax^3 + bx^2 + cx + d] = [1].)
    d) Determine [x^3 + x + 1][x^2 + 1] in Z_2[x]/(s(x)).

17. For p a prime, let s(x) be irreducible of degree n in Z_p[x].
    a) How many elements are there in the field Z_p[x]/(s(x))?
    b) How many elements in Z_p[x]/(s(x)) generate the multiplicative group of nonzero
    elements of this field?

18. Give the characteristic for each of the following rings:
    a) Z_11          b) Z_11[x]          c) Q[x]
    d) Z[√5] = {a + b√5 | a, b ∈ Z}, under the binary operations of ordinary addition
    and multiplication of real numbers.

19. In each of the following rings, the operations are componentwise addition and
multiplication, as in Exercise 18 of Section 14.2. Determine the characteristic in each case.
    a) Z_2 × Z_3          b) Z_3 × Z_4          c) Z_4 × Z_6
    d) Z_m × Z_n, for m, n ∈ Z^+, m, n ≥ 2
    e) Z_3 × Z

20. For Theorem 17.13, prove that |S_2| = p^3.

21. Find the orders n for all fields GF(n), where 100 ≤ n ≤ 150.

22. Construct a finite field of 25 elements.

23. Construct a finite field of 27 elements.

24. a) Prove that the function h in Example 17.10 is one-to-one and onto and preserves
    the operation of addition.
    b) Let (F, +, ·) and (K, ⊕, ⊙) be two fields. If g: F → K is a ring isomorphism and a
    is a nonzero element of F (that is, a is a unit in F), prove that g(a^-1) = [g(a)]^-1.
    (Consequently, this function g establishes an isomorphism of fields. In particular,
    the function h of Example 17.10 is such a function.)

25. a) Let Q[√2] = {a + b√2 | a, b ∈ Q}. Prove that (Q[√2], +, ·) is a subring of the field
    (R, +, ·). (Here the binary operations in R and Q[√2] are those of ordinary addition
    and multiplication of real numbers.)
    b) Prove that Q[√2] is a field and that Q[x]/(x^2 - 2) is isomorphic to Q[√2].

26. Let p be a prime. (a) How many monic quadratic (degree 2) polynomials x^2 + bx + c
in Z_p[x] can we factor into linear factors in Z_p[x]? (For example, if p = 5, then the
polynomial x^2 + 2x + 2 in Z_5[x] would be one of the quadratic polynomials for which we
should account, under these conditions.) (b) How many quadratic polynomials
ax^2 + bx + c in Z_p[x] can we factor into linear factors in Z_p[x]? (c) How many monic
quadratic polynomials x^2 + bx + c in Z_p[x] are irreducible over Z_p? (d) How many
quadratic polynomials ax^2 + bx + c in Z_p[x] are irreducible over Z_p?

17.3   Latin Squares
                   Our first application for this chapter deals with the structure called a Latin square. Such
                   configurations arise in the study of combinatorial designs and play a role in statistics — in
                   the design of experiments. We introduce the structure in the following example.

EXAMPLE 17.13

A petroleum corporation is interested in testing four types of gasoline additives to determine
                   their effects on mileage. To do so, a research team designs an experiment wherein four
                   different automobiles, denoted A, B, C, and D, are run on a fixed track in a laboratory. Each
                   run uses the same prescribed amount of fuel with one of the additives present. To see how
                   each additive affects each type of auto, the team follows the schedule in Table 17.3, where
                   the additives are numbered 1, 2, 3, and 4. This schedule provides a way to test each additive
                   thoroughly in each type of auto. If one additive produces the best results in all four types,
                   the experiment will reveal its superior capability.
                         The same corporation is also interested in testing four other additives developed for
                   cleaning engines. A similar schedule for these tests is shown in Table 17.4, where these
                   engine-cleaning additives are also denoted as 1, 2, 3, and 4.

Table 17.3                                     Table 17.4

               Day                                            Day
  Auto | Mon   Tues   Wed   Thurs                Auto | Mon   Tues   Wed   Thurs
  -----+--------------------------               -----+--------------------------
   A   |  1     2      3      4                   A   |  1     2      3      4
   B   |  2     1      4      3                   B   |  3     4      1      2
   C   |  3     4      1      2                   C   |  4     3      2      1
   D   |  4     3      2      1                   D   |  2     1      4      3

Furthermore, the research team is interested in the combined effect of both types of
                   additives. It requires 16 days to test the 16 possible pairs of additives (one for improved
                   mileage, the other for cleaning engines) in every automobile. If the results are needed in
                   four days, the research team must design the schedules so that every pair is tested once by
                   some auto. There are 16 ordered pairs in {1, 2, 3, 4} x {1, 2, 3, 4}, so this can be done in
                   the allotted time if the schedules in Tables 17.3 and 17.4 are superimposed to obtain the
                   schedule in Table 17.5. Here, for example, the entry (4, 3) indicates that on Tuesday, auto
                   C is used to test the combined effect of the fourth additive for improved mileage and the
                   third additive for maintaining a clean engine.

Table 17.5

                   Day
  Auto |  Mon      Tues     Wed      Thurs
  -----+------------------------------------
   A   | (1, 1)   (2, 2)   (3, 3)   (4, 4)
   B   | (2, 3)   (1, 4)   (4, 1)   (3, 2)
   C   | (3, 4)   (4, 3)   (1, 2)   (2, 1)
   D   | (4, 2)   (3, 1)   (2, 4)   (1, 3)

What has happened here leads us to the following concepts.

Definition 17.9             An n × n Latin square is a square array of symbols, usually 1, 2, 3, ..., n, where each
                                   symbol appears exactly once in each row and each column of the array.

EXAMPLE 17.14

a) Tables 17.3 and 17.4 are examples of 4 × 4 Latin squares.
b) For all n > 2, we can obtain an n × n Latin square from the table of the group (Z_n, +)
   if we replace the occurrences of 0 by the value of n.
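
The construction in part (b) is easy to carry out by machine. In the Python sketch below (the function names latin_square_from_Zn and is_latin_square are our own, introduced only for illustration), the entry in row i and column j is (i + j) mod n, with each 0 replaced by n, and the result is checked against Definition 17.9.

```python
def latin_square_from_Zn(n):
    """Addition table of (Z_n, +) with every occurrence of 0 replaced by n."""
    return [[((i + j) % n) or n for j in range(n)] for i in range(n)]


def is_latin_square(L):
    """Check that each of 1, ..., n occurs exactly once in every row and column."""
    n = len(L)
    symbols = set(range(1, n + 1))
    if not all(len(row) == n and set(row) == symbols for row in L):
        return False
    return all({L[i][j] for i in range(n)} == symbols for j in range(n))


for row in latin_square_from_Zn(4):
    print(row)
print(is_latin_square(latin_square_from_Zn(4)))
```

Rows and columns are indexed from 0 here, so the first row of the array corresponds to the row of the group table headed by 0.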

From the two Latin squares in Example 17.13 we were able to produce all of the ordered
                                   pairs in S × S, for S = {1, 2, 3, 4}. We now question whether or not we can do this for
                                   n × n Latin squares in general.

Definition 17.10            Let L_1 = (a_ij), L_2 = (b_ij) be two n × n Latin squares, where 1 ≤ i, j ≤ n and each a_ij,
                                   b_ij ∈ {1, 2, 3, ..., n}. If the n^2 ordered pairs (a_ij, b_ij), 1 ≤ i, j ≤ n, are distinct, then L_1,
                                   L_2 are called a pair of orthogonal Latin squares.
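
Definition 17.10 translates directly into a test: superimpose the two squares and count the distinct ordered pairs. The following Python sketch (with hypothetical names superimpose and are_orthogonal, and with the two schedules of Example 17.13 entered by hand as lists of rows) illustrates the idea.

```python
def superimpose(L1, L2):
    """Array of ordered pairs (a_ij, b_ij) obtained by overlaying L1 and L2."""
    return [list(zip(r1, r2)) for r1, r2 in zip(L1, L2)]


def are_orthogonal(L1, L2):
    """True when the n^2 ordered pairs (a_ij, b_ij) are all distinct."""
    n = len(L1)
    pairs = {p for row in superimpose(L1, L2) for p in row}
    return len(pairs) == n * n


TABLE_17_3 = [[1, 2, 3, 4], [2, 1, 4, 3], [3, 4, 1, 2], [4, 3, 2, 1]]
TABLE_17_4 = [[1, 2, 3, 4], [3, 4, 1, 2], [4, 3, 2, 1], [2, 1, 4, 3]]

for row in superimpose(TABLE_17_3, TABLE_17_4):
    print(row)
print(are_orthogonal(TABLE_17_3, TABLE_17_4))
print(are_orthogonal(TABLE_17_3, TABLE_17_3))   # a square paired with itself
```

A Latin square is never orthogonal to itself when n ≥ 2, because superimposing it on itself produces only the n pairs (1, 1), (2, 2), ..., (n, n).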

EXAMPLE 17.15

a) There is no pair of 2 × 2 orthogonal Latin squares because the only possibilities are

        L_1:  1  2     and     L_2:  2  1
              2  1                   1  2

b) In the 3 × 3 case we find the orthogonal pair

        L_1:  1  2  3     and     L_2:  1  2  3
              2  3  1                   3  1  2
              3  1  2                   2  3  1

c) The two 4 × 4 Latin squares in Example 17.13 form an orthogonal pair. The 4 × 4
   Latin square shown in Table 17.6 is orthogonal to each of the Latin squares in that
   example.

Table 17.6

        1  2  3  4
        4  3  2  1
        2  1  4  3
        3  4  1  2
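
For very small n the claims in this example can be confirmed by exhaustive search. The sketch below is a brute-force illustration only (the names all_latin_squares and are_orthogonal are our own): it generates every n × n Latin square by choosing each row as a permutation of 1, ..., n, which is feasible for n = 2 and n = 3 but grows far too quickly for larger n.

```python
from itertools import combinations, permutations, product


def all_latin_squares(n):
    """Every n x n Latin square on 1..n, found by exhaustive search (small n only)."""
    perms = list(permutations(range(1, n + 1)))
    return [rows for rows in product(perms, repeat=n)
            if all(len({row[j] for row in rows}) == n for j in range(n))]


def are_orthogonal(L1, L2):
    """True when superimposing L1 and L2 yields n^2 distinct ordered pairs."""
    n = len(L1)
    return len({(L1[i][j], L2[i][j]) for i in range(n) for j in range(n)}) == n * n


def orthogonal_pairs(n):
    return [(A, B) for A, B in combinations(all_latin_squares(n), 2)
            if are_orthogonal(A, B)]


print(len(all_latin_squares(2)), len(orthogonal_pairs(2)))   # part (a)
L1 = ((1, 2, 3), (2, 3, 1), (3, 1, 2))
L2 = ((1, 2, 3), (3, 1, 2), (2, 3, 1))
print(are_orthogonal(L1, L2))                                # part (b)

TABLE_17_3 = [[1, 2, 3, 4], [2, 1, 4, 3], [3, 4, 1, 2], [4, 3, 2, 1]]
TABLE_17_4 = [[1, 2, 3, 4], [3, 4, 1, 2], [4, 3, 2, 1], [2, 1, 4, 3]]
TABLE_17_6 = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 1, 4, 3], [3, 4, 1, 2]]
print(are_orthogonal(TABLE_17_6, TABLE_17_3),
      are_orthogonal(TABLE_17_6, TABLE_17_4))                # part (c)
```

The search confirms that the two 2 × 2 Latin squares do not form an orthogonal pair, while the pair in part (b) and the three 4 × 4 squares of part (c) pass the test of Definition 17.10.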

We could continue listing some larger Latin squares, but we've seen enough of them at
this point to ask the following questions:

1) Is there any n > 2 for which there is no pair of orthogonal n × n Latin squares? If
   so, what is the smallest such n?
2) For n > 1, what can we say about the number of n × n Latin squares that can be
   constructed so that each pair of them is orthogonal?
3) Is there a method to assist us in constructing a pair of orthogonal n × n Latin squares
   for certain values of n > 2?

Before we can examine these questions, we need to standardize some of our results.

Definition 17.11            If L is an n × n Latin square, then L is said to be in standard form if its first row is
                                   1  2  3  ···  n.

Except for the Latin square L_2 in Example 17.15(a), all the Latin squares we've seen in
                   this section are in standard form. If a Latin square is not in standard form, it can be put in
                   that form by interchanging some of the symbols.

The 5 X 5 Latin square shown in (a) is not in standard form. If, however, we replace each
   EXAMPLE 17.16
                   occurrence of 4 with 1, each occurrence of 5 with 4, and each occurrence of | with 5, then
                   the result is the (standard) 5 X 5 Latin square shown in (b).

                   4  2  3  5  1              1  2  3  4  5
                   1  3  5  4  2              5  3  4  1  2
                   3  4  2  1  5              3  1  2  5  4
                   2  5  1  3  4              2  4  5  3  1
                   5  1  4  2  3              4  5  1  2  3
                        (a)                        (b)
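The relabeling in Example 17.16 can be sketched in a few lines of Python. This is only an illustration: the function name `standardize` and the list-of-rows representation are assumptions made here, not notation from the text. The idea is to read the symbol substitution off the first row and apply it to every entry.

```python
def standardize(square):
    """Relabel the symbols of a Latin square so its first row becomes 1, 2, ..., n."""
    # The symbol in column c of the first row is renamed to c (counting from 1).
    relabel = {old: new for new, old in enumerate(square[0], start=1)}
    return [[relabel[x] for x in row] for row in square]


# Square (a) of Example 17.16, stored row by row.
square_a = [
    [4, 2, 3, 5, 1],
    [1, 3, 5, 4, 2],
    [3, 4, 2, 1, 5],
    [2, 5, 1, 3, 4],
    [5, 1, 4, 2, 3],
]

square_b = standardize(square_a)
for row in square_b:
    print(*row)
```

For square (a) the substitution found this way sends 4 to 1, 5 to 4, and 1 to 5, and leaves 2 and 3 fixed, which is the replacement described in the example.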

It is often convenient to deal with Latin squares in standard form. But will this affect our
                   results on orthogonal pairs in any way?

THEOREM 17.14      Let $L_1$, $L_2$ be an orthogonal pair of $n \times n$ Latin squares. If $L_1$, $L_2$ are standardized as $L_1^*$, $L_2^*$, then $L_1^*$, $L_2^*$ are orthogonal.
                   Proof: The proof of this result is left for the reader.

These ideas are needed for the main results of this section.

THEOREM 17.15      If $n \in \mathbf{Z}^+$, $n > 2$, then the largest possible number of $n \times n$ Latin squares that are orthogonal in pairs is $n - 1$.
                   Proof: Let $L_1, L_2, \ldots, L_k$ be $k$ distinct $n \times n$ Latin squares that are in standard form and orthogonal in pairs. We write $a_{ij}^{(m)}$ to denote the entry in the $i$th row and $j$th column of $L_m$, where $1 \le i, j \le n$, $1 \le m \le k$. Since these Latin squares are in standard form, we have $a_{11}^{(m)} = 1$, $a_{12}^{(m)} = 2$, ..., and $a_{1n}^{(m)} = n$ for all $1 \le m \le k$. Now consider $a_{21}^{(m)}$, for all $1 \le m \le k$. These entries in the second row and first column are below $a_{11}^{(m)} = 1$. Thus $a_{21}^{(m)} \ne 1$, for all $1 \le m \le k$, or the configuration is not a Latin square. Further, if there exists $1 \le \ell < m \le k$ with $a_{21}^{(\ell)} = a_{21}^{(m)}$, then the pair $L_\ell$, $L_m$ cannot be an orthogonal pair. (Why not?) Consequently, there are at best $n - 1$ choices for the $a_{21}$ entries in any of our $n \times n$ Latin squares, and the result follows from this observation.
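For very small $n$ the bound in Theorem 17.15 can be checked by exhaustive search. The following Python sketch is an illustration only (the function names are chosen here and are not from the text). It lists every Latin square in standard form and then looks for the largest family that is orthogonal in pairs. By Theorem 17.14, restricting the search to standard form loses nothing. The search is practical only for tiny cases such as $n = 3$ and $n = 4$.

```python
from itertools import permutations


def standard_latin_squares(n):
    """All n x n Latin squares whose first row is 1, 2, ..., n."""
    first = tuple(range(1, n + 1))
    rows = list(permutations(first))
    found = []

    def extend(partial):
        if len(partial) == n:
            found.append(tuple(partial))
            return
        for row in rows:
            # A new row may not repeat a symbol in any column.
            if all(row[c] != prev[c] for prev in partial for c in range(n)):
                extend(partial + [row])

    extend([first])
    return found


def orthogonal(a, b):
    """True when superimposing a and b yields all n^2 ordered pairs."""
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)}) == n * n


def largest_orthogonal_family(squares):
    """Size of the largest set of squares that are orthogonal in pairs."""
    best = 0

    def grow(size, candidates):
        nonlocal best
        best = max(best, size)
        for idx, s in enumerate(candidates):
            # Keep only later candidates that are orthogonal to s as well.
            grow(size + 1, [t for t in candidates[idx + 1:] if orthogonal(s, t)])

    grow(0, squares)
    return best


for n in (3, 4):
    print(n, largest_orthogonal_family(standard_latin_squares(n)))
```

For both values the search returns $n - 1$, so the upper bound is attained in these two cases.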

This theorem places an upper bound on the number of n X n Latin squares that are
                   orthogonal in pairs. We shall find that for certain values of n, this upper bound can be
                   attained. In addition, our next theorem provides a method for constructing these Latin
                   squares, though initially not in standard form. The construction uses the structure of a finite
                   field. Before proving this theorem for the general situation, however, we shall examine one
                   special case.

EXAMPLE 17.17      Let $F = \{f_i \mid 1 \le i \le 5\} = \mathbf{Z}_5$ with $f_1 = 1$, $f_2 = 2$, $f_3 = 3$, $f_4 = 4$, and $f_5 = 5$, the zero of $\mathbf{Z}_5$.

For $1 \le k \le 4$, let $L_k$ be the 5 × 5 array $(a_{ij}^{(k)})$, where $1 \le i, j \le 5$ and

$$a_{ij}^{(k)} = f_k f_i + f_j.$$

When $k = 1$, we construct $L_1 = (a_{ij}^{(1)})$ as follows. Here $a_{ij}^{(1)} = f_1 f_i + f_j = f_i + f_j$, for $1 \le i, j \le 5$. With $i = 1$, the first row of $L_1$ is calculated as follows:

$$a_{11}^{(1)} = f_1 + f_1 = 2 \qquad a_{12}^{(1)} = f_1 + f_2 = 3 \qquad a_{13}^{(1)} = f_1 + f_3 = 4$$
$$a_{14}^{(1)} = f_1 + f_4 = 5 \qquad a_{15}^{(1)} = f_1 + f_5 = 1$$

The entries in the second row of $L_1$ are computed when $i = 2$. Here we find

$$a_{21}^{(1)} = f_2 + f_1 = 3 \qquad a_{22}^{(1)} = f_2 + f_2 = 4 \qquad a_{23}^{(1)} = f_2 + f_3 = 5$$
$$a_{24}^{(1)} = f_2 + f_4 = 1 \qquad a_{25}^{(1)} = f_2 + f_5 = 2$$

Continuing these calculations, we obtain the Latin square $L_1$ as

                           2   3   4   5   1
                           3   4   5   1   2
                           4   5   1   2   3
                           5   1   2   3   4
                           1   2   3   4   5

For $k = 2$, the entries of $L_2$ are given by the formula $a_{ij}^{(2)} = f_2 f_i + f_j = 2f_i + f_j$. To obtain the first row of $L_2$, we set $i$ equal to 1 and compute

$$a_{11}^{(2)} = 2f_1 + f_1 = 3 \qquad a_{12}^{(2)} = 2f_1 + f_2 = 4 \qquad a_{13}^{(2)} = 2f_1 + f_3 = 5$$
$$a_{14}^{(2)} = 2f_1 + f_4 = 1 \qquad a_{15}^{(2)} = 2f_1 + f_5 = 2$$

When $i$ is set equal to 2, the entries in the second row of $L_2$ are calculated as follows:

$$a_{21}^{(2)} = 2f_2 + f_1 = 5 \qquad a_{22}^{(2)} = 2f_2 + f_2 = 1 \qquad a_{23}^{(2)} = 2f_2 + f_3 = 2$$
$$a_{24}^{(2)} = 2f_2 + f_4 = 3 \qquad a_{25}^{(2)} = 2f_2 + f_5 = 4$$

Similar calculations for $i = 3$, 4, and 5 result in the Latin square $L_2$ given by

                           3   4   5   1   2
                           5   1   2   3   4
                           2   3   4   5   1
                           4   5   1   2   3
                           1   2   3   4   5

It is straightforward to check that the two Latin squares $L_1$ and $L_2$ are orthogonal. In Exercise 5 (at the end of this section) the reader will be asked to calculate $L_3$ and $L_4$. Our next result will verify that the four arrays $L_1$, $L_2$, $L_3$, and $L_4$ are Latin squares and that they are orthogonal in pairs.
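The arithmetic of Example 17.17 is easy to mechanize. The sketch below is an illustration under the example's own conventions: it works in $\mathbf{Z}_5$, takes $f_i = i$, and writes the zero element as 5. The helper names `latin_square` and `orthogonal` are hypothetical, chosen here for readability.

```python
def latin_square(k, n=5):
    """The array with entries f_k * f_i + f_j computed in Z_n (n prime), where f_i = i."""
    def rep(x):
        # Reduce mod n, writing the zero element as n, as the example does.
        return x % n or n
    return [[rep(k * i + j) for j in range(1, n + 1)] for i in range(1, n + 1)]


def orthogonal(a, b):
    """True when superimposing a and b yields every ordered pair exactly once."""
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)}) == n * n


L1, L2 = latin_square(1), latin_square(2)
for row1, row2 in zip(L1, L2):
    print(*row1, "   ", *row2)
print(orthogonal(L1, L2))
```

The two squares printed side by side are $L_1$ and $L_2$ of the example, and the final line reports that they form an orthogonal pair.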

THEOREM 17.16              Let $n \in \mathbf{Z}^+$, $n > 2$. If $p$ is a prime and $n = p^t$, for $t \in \mathbf{Z}^+$, then there are $n - 1$ Latin squares that are $n \times n$ and orthogonal in pairs.
                           Proof: Let $F = GF(p^t)$, the Galois field of order $p^t = n$. Consider $F = \{f_1, f_2, \ldots, f_n\}$, where $f_1$ is the unity and $f_n$ is the zero element.

We construct $n - 1$ Latin squares as follows.

For each $1 \le k \le n - 1$, let $L_k$ be the $n \times n$ array $(a_{ij}^{(k)})$, $1 \le i, j \le n$, where $a_{ij}^{(k)} = f_k f_i + f_j$.

First we show that each $L_k$ is a Latin square. If not, there are two identical elements of $F$ in the same row or column of $L_k$. Suppose that a repetition occurs in a column, that is, $a_{rj}^{(k)} = a_{sj}^{(k)}$ for $1 \le r, s \le n$. Then $a_{rj}^{(k)} = f_k f_r + f_j = f_k f_s + f_j = a_{sj}^{(k)}$. This implies that $f_k f_r = f_k f_s$, by the cancellation for addition in $F$. Since $k \ne n$, it follows that $f_k \ne f_n$, the zero of $F$. Consequently, $f_k$ is invertible, so $f_r = f_s$ and $r = s$. A similar argument shows that there are no repetitions in any row of $L_k$.

At this point we have $n - 1$ Latin squares, $L_1, L_2, \ldots, L_{n-1}$. Now we shall prove that they are orthogonal in pairs. If not, let $1 \le k < m \le n - 1$ with

$$a_{ij}^{(k)} = a_{rs}^{(k)}, \qquad a_{ij}^{(m)} = a_{rs}^{(m)}, \qquad 1 \le i, j, r, s \le n, \qquad \text{and} \qquad (i, j) \ne (r, s).$$

(Then the same ordered pair occurs twice when we superimpose $L_k$ and $L_m$.) But

$$a_{ij}^{(k)} = a_{rs}^{(k)} \iff f_k f_i + f_j = f_k f_r + f_s, \quad \text{and}$$
$$a_{ij}^{(m)} = a_{rs}^{(m)} \iff f_m f_i + f_j = f_m f_r + f_s.$$

Subtracting these equations, we find that $(f_k - f_m) f_i = (f_k - f_m) f_r$. With $k \ne m$, $(f_k - f_m)$ is not the zero of $F$, so it is invertible and we have $f_i = f_r$. Putting this back into either of the prior equations, we find that $f_j = f_s$. Consequently, $i = r$ and $j = s$. Therefore for $k \ne m$, the Latin squares $L_k$ and $L_m$ form an orthogonal pair.

The first value of $n$ that is not a power of a prime is 6. The existence of a pair of 6 × 6 orthogonal Latin squares was first investigated by Leonhard Euler (1707-1783) when he sought a solution to the "problem of the 36 officers." This problem deals with six different regiments wherein six officers, each with a different rank, are selected from each regiment. (There are only six possible ranks.) The objective is to arrange the 36 officers in a 6 × 6 array so that in each row or column of the array, every rank and every regiment is represented exactly once. Hence each officer in the square array corresponds to an ordered pair $(i, j)$ where $1 \le i, j \le 6$, with $i$ for his regiment and $j$ for his rank. In 1782 Euler conjectured that the problem could not be solved, that is, that there is no pair of 6 × 6 orthogonal Latin squares. He went further and conjectured that for all $n \in \mathbf{Z}^+$, if $n \equiv 2 \pmod{4}$, then there is no pair of $n \times n$ orthogonal Latin squares. In 1900 G. Tarry verified Euler's conjecture for $n = 6$ by a systematic enumeration of all possible 6 × 6 Latin squares. However, it was not until 1960, through the combined efforts of R. C. Bose, S. S. Shrikhande, and E. T. Parker, that the remainder of Euler's conjecture was proved false. They showed that if $n \in \mathbf{Z}^+$ with $n \equiv 2 \pmod{4}$ and $n > 6$, then there exists a pair of $n \times n$ orthogonal Latin squares.

For more on this result and Latin squares in general, the reader should consult the chapter references.

EXERCISES 17.3

1. a) Rewrite the following 4 × 4 Latin square in standard form.

                         1   3   4   2
                         3   1   2   4
                         2   4   3   1
                         4   2   1   3

   b) Find a 4 × 4 Latin square in standard form that is orthogonal to the result in part (a).
   c) Apply the reverse of the process in part (a) to the result in part (b). Show that your answer is orthogonal to the given 4 × 4 Latin square.

2. Prove Theorem 17.14.

3. Complete the proof of the first part of Theorem 17.16.

4. The three 4 × 4 Latin squares in Tables 17.3, 17.4, and 17.6 are orthogonal in pairs. Can you find another 4 × 4 Latin square that is orthogonal to each of these three?

5. Complete the calculations in Example 17.17 in order to obtain the two 5 × 5 Latin squares $L_3$ and $L_4$. Rewrite each Latin square $L_i$, for $1 \le i \le 4$, in standard form.

6. Find three 7 × 7 Latin squares that are orthogonal in pairs. Rewrite these results in standard form.

7. Extend the experiment in Example 17.13 so that the research team needs three 4 × 4 Latin squares that are orthogonal in pairs.

8. A Latin square $L$ is called self-orthogonal if $L$ and its transpose $L^{\mathrm{tr}}$ form an orthogonal pair.
   a) Show that there is no 3 × 3 self-orthogonal Latin square.
   b) Give an example of a 4 × 4 Latin square that is self-orthogonal.
   c) If $L = (a_{ij})$ is an $n \times n$ self-orthogonal Latin square, prove that the elements $a_{ii}$, for $1 \le i \le n$, must all be distinct.

17.4
      Finite Geometries and Affine Planes
In the Euclidean geometry of the real plane, we find that (a) two distinct points determine a unique line and (b) if $\ell$ is a line in the plane, and $P$ a point not on $\ell$, then there is a unique line $\ell'$ that contains $P$ and is parallel to $\ell$. During the eighteenth and nineteenth centuries, non-Euclidean geometries were developed when alternatives to condition (b) were investigated. Yet all of these geometries contained infinitely many points and lines. The notion of a finite geometry did not appear until the end of the nineteenth century in the work of Gino Fano (Giornale di Matematiche, 1892).

How can we construct such a geometry? To do so, we return to the more familiar Euclidean geometry. In order to describe points and lines in this plane algebraically, we introduced a set of coordinate axes and identified each point $P$ by an ordered pair $(c, d)$ of real numbers. This description set up a one-to-one correspondence between the points in the plane and the set $\mathbf{R} \times \mathbf{R}$. By using the idea of slope, we could uniquely represent each line in this plane by either (1) $x = a$, where the slope is infinite, or (2) $y = mx + b$, where $m$ is the slope; $a$, $m$, and $b$ are real numbers. We also found that two distinct lines are parallel if and only if they have the same slope. When their slopes are distinct, the lines intersect in a unique point.

Instead of using real numbers $a$, $b$, $c$, $d$, $m$ for the point $(c, d)$ and the lines $x = a$, $y = mx + b$, we now turn to a comparable finite structure, the finite field. Our objective is to construct what is called a (finite) affine plane.

Definition 17.12         Let $\mathcal{P}$ be a finite set of points, and let $\mathcal{L}$ be a set of subsets of $\mathcal{P}$, called lines. A (finite) affine plane on the sets $\mathcal{P}$ and $\mathcal{L}$ is a finite structure satisfying the following conditions.
    A1) Two distinct points of $\mathcal{P}$ are (simultaneously) in only one element of $\mathcal{L}$; that is, they are on only one line.
    A2) For each $\ell \in \mathcal{L}$, and each $P \in \mathcal{P}$ with $P \notin \ell$, there exists a unique element $\ell' \in \mathcal{L}$ where $P \in \ell'$ and $\ell$, $\ell'$ have no point in common.
    A3) There are four points in $\mathcal{P}$, no three of which are collinear (that is, no three of these four points are in any one of the subsets $\ell \in \mathcal{L}$).

The reason for condition (A3) is to avoid uninteresting situations like the one shown in Fig. 17.1. If only conditions (A1) and (A2) were considered, then this system would be an affine plane.

Figure 17.1

We return now to our construction. Let $F = GF(n)$, where $n = p^t$ for some prime $p$ and $t \in \mathbf{Z}^+$. In constructing our affine plane, denoted by $AP(F)$, we let $\mathcal{P} = \{(c, d) \mid c, d \in F\}$. Thus we have $n^2$ points.

How many lines should we have for the set $\mathcal{L}$?

The lines fall into two categories. For a line of infinite slope the equation is $x = a$, where $a \in F$. Thus we have $n$ such "vertical lines." The other lines are given algebraically by $y = mx + b$, where $m, b \in F$. With $n$ choices for each of $m$ and $b$, it follows that there are $n^2$ lines that are not "vertical." Hence $|\mathcal{L}| = n^2 + n$.

Before we verify that $AP(F)$, with $\mathcal{P}$ and $\mathcal{L}$ as constructed, is an affine plane, we make two other observations.

First, for each line $\ell \in \mathcal{L}$, if $\ell$ is given by $x = a$, then there are $n$ choices for $y$ on $\ell = \{(a, y) \mid y \in F\}$. Thus $\ell$ contains exactly $n$ points. If $\ell$ is given by $y = mx + b$, for $m, b \in F$, then for each choice of $x$ we have $y$ uniquely determined, and again $\ell$ consists of $n$ points.

Now consider any point $(c, d) \in \mathcal{P}$. This point is on the line $x = c$. Furthermore, on each line $y = mx + b$ of finite slope $m$, $d - mc$ uniquely determines $b$. With $n$ choices for $m$, we see that the point $(c, d)$ is on the $n$ lines of the form $y = mx + (d - mc)$. Overall, $(c, d)$ is on $n + 1$ lines.

Thus far in our construction of $AP(F)$ we have a set $\mathcal{P}$ of points and a set $\mathcal{L}$ of lines where (a) $|\mathcal{P}| = n^2$; (b) $|\mathcal{L}| = n^2 + n$; (c) each $\ell \in \mathcal{L}$ contains $n$ points; and (d) each point in $\mathcal{P}$ is on exactly $n + 1$ lines. We shall now prove that $AP(F)$ satisfies the three conditions to be an affine plane.
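The four counts (a) through (d) can be confirmed numerically. The sketch below covers only the simplest case, $F = \mathbf{Z}_p$ for a prime $p$. A field of order $p^t$ with $t > 1$ needs its own arithmetic tables, which this code does not supply. The names `affine_plane` and `plane_counts` are chosen here for illustration.

```python
from itertools import product


def affine_plane(p):
    """Points and lines of AP(Z_p), p prime; each line is a frozenset of points."""
    points = list(product(range(p), repeat=2))
    # "Vertical" lines x = a.
    vertical = [frozenset((a, y) for y in range(p)) for a in range(p)]
    # Lines y = m x + b of finite slope m.
    sloped = [
        frozenset((x, (m * x + b) % p) for x in range(p))
        for m in range(p)
        for b in range(p)
    ]
    return points, vertical + sloped


def plane_counts(p):
    """Return (|P|, |L|, sizes of the lines, numbers of lines through a point)."""
    points, lines = affine_plane(p)
    line_sizes = {len(line) for line in lines}
    lines_per_point = {sum(pt in line for line in lines) for pt in points}
    return len(points), len(set(lines)), line_sizes, lines_per_point


print(plane_counts(5))
```

For $p = 5$ the result is `(25, 30, {5}, {6})`, in agreement with $n^2$, $n^2 + n$, $n$, and $n + 1$ for $n = 5$.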
   A1) Let $(c, d), (e, f) \in \mathcal{P}$. Using the two-point formula for the equation of a line, we have

$$(e - c)(y - d) = (f - d)(x - c) \qquad (1)$$

        as a line on which we find both $(c, d)$ and $(e, f)$. Each of these points is on $n + 1$ lines. Could there be a second line containing both of them?

        The point $(c, d)$ is on the line $x = c$. If $(e, f)$ is also on this line, then $e = c$, but $f \ne d$ because the points are distinct. With $e = c$, Eq. (1) reduces to $0 = (f - d)(x - c)$, or $x = c$ because $f - d \ne 0$, and so we do not have a second line.

        With $c \ne e$, if $(c, d)$, $(e, f)$ are on a second line of the form $y = mx + b$, then $d = mc + b$, $f = me + b$, and $(f - d) = m(e - c)$. Our coefficients are taken from a field and $e \ne c$, so $m = (f - d)(e - c)^{-1}$ and $b = d - mc = d - (f - d)(e - c)^{-1} c$. Consequently, this second line containing $(c, d)$ and $(e, f)$ is

$$y = (f - d)(e - c)^{-1} x + [d - (f - d)(e - c)^{-1} c]$$

        or, because multiplication in $F$ is commutative, $(e - c)(y - d) = (f - d)(x - c)$, which is Eq. (1). Thus two points from $\mathcal{P}$ are on only one line, and condition (A1) is satisfied.

   A2) To verify this condition, consider the point $P$ and the line $\ell$ as shown in Fig. 17.2. Since there are $n$ points on any line, let $P_1, P_2, \ldots, P_n$ be the points of $\ell$. (These are the only points on $\ell$, although the figure might suggest others.) The point $P$ is not on $\ell$, so $P$ and $P_i$ determine a unique line $\ell_i$, for each $1 \le i \le n$. We showed earlier that each point is on $n + 1$ lines, so now there is one additional line $\ell'$ with $P$ on $\ell'$ and with $\ell'$ not intersecting $\ell$.

Figure 17.2

   A3) The last condition uses the field $F$. Since $|F| \ge 2$, there is the unity 1 and the zero element 0 in $F$. Considering the points (0, 0), (1, 0), (0, 1), (1, 1), if line $\ell$ contains any three of these points, then two of the points have the form $(c, c)$, $(c, d)$. Consequently the equation for $\ell$ is given by $x = c$, which is not satisfied by either $(d, c)$ or $(d, d)$. Hence no three of these points are collinear.

We have now shown the following.

THEOREM 17.17                If $F$ is a finite field, then the system based on the set $\mathcal{P}$ of points and the set $\mathcal{L}$ of lines, as described above, is an affine plane denoted by $AP(F)$.
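Conditions (A1), (A2), and (A3) can also be tested directly on a small plane. The following brute-force check is a sketch, not part of the proof. It assumes the plane is given as a list of points and a list of lines, each line a frozenset of points, and it builds $AP(\mathbf{Z}_3)$ as its test case. The function name `check_axioms` is hypothetical.

```python
from itertools import combinations, product


def check_axioms(points, lines):
    """Return the truth values of (A1), (A2), (A3) for a finite point/line system."""
    # (A1): every pair of distinct points lies on exactly one line.
    a1 = all(
        sum(1 for line in lines if p in line and q in line) == 1
        for p, q in combinations(points, 2)
    )
    # (A2): for a point P off a line, exactly one line through P misses that line.
    a2 = all(
        sum(1 for other in lines if p in other and not (line & other)) == 1
        for line in lines
        for p in points
        if p not in line
    )
    # (A3): some four points have no three on a common line.
    def no_three_collinear(quad):
        return not any(
            set(triple).issubset(line)
            for triple in combinations(quad, 3)
            for line in lines
        )
    a3 = any(no_three_collinear(quad) for quad in combinations(points, 4))
    return a1, a2, a3


# AP(Z_3): nine points, three vertical lines, nine lines y = m x + b.
p = 3
points = list(product(range(p), repeat=2))
lines = [frozenset((a, y) for y in range(p)) for a in range(p)]
lines += [
    frozenset((x, (m * x + b) % p) for x in range(p))
    for m in range(p)
    for b in range(p)
]

print(check_axioms(points, lines))
```

A system with too few points, such as two points joined by a single line, passes (A1) and (A2) in this check and fails only (A3), which shows why the third condition is needed.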

Some particular examples will indicate a connection between these finite geometries, or
                             affine planes, and the Latin squares of the previous section.

EXAMPLE 17.18      For $F = (\mathbf{Z}_2, +, \cdot)$, we have $n = |F| = 2$. The affine plane in Fig. 17.3 has $n^2 = 4$ points and $n^2 + n = 6$ lines. For example, the line $\ell_4 = \{(1, 0), (1, 1)\}$, and $\ell_4$ contains no other points that the figure might suggest. Furthermore, $\ell_5$ and $\ell_6$ are parallel lines in this finite geometry because they do not intersect.

Figure 17.3 (the four points (0, 0), (1, 0), (0, 1), (1, 1) and the six lines through pairs of them)

EXAMPLE 17.19      Let $F = GF(2^2)$, the field of Example 17.9. Recall the notation of Example 17.11(d) and write $F = \{00, 01, 10, 11\}$, with addition and multiplication given by Table 17.7. We use this field to construct a finite geometry with $n^2 = 16$ points and $n^2 + n = 20$ lines. The 20 lines can be partitioned into five parallel classes of four lines each.

   Class 1: Here we have the lines of infinite slope. These four "vertical" lines are given by the equations $x = 00$, $x = 01$, $x = 10$, and $x = 11$.
   Class 2: For the "horizontal" class, or class of slope 0, we have the four lines $y = 00$, $y = 01$, $y = 10$, and $y = 11$.

Table 17.7

          +  |  00   01   10   11                 ·  |  00   01   10   11
         ----+--------------------               ----+--------------------
          00 |  00   01   10   11                 00 |  00   00   00   00
          01 |  01   00   11   10                 01 |  00   01   10   11
          10 |  10   11   00   01                 10 |  00   10   11   01
          11 |  11   10   01   00                 11 |  00   11   01   10

   Class 3: The lines with slope 01 are those whose equations are $y = 01x + 00$, $y = 01x + 01$, $y = 01x + 10$, and $y = 01x + 11$.
   Class 4: This class consists of the lines with equations $y = 10x + 00$, $y = 10x + 01$, $y = 10x + 10$, and $y = 10x + 11$.
   Class 5: The last class contains the four lines given by $y = 11x + 00$, $y = 11x + 01$, $y = 11x + 10$, and $y = 11x + 11$.

Since each line in $AP(F)$ contains four points and each parallel class contains four lines, we shall see now how three of these parallel classes partition the 16 points of $AP(F)$.

               4             3             2             1
            (00,11)       (01,11)       (10,11)       (11,11)

               3             4             1             2
            (00,10)       (01,10)       (10,10)       (11,10)

               2             1             4             3
            (00,01)       (01,01)       (10,01)       (11,01)

               1             2             3             4
            (00,00)       (01,00)       (10,00)       (11,00)

Figure 17.4

For the class with $m = 01$, there are four lines: (1) $y = 01x + 00$; (2) $y = 01x + 01$; (3) $y = 01x + 10$; and (4) $y = 01x + 11$. Above each point in $AP(F)$ we write the number corresponding to the line it is on. (See Fig. 17.4.) This configuration can be given by the following Latin square:

                           4   3   2   1
                           3   4   1   2
                           2   1   4   3
                           1   2   3   4

If we repeat this process for classes 4 and 5, we get the partitions shown in Figs. 17.5 and 17.6, respectively. In each class the lines are listed, for the given slope, in the same order as for Fig. 17.4. Within each figure is the corresponding Latin square.

These figures give us three 4 × 4 Latin squares that are orthogonal in pairs.

               4             2             1             3
            (00,11)       (01,11)       (10,11)       (11,11)

               3             1             2             4
            (00,10)       (01,10)       (10,10)       (11,10)

               2             4             3             1
            (00,01)       (01,01)       (10,01)       (11,01)

               1             3             4             2
            (00,00)       (01,00)       (10,00)       (11,00)

            Corresponding Latin square:
                           4   2   1   3
                           3   1   2   4
                           2   4   3   1
                           1   3   4   2

Figure 17.5

               4             1             3             2
            (00,11)       (01,11)       (10,11)       (11,11)

               3             2             4             1
            (00,10)       (01,10)       (10,10)       (11,10)

               2             3             1             4
            (00,01)       (01,01)       (10,01)       (11,01)

               1             4             2             3
            (00,00)       (01,00)       (10,00)       (11,00)

            Corresponding Latin square:
                           4   1   3   2
                           3   2   4   1
                           2   3   1   4
                           1   4   2   3

Figure 17.6
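The labelings in Figs. 17.4, 17.5, and 17.6 can be regenerated from Table 17.7. In the sketch below the field elements 00, 01, 10, 11 are encoded as the integers 0, 1, 2, 3. This encoding is an assumption made for the code, and under it the addition table coincides with bitwise XOR. The multiplication table is copied from Table 17.7. Since $-1 = 1$ in this field, the line $y = mx + b$ through $(x, y)$ has $b = y + mx$.

```python
# Multiplication in GF(2^2) from Table 17.7, with 00, 01, 10, 11 encoded as 0, 1, 2, 3.
MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def add(x, y):
    """Addition in GF(2^2); with this encoding the table is bitwise XOR."""
    return x ^ y


def class_square(m):
    """Latin square for the parallel class of slope m (m = 1, 2, 3).

    Rows run from y = 11 at the top down to y = 00, columns from x = 00 to x = 11.
    Each entry is the number (1 to 4) of the line y = m x + b through (x, y),
    where the lines are numbered in the order b = 00, 01, 10, 11.
    """
    return [[add(y, MUL[m][x]) + 1 for x in range(4)] for y in (3, 2, 1, 0)]


def orthogonal(a, b):
    """True when superimposing a and b yields all 16 ordered pairs."""
    return len({(a[i][j], b[i][j]) for i in range(4) for j in range(4)}) == 16


squares = {m: class_square(m) for m in (1, 2, 3)}
for m, square in squares.items():
    print("slope", format(m, "02b"))
    for row in square:
        print(*row)
print(all(orthogonal(squares[a], squares[b]) for a, b in ((1, 2), (1, 3), (2, 3))))
```

The three squares printed are those of classes 3, 4, and 5, and the last line confirms that they are orthogonal in pairs.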

The results of this example are no accident, as demonstrated by the following theorem.

THEOREM 17.18                            Let $F = GF(n)$, where $n \ge 3$ and $n = p^t$, $p$ a prime, $t \in \mathbf{Z}^+$. The Latin squares that arise from $AP(F)$ for the $n - 1$ parallel classes, where the slope is neither 0 nor infinite, are orthogonal in pairs.
                                         Proof: A proof of this result is outlined in the Section Exercises.

EXERCISES 17.4

1. Complete the following table dealing with affine planes.

   Field  |  Number of Points  |  Number of Lines  |  Number of Points on a Line  |  Number of Lines on a Point

   25

   $GF(3^2)$

   56

   17

   31

2. How many parallel classes do each of the affine planes in Exercise 1 determine? How many lines are in each class?

3. Construct the affine plane $AP(\mathbf{Z}_3)$. Determine its parallel classes and the corresponding Latin squares for the classes of finite nonzero slope.

4. Repeat Exercise 3 with $\mathbf{Z}_5$ taking the place of $\mathbf{Z}_3$.

5. Determine each of the following lines.
   a) The line in $AP(\mathbf{Z}_7)$ that is parallel to $y = 4x + 2$ and contains (3, 6).
   b) The line in AP(Z,) that is parallel to $2x + 3y + 4 = 0$ and contains (10, 7).
   c) The line in $AP(F)$, where $F = GF(2^2)$, that is parallel to $10y = 11x + 01$ and contains (11, 01). (See Table 17.7.)

6. Suppose we try to construct an affine plane AP(Z,) as we did in this section.
   a) Determine which of the conditions (A1), (A2), and (A3) fail in this situation.
   b) Find how many lines contain a given point $P$ and how many points are on a given line $\ell$, for this "geometry."

7. The following provides an outline for a proof of Theorem 17.18.
   a) Consider a parallel class of lines given by $y = mx + b$, where $m \in F$, $m \ne 0$. Show that each line in this class intersects each "vertical" line and each "horizontal" line in exactly one point of $AP(F)$. Thus the configuration obtained by labeling the points of $AP(F)$, as in Figs. 17.4, 17.5, and 17.6, is a Latin square.
   b) To show that the Latin squares corresponding to two different classes, other than the classes of slope 0 or infinite slope, are orthogonal, assume that an ordered pair $(i, j)$ appears more than once when one square is superimposed upon the other. How does this lead to a contradiction?
17.5
Block Designs and Projective Planes
                            In this final section, we examine a type of combinatorial design and see how it is related to
                            the structure of a finite geometry. The following example will illustrate this design.

EXAMPLE 17.20         Dick (d) and his wife Mary (m) go to New York City with their five children: Richard (r), Peter (p), Christopher (c), Brian (b), and Julie (j). While staying in the city they receive three passes each day, for a week, to visit the Empire State Building. Can we make up a schedule for this family so that everyone gets to visit this attraction the same number of times?

The following schedule is one possibility.

   1) b, c, d        2) b, j, r        3) b, m, p        4) c, j, m
   5) c, p, r        6) d, j, p        7) d, m, r

Here the result was obtained by trial and error. For a problem of this size such a technique
                            is feasible. However, in general, a more effective strategy is needed. Furthermore, in asking
                            for a certain schedule, we may be asking for something that doesn’t exist. In this problem,
                            for example, each pair of family members is together on only one visit. If the family had
                            received four passes each day, we would not be able to construct a schedule that maintained
                            this property.
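
The claims made about this schedule can be checked mechanically. The following is a minimal sketch, not part of the original example; the names `schedule`, `visits`, and `together` are introduced here only for illustration. It counts how often each person visits and how often each pair visits together, and then compares the number of pair slots with the number of distinct pairs for the hypothetical case of four passes per day.

```python
from itertools import combinations
from collections import Counter

# Schedule from Example 17.20: one block of three family members per day.
schedule = ["bcd", "bjr", "bmp", "cjm", "cpr", "djp", "dmr"]
family = "bcdjmpr"

# Number of visits made by each person.
visits = Counter(person for day in schedule for person in day)

# Number of days on which each pair of people visits together.
together = Counter(
    frozenset(pair) for day in schedule for pair in combinations(day, 2)
)

print(sorted(visits.items()))
print(len(together), set(together.values()))

# Four passes a day for seven days would use 7 * C(4, 2) pair slots,
# while the family has only C(7, 2) distinct pairs.
pair_slots = 7 * len(list(combinations(range(4), 2)))
distinct_pairs = len(list(combinations(family, 2)))
print(pair_slots, distinct_pairs)
```

With three passes the 21 pair slots match the 21 pairs exactly. With four passes there would be 42 slots for 21 pairs, so some pair would have to visit together more than once.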

The situation in this example generalizes as follows.

Definition 17.13          Let V be a set with v elements. A collection {B_1, B_2, ..., B_b} of subsets of V is called a
                            balanced incomplete block design, or (v, b, r, k, λ)-design, if the following conditions are
                            satisfied:

                                a) For each 1 ≤ i ≤ b, the subset B_i contains k elements, where k is a fixed constant and
                                   k < v.
                                b) Each element x ∈ V is in r (≤ b) of the subsets B_i, 1 ≤ i ≤ b.
                                c) Every pair x, y of elements of V appears together in λ (≤ b) of the subsets B_i,
                                   1 ≤ i ≤ b.

The elements of V are often called varieties because of the early applications in the design
                           of experiments that dealt with tests on fertilizers and plants. The b subsets B_1, B_2, ..., B_b
                           of V are called blocks, where each block contains k varieties. The number r is referred to as
                           the replication number of the design. Finally, λ is termed the covalency for the design. This
                           parameter makes the design balanced in the following sense. For general block designs we
                            have a number λ_xy for each pair x, y ∈ V; if λ_xy is the same for all pairs of elements from
                            V, then λ represents this common measure and the design is called balanced. In this text
                             we only deal with balanced designs.

EXAMPLE   17.21           a) The schedule in Example 17.20 is an example of a (7, 7, 3, 3, 1)-design.
                                b) For V = {1, 2, 3, 4, 5, 6}, the ten blocks

                                      1 2 4      1 3 4      1 5 6      2 3 6      3 4 6
                                      1 2 6      1 3 5      2 3 5      2 4 5      4 5 6

constitute a (6, 10, 5, 3, 2)-design.
                                c) If F is a finite field, with |F| = n, then the affine plane AP(F) yields an
                                      (n², n² + n, n + 1, n, 1)-design. Here the varieties are the n² points in AP(F); the
                                      n² + n lines are the blocks of the design.
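
Part (c) can be tested for small cases. The sketch below assumes that F = Z_p for a prime p, so that field arithmetic is ordinary arithmetic modulo p; it does not handle prime powers such as GF(4). The helper names `affine_plane` and `design_parameters` are hypothetical and are used only for this illustration.

```python
from itertools import combinations

def affine_plane(p):
    """Points and lines of AP(Z_p) for a prime p (arithmetic mod p)."""
    points = [(x, y) for x in range(p) for y in range(p)]
    lines = []
    for c in range(p):                      # vertical lines x = c
        lines.append(frozenset((c, y) for y in range(p)))
    for m in range(p):                      # lines y = mx + b
        for b in range(p):
            lines.append(frozenset((x, (m * x + b) % p) for x in range(p)))
    return points, lines

def design_parameters(points, blocks):
    """Return (v, b, r, k, lam) if the three counts are constant, else None."""
    ks = {len(B) for B in blocks}
    rs = {sum(pt in B for B in blocks) for pt in points}
    lams = {sum(x in B and y in B for B in blocks)
            for x, y in combinations(points, 2)}
    if len(ks) == len(rs) == len(lams) == 1:
        return (len(points), len(blocks), rs.pop(), ks.pop(), lams.pop())
    return None

for p in (2, 3, 5):
    print(p, design_parameters(*affine_plane(p)))
```

For p = 3, for instance, the parameters found should agree with (n², n² + n, n + 1, n, 1) = (9, 12, 4, 3, 1).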

At this point there are five parameters determining our design. We now examine how
                             these parameters are related.

THEOREM     17.19            For a (v, b, r, k, λ)-design, (1) vr = bk and (2) λ(v − 1) = r(k − 1).
                             Proof:

                                 1) With b blocks in the design and k elements per block, listing all the elements of the
                                    blocks, we get bk symbols. This collection of symbols consists of the elements of V
                                    with each element appearing r times, for a total of vr symbols. Hence vr = bk.
                                 2) For this property we introduce the pairwise incidence matrix A for the design. With
                                        |V| = v, let t = C(v, 2), the number of pairs of elements in V. We construct the t × b
                                        matrix A = (a_ij) by defining a_ij = 1 if the ith pair of elements from V is in the jth
                                        block of the design; if not, a_ij = 0.

                                                       B_1         B_2         ...   B_b

                                        x_1 x_2        a_11        a_12        ...   a_1b
                                        x_1 x_3        a_21        a_22        ...   a_2b
                                          ...
                                        x_1 x_v        a_(v−1)1    a_(v−1)2    ...   a_(v−1)b
                                        x_2 x_3        a_v1        a_v2        ...   a_vb
                                          ...
                                        x_(v−1) x_v    a_t1        a_t2        ...   a_tb

We now count the number of 1’s in matrix A in two ways.

                               a) Consider the rows. Since each pair x_i, x_j, for 1 ≤ i < j ≤ v, appears in λ blocks, it
                                      follows that each row contains λ 1’s. With t rows in the matrix, the number of 1’s is
                                      then λt = λv(v − 1)/2.
                               b) Now consider the columns. As each block contains k elements, this determines C(k, 2) =
                                      k(k − 1)/2 pairs, and this is the number of 1’s in each column of matrix A. With b
                                      columns, the total number of 1’s is bk(k − 1)/2.

Then, λv(v − 1)/2 = bk(k − 1)/2 = vr(k − 1)/2, where the last equality uses bk = vr from part (1), so λ(v − 1) = r(k − 1).
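
The double count in part (2) can be watched on a small case. The sketch below builds the pairwise incidence matrix for the (6, 10, 5, 3, 2)-design of Example 17.21(b), with the ten blocks taken here as 124, 134, 156, 236, 346, 126, 135, 235, 245, 456. The variable names are assumptions made for the illustration and do not come from the text.

```python
from itertools import combinations

blocks = [{1, 2, 4}, {1, 3, 4}, {1, 5, 6}, {2, 3, 6}, {3, 4, 6},
          {1, 2, 6}, {1, 3, 5}, {2, 3, 5}, {2, 4, 5}, {4, 5, 6}]
V = range(1, 7)
v, b, r, k, lam = 6, 10, 5, 3, 2

# Row labels: the t = C(v, 2) pairs of varieties, in lexicographic order.
pairs = list(combinations(V, 2))

# a_ij = 1 when the ith pair lies inside the jth block, and 0 otherwise.
A = [[1 if set(pair) <= B else 0 for B in blocks] for pair in pairs]

row_sums = [sum(row) for row in A]          # each should equal lam
col_sums = [sum(col) for col in zip(*A)]    # each should equal k(k - 1)/2
total = sum(row_sums)

print(set(row_sums), set(col_sums), total)
print(lam * v * (v - 1) // 2, b * k * (k - 1) // 2, v * r * (k - 1) // 2)
```

Counting by rows, by columns, and through vr = bk should give the same total number of 1’s in all three cases.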

As we mentioned earlier, when n is a power of a prime, an (n², n² + n, n + 1, n, 1)-design
                   can be obtained from the affine plane AP(F), where F = GF(n). Here the points
                   are the varieties and the lines are the blocks. We shall now introduce a construction that
                   enlarges AP(F) to what is called a finite projective plane. From this projective plane we can
                   construct an (n² + n + 1, n² + n + 1, n + 1, n + 1, 1)-design. First let us see how these
                   two kinds of planes compare.

Definition 17.14   If P′ is a finite set of points and L′ a set of lines, each of which is a nonempty subset of
                   P′, then the (finite) plane based on P′ and L′ is called a projective plane if the following
                   conditions are satisfied.
                      P1) Two distinct points of P′ are on only one line.
                      P2) Any two lines from L′ intersect in a unique point.
                      P3) There are four points in P′, no three of which are collinear.

The difference between the affine and projective planes lies in the condition dealing with
                   the existence of parallel lines. Here the parallel lines of the affine plane based on P and L
                   will intersect when the given system is enlarged to the projective plane based on P′ and L′.
                       The construction proceeds as follows.

EXAMPLE 17.22      Start with an affine plane AP(F) where F = GF(n). For each point (x, y) ∈ P, rewrite
                   the point as (x, y, 1). We then think of the points as ordered triples (x, y, z) where z = 1.
                   Rewrite the equations of the lines x = c and y = mx + b in AP(F) as x = cz and y =
                   mx + bz, where z = 1. We still have our original affine plane AP(F), but with a change of
                   notation.
                       Add the set of points {(1, 0, 0)} ∪ {(x, 1, 0) | x ∈ F} to P to get the set P′. Then |P′| =
                   n² + n + 1. Let ℓ∞ be the subset of P′ consisting of these new points. This new line can be
                   given by the equation z = 0, with the stipulation that we never have x = y = z = 0. Hence
                   (0, 0, 0) ∉ P′.
                       Now let us examine these ideas for the affine plane AP(Z_2). Here P = {(0, 0), (1, 0),
                   (0, 1), (1, 1)}, so

                       P′ = {(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)} ∪ {(1, 0, 0), (0, 1, 0), (1, 1, 0)}.

                   The six lines in L were originally

                        x = 0: {(0, 0), (0, 1)}      y = 0: {(0, 0), (1, 0)}      y = x: {(0, 0), (1, 1)}
                        x = 1: {(1, 0), (1, 1)}      y = 1: {(0, 1), (1, 1)}      y = x + 1: {(0, 1), (1, 0)}

                   We rewrite these as

                        x = 0      y = 0      y = x      x = z      y = z      y = x + z

and add a new line ℓ∞ defined by z = 0. These constitute the set L′ of lines for our projective
                   plane. And now at this point we consider z as a variable. Consequently, the line x = z
                   consists of the points (0, 1, 0), (1, 0, 1), and (1, 1, 1). In fact, each line of L that contained
                   two points will now contain three points when considered in L′. The set L′ consists of the
                   following seven lines.

                         x = 0: {(0, 0, 1), (0, 1, 0), (0, 1, 1)}          y = z: {(1, 0, 0), (0, 1, 1), (1, 1, 1)}
                         y = 0: {(0, 0, 1), (1, 0, 0), (1, 0, 1)}          y = x: {(0, 0, 1), (1, 1, 0), (1, 1, 1)}
                         x = z: {(0, 1, 0), (1, 0, 1), (1, 1, 1)}          y = x + z: {(0, 1, 1), (1, 1, 0), (1, 0, 1)}
                         z = 0 (ℓ∞): {(1, 0, 0), (0, 1, 0), (1, 1, 0)}
                           In the original affine plane the lines x = 0 and x = 1 were parallel because no point
                       in this plane satisfied both equations simultaneously. Here in this new system x = 0 and
                       x = z intersect in the point (0, 1, 0), so they are no longer parallel in the sense of AP(Z_2).
                       Likewise, y = x and y = x + 1 were parallel in AP(Z_2), whereas here the lines y = x
                       and y = x + z intersect at (1, 1, 0). We depict this projective plane based on P′ and L′ as
                       shown in Fig. 17.7. Here the “circle” through (1, 0, 1), (1, 1, 0), and (0, 1, 1) is the line
                       y = x + z. Note that every line intersects ℓ∞, which is often called the line at infinity. This
                       line consists of the three points at infinity. We define two lines to be parallel in the projective
                       plane when they intersect in a point at infinity (or on ℓ∞).

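Figure 17.7   [Diagram of the projective plane with seven points and seven lines, including the points (1, 0, 0), (0, 1, 1), (1, 1, 1) and the lines z = 0 (ℓ∞) and x = z.]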

This projective plane provides us with a (7, 7, 3, 3, 1)-design like the one we developed
                        by trial and error in Example 17.20.
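
The seven lines of this example can be regenerated from their equations. The sketch below is an illustration only, with the dictionary `equations` and the other names assumed for the purpose. It treats each equation as a condition on a nonzero triple (x, y, z) over Z_2 and collects the points that satisfy it.

```python
from itertools import product, combinations

# The nonzero triples over Z_2 are the seven points of P'.
points = [pt for pt in product((0, 1), repeat=3) if any(pt)]

# The seven lines of L', each written as a condition on (x, y, z) modulo 2.
equations = {
    "x = 0":     lambda x, y, z: x == 0,
    "y = 0":     lambda x, y, z: y == 0,
    "x = z":     lambda x, y, z: x == z,
    "y = z":     lambda x, y, z: y == z,
    "y = x":     lambda x, y, z: y == x,
    "y = x + z": lambda x, y, z: y == (x + z) % 2,
    "z = 0":     lambda x, y, z: z == 0,
}
lines = {name: frozenset(pt for pt in points if holds(*pt))
         for name, holds in equations.items()}

for name, line in lines.items():
    print(name, sorted(line))

# (P1) each pair of points lies on exactly one line;
# (P2) each pair of lines meets in exactly one point.
p1 = all(sum(a in L and b in L for L in lines.values()) == 1
         for a, b in combinations(points, 2))
p2 = all(len(L1 & L2) == 1 for L1, L2 in combinations(lines.values(), 2))
print(p1, p2)
```

Reading the seven lines as blocks and the seven points as varieties gives the parameters (7, 7, 3, 3, 1) mentioned above.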

We generalize the results of Example 17.22 as follows: Let n be a power of a prime. The
                        affine plane AP(F), for F = GF(n), provides an example of an (n², n² + n, n + 1, n, 1)-design.
                        In AP(F) the n² + n lines fall into n + 1 parallel classes. For each parallel class
                        we add a point at infinity to AP(F). The point (0, 1, 0) is added for the class of lines
                        x = cz, c ∈ F; the point (1, 0, 0) for the class of lines y = bz, b ∈ F. When m ∈ F and
                        m ≠ 0, then we add the point (m⁻¹, 1, 0) for the class of lines y = mx + bz, b ∈ F. The
                        line at infinity, ℓ∞, is then defined as the set of n + 1 points at infinity. In this way we
                        obtain the projective plane over GF(n), which has n² + n + 1 points and n² + n + 1 lines.
                        Here each point is on n + 1 lines, and each line contains n + 1 points. Furthermore, any
                        two points in this plane are on only one line. Consequently, we have an example of an
                        (n² + n + 1, n² + n + 1, n + 1, n + 1, 1)-design.
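
The general construction can be tried out for small primes. The following sketch makes two assumptions that are not part of the text: n = p is a prime, so GF(n) is Z_p with arithmetic modulo p, and each line is described by a homogeneous equation ax + by + cz = 0 rather than by the separate forms x = cz, y = mx + bz, and z = 0. The function names are hypothetical.

```python
from itertools import combinations

def normalized_triples(p):
    """The triples (x, y, 1), (x, 1, 0), and (1, 0, 0) over Z_p."""
    reps = [(x, y, 1) for x in range(p) for y in range(p)]
    reps += [(x, 1, 0) for x in range(p)]
    reps.append((1, 0, 0))
    return reps

def projective_plane(p):
    """Points and lines of the projective plane over Z_p, p prime."""
    points = normalized_triples(p)
    # A line is the set of points with a*x + b*y + c*z = 0 (mod p). Scaling
    # (a, b, c) by a nonzero constant gives the same line, so the normalized
    # triples serve as coefficient vectors too.
    lines = [
        frozenset(pt for pt in points
                  if (a * pt[0] + b * pt[1] + c * pt[2]) % p == 0)
        for (a, b, c) in normalized_triples(p)
    ]
    return points, lines

def is_projective_design(p):
    """Check the counts claimed for the plane over Z_p."""
    points, lines = projective_plane(p)
    n = p
    counts_ok = len(points) == len(set(lines)) == n * n + n + 1
    sizes_ok = all(len(L) == n + 1 for L in lines)
    on_lines_ok = all(sum(pt in L for L in lines) == n + 1 for pt in points)
    pairs_ok = all(sum(s in L and t in L for L in lines) == 1
                   for s, t in combinations(points, 2))
    meets_ok = all(len(L1 & L2) == 1 for L1, L2 in combinations(lines, 2))
    return counts_ok and sizes_ok and on_lines_ok and pairs_ok and meets_ok

print([is_projective_design(p) for p in (2, 3, 5)])
```

The points of the form (x, 1, 0) together with (1, 0, 0) are the n + 1 points at infinity, and the coefficient triple (0, 0, 1) gives the line z = 0.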

EXERCISES 17.5

1. Let V = {1, 2, ..., 9}. Determine the values of v, b, r, k, and λ for the design given by the following blocks.

    126     147     234     279     378     468
    135     189     258     369     459     567

2. Find an example of a (4, 4, 3, 3, λ)-design.

3. Find an example of a (7, 7, 4, 4, λ)-design.

4. Complete the following table so that the parameters v, b, r, k, λ in any row may be possible for a balanced incomplete block design.

      v      b      r      k      λ
      4      __     __     3      2
      9      12     __     3      __
      10     __     9      __     2
      13     __     4      4      __
      __     30     10     __     3

5. Is it possible to have a (v, b, r, k, λ)-design where (a) b = 28, r = 4, k = 3? (b) v = 17, r = 8, k = 5?

6. Given a (v, b, r, k, λ)-design with b = v, prove that if v is even, then λ is even.

7. A (v, b, r, k, λ)-design is called a triple system if k = 3. When k = 3 and λ = 1, we call the design a Steiner triple system.
    a) Prove that in every triple system, λ(v − 1) is even and λv(v − 1) is divisible by 6.
    b) Prove that in every Steiner triple system, v is congruent to 1 or 3 modulo 6.

8. Verify that the following blocks constitute a Steiner triple system on nine varieties.

    128     147     234     279     389     468
    135     169     256     367     459     578

9. In a Steiner triple system with b = 12, find the values of v and r.

10. In each of the following, P′ is a set of points and L′ a set of lines, each of which is a nonempty subset of P′. Which of the conditions (P1), (P2), and (P3) of Definition 17.14 hold for the given P′ and L′?
    a) P′ = {a, b, c}
       L′ = {{a, b}, {a, c}, {b, c}}
    b) P′ = {(x, y, z) | x, y, z ∈ R} = R³
       L′ is the set of all lines in R³.
    c) P′ is the set of all lines in R³ that pass through (0, 0, 0).
       L′ is the set of all planes in R³ that pass through (0, 0, 0).

11. Bowling teams of five students each are formed from a class of 15 college freshmen. Each of the students bowls on the same number of teams; each pair of students bowls together on two teams. (a) How many teams are there in all? (b) On how many different teams does each student bowl?

12. Mrs. Mackey gave her computer science class a list of 28 problems and directed each student to write algorithms for the solutions of exactly seven of these problems. If each student did as instructed and if for each pair of problems there was exactly one pair of students who wrote algorithms to solve them, how many students did Mrs. Mackey have in her class?

13. Consider a (v, b, r, k, λ)-design on the set V of varieties, where |V| = v ≥ 2. If x, y ∈ V, how many blocks in the design contain either x or y?

14. In a programming class Professor Madge has a total of n students, and she wants to assign teams of m students to each of p computer projects. If each student must be assigned to the same number of projects, (a) in how many projects will each individual student be involved? (b) in how many projects will each pair of students be involved?

15. a) If a projective plane has six lines through every point, how many points does this projective plane have in all?
    b) If there are 57 points in a projective plane, how many points lie on each line of the plane?

16. In constructing the projective plane from AP(Z_2) in Example 17.22, why didn’t we want to include the point (0, 0, 0) in the set P′?

17. Determine the values of v, b, r, k, and λ for the balanced incomplete block design associated with the projective plane that arises from AP(F) for the following choices of F: (a) Z_5 (b) Z_7 (c) GF(8).

18. a) List the points and lines in AP(Z_3). How many parallel classes are there for this finite geometry? What are the parameters for the associated balanced incomplete block design?
    b) List the points and lines for the projective plane that arises from AP(Z_3). Determine the points on ℓ∞, and use them to determine the “parallel” classes for this geometry. What are the parameters for the associated balanced incomplete block design?

17.6
      Summary and Historical Review
                          The structure of a field was first developed in Chapter 14. In this chapter we examined
                          polynomial rings and their role in the structure of finite fields, directing our attention to
                          applications in finite geometries and combinatorial designs.
                             In Chapter 15 we saw that the order of a finite Boolean algebra could only be a power
                          of 2. Now we find that for a finite field the order can only be a power of a prime and that
                          for each prime p and each n ∈ Z⁺, there is only one field, up to isomorphism, of order pⁿ.
                          This field is denoted by GF(pⁿ), in honor of the French mathematician Evariste Galois
                          (1811-1832).

Evariste Galois (1811-1832)

The finite fields (Z_p, +, ·), for p a prime, were obtained in Chapter 14 by means of
                           the equivalence relation, congruence modulo p, defined on Z. Using these finite fields, we
                           developed here the integral domains Z_p[x]. Then, with s(x) an irreducible polynomial of
                           degree n in Z_p[x], a similar equivalence relation, namely congruence modulo s(x),
                           gave us a set of pⁿ equivalence classes, denoted Z_p[x]/(s(x)). These pⁿ equivalence classes
                           became the elements of the field GF(pⁿ). (Although we did not prove every possible result
                           in general, it can be shown that over the finite field Z_p, there is an irreducible polynomial
                           of degree n for each n ∈ Z⁺.)
                               The theory of finite fields was developed by Galois in his work addressing the problem of
                           the solutions of polynomial equations. As we mentioned in the summary of Chapter 16, the
                           study of polynomial equations was an area of research that challenged many mathematicians
                           from the sixteenth to the nineteenth centuries. In the nineteenth century, Niels Henrik Abel
                           (1802-1829) first showed that the solution of the general quintic could not be given by
                           radicals. Galois showed that for any polynomial of degree n over a field F, there is a
                           corresponding group G that is isomorphic to a subgroup of S_n, the group of permutations
                           of {1, 2, 3, ..., n}. The essence of Galois’s work is that such a polynomial equation can be
                           solved by (addition, subtraction, multiplication, division, and) radicals if its corresponding
                           group is solvable. Now what makes a finite group solvable? We say that a finite group G is
                           solvable if it has a chain of subgroups G = K_1 ⊃ K_2 ⊃ K_3 ⊃ ··· ⊃ K_t = {e}, where for all
                           2 ≤ i ≤ t, K_i is a normal subgroup of K_(i−1) (that is, xyx⁻¹ ∈ K_i for all y ∈ K_i and for all
                           x ∈ K_(i−1)), and the quotient group K_(i−1)/K_i is abelian. One finds that all subgroups of S_i,
                           for 1 ≤ i ≤ 4, are solvable, but for n ≥ 5 there are subgroups of S_n that are not solvable.
         Though it seems that Galois theory is concerned predominantly with groups, there is
     a great deal more on the theory of fields that we have not mentioned. As a consequence
     of Galois’s work, the areas of field theory and finite group theory became topics of great
     mathematical interest.
         For more on Galois theory, the reader will find Chapter 6 of the text by V. H. Larney
     [8] and Chapter 12 in the book by N. H. McCoy and T. R. Berger [10] good places to start.
     Chapter 5 of I. N. Herstein [6] has more on the topic, while a detailed presentation can be
     found in the text by S. Roman [11] and the classic work by O. Zariski and P. Samuel [17].
     Appendix E in the text by V. H. Larney [8] includes an interesting short account of the life of
     Galois; more on his life can be found in the somewhat fictional account by L. Infeld [7]. The
     article by T. Rothman [12] provides a more contemporary discussion of the inaccuracies
     and myths surrounding the life, and especially the death, of Galois. The biographical notes
     on pages 287-291 of the text by J. Stillwell [14] relate more on the life and work of this
     great genius.
         The Latin squares, combinatorial designs, and finite geometries of the later sections of the
     chapter showed us how the finite field structure entered into problems of design. Dating back
     to the time of Leonhard Euler (1707-1783) and the problem of the “36 officers,” the study
     of orthogonal Latin squares has been developed considerably since 1900, and especially
     since 1960 with the work of R. C. Bose, S. S. Shrikhande, and E. T. Parker. Chapter 7 of
     the monograph by H. J. Ryser [13] provides the details of their accomplishments. The text
     by C. L. Liu [9] includes ideas from coding theory in its discussion of Latin squares.
         The study of finite geometries can be traced back to the work of Gino Fano, who, in
     1892, considered a finite three-dimensional geometry consisting of 15 points, 35 lines, and
     15 planes. However, it was not until 1906 that these geometries gained any notice, when
     O. Veblen and W. Bussey began their study of finite projective geometries. For more on this
     topic, the reader should find the texts by A. A. Albert and R. Sandler [1] and H. L. Dorwart
     [4] very interesting. The text by P. Dembowski [3] provides an extensive coverage for those
     seeking something more advanced.
         Finally, the notion of designs was first studied by statisticians in the area called the design
     of experiments. Through the research of R. A. Fisher and his followers, this area has come to
     play an important role in the modern theory of statistical analysis. In our development, we
     examined conditions under which a (v, b, r, k, λ)-design could exist and how such designs
     were related to affine planes and finite projective planes. The text by M. Hall, Jr. [5] provides
     more on this topic, as does the work by A. P. Street and W. D. Wallis [15]. Chapter XIII of
     reference [15] includes material relating to designs and coding theory. A rather thorough
     coverage of the topic of designs is given in the work by W. D. Wallis [16], and the text
     edited by J. H. Dinitz and D. R. Stinson [2] provides the reader with a collection of more
     work in this area.

REFERENCES
         1. Albert, A. Adrian, and Sandler, R. An Introduction to Finite Projective Planes. New York:
            Holt, 1968.
         2. Dinitz, Jeffrey H., and Stinson, Douglas R., eds. Contemporary Design Theory. New York:
            Wiley, 1992.
         3. Dembowski, Peter. Finite Geometries. New York: Springer-Verlag, 1968.
         4. Dorwart, Harold L. The Geometry of Incidence. Englewood Cliffs, N.J.: Prentice-Hall, 1966.
         5. Hall, Marshall, Jr. Combinatorial Theory. Waltham, Mass.: Blaisdell, 1967.
         6. Herstein, Israel Nathan. Topics in Algebra, 2nd ed. Lexington, Mass.: Xerox College Publishing,
            1975.
         7. Infeld, Leopold. Whom the Gods Love. New York: McGraw-Hill, 1948.
         8. Larney, Violet H. Abstract Algebra: A First Course. Boston: Prindle, Weber & Schmidt, 1975.
         9. Liu, C. L. Topics in Combinatorial Mathematics. Mathematical Association of America, 1972.
        10. McCoy, Neal H., and Berger, Thomas R. Algebra: Groups, Rings, and Other Topics. Boston:
            Allyn and Bacon, 1977.
        11. Roman, Steven. Field Theory. New York: Springer-Verlag, 1995.
        12. Rothman, Tony. “Genius and Biographers: The Fictionalization of Evariste Galois.” The
            American Mathematical Monthly 89, no. 2 (1982): pp. 84-106.
        13. Ryser, Herbert J. Combinatorial Mathematics. Carus Mathematical Monographs, Number 14,
            Mathematical Association of America, 1963.
        14. Stillwell, John. Mathematics and Its History. New York: Springer-Verlag, 1989.
        15. Street, Anne Penfold, and Wallis, W. D. Combinatorial Theory: An Introduction. Winnipeg,
            Canada: The Charles Babbage Research Center, 1977.
        16. Wallis, W. D. Combinatorial Designs. New York: Marcel Dekker, Inc., 1988.
        17. Zariski, Oscar, and Samuel, Pierre. Commutative Algebra, Vol. 1. New York: Van Nostrand,
            1958.

SUPPLEMENTARY EXERCISES

1. Determine n if over GF(n) there are 6561 monic polynomials of degree 5 with no constant term.

2. a) Let f(x) = a_n xⁿ + ··· + a_1 x + a_0 ∈ Z[x]. If r/s ∈ Q, with gcd(r, s) = 1 and f(r/s) = 0, prove that s | a_n and r | a_0.
    b) Find the rational roots, if any exist, of the following polynomials over Q. Factor f(x) in Q[x].
         i) f(x) = 2x³ + 3x² − 2x − 3
        ii) f(x) = x⁴ + x³ − x² − 2x − 2
    c) Show that the polynomial f(x) = x^100 − x^50 + x^20 + x³ + 1 has no rational root.

3. a) For how many integers n, where 1 ≤ n ≤ 1000, can we factor f(x) = x² + x − n into the product of two first degree factors in Z[x]?
    b) Answer part (a) for f(x) = x² + 2x − n.
    c) Answer part (a) for f(x) = x² + 5x − n.
    d) Let g(x) = x² + kx − n ∈ Z[x], for 1 ≤ n ≤ 1000. Find the smallest positive integer k so that g(x) cannot be factored into two first degree factors in Z[x] for all 1 ≤ n ≤ 1000.

4. Verify that the polynomial f(x) = x⁴ + x³ + x + 1 is reducible over every field F (finite or infinite).

5. If p is a prime, prove that in Z_p[x],

        x^p − x = ∏ (x − a),   where the product is taken over all a ∈ Z_p.

6. For any field F, let f(x) = xⁿ + a_(n−1) x^(n−1) + ··· + a_1 x + a_0 ∈ F[x]. If r_1, r_2, ..., r_n are the roots of f(x), and r_i ∈ F for all 1 ≤ i ≤ n, prove that
    a) −a_(n−1) = r_1 + r_2 + ··· + r_n.
    b) (−1)ⁿ a_0 = r_1 r_2 ··· r_n.

7. Four of the seven blocks in a (7, 7, 3, 3, 1)-design are {1, 3, 7}, {1, 5, 6}, {2, 6, 7}, and {3, 4, 6}. Determine the other three blocks.

8. Find the values of b and r for a Steiner triple system where v = 63.

9. a) If a projective plane has 73 points, how many points lie on each line?
    b) If each line in a projective plane passes through 10 points, how many lines are there in the projective plane?

10. A projective plane is coordinatized with the elements of a field F. If this plane contains 91 lines, what are |F| and char(F)?

11. Let V = {x_1, x_2, ..., x_v} be the set of varieties and {B_1, B_2, ..., B_b} the collection of blocks for a (v, b, r, k, λ)-design. We define the incidence matrix A for the design by

        A = (a_ij)_(v×b),   where a_ij = 1 if x_i ∈ B_j, and a_ij = 0 otherwise.

    a) How many 1’s are there in each row and column of A?
    b) Let J_(m×n) be the m × n matrix where every entry is 1. For J_(n×n) we write J_n. Prove that for the incidence matrix A, A · J_b = r · J_(v×b) and J_v · A = k · J_(v×b).
    c) Show that

                     | r  λ  λ  ...  λ |
                     | λ  r  λ  ...  λ |
        A · A^tr  =  | λ  λ  r  ...  λ |  =  (r − λ)I_v + λJ_v,
                     | .  .  .       . |
                     | λ  λ  λ  ...  r |

    where I_v is the v × v (multiplicative) identity.
    d) Prove that

        det(A · A^tr) = (r − λ)^(v−1) [r + (v − 1)λ] = (r − λ)^(v−1) rk.

12. Given a (v, b, r, k, λ)-design based on the v varieties of V, replace each of the blocks B_i, for 1 ≤ i ≤ b, by its complement B̄_i = V − B_i. Then the collection {B̄_1, B̄_2, ..., B̄_b} provides the blocks for a (v, b, r′, k′, λ′)-design, also based on the set V.
    a) Find this corresponding complementary (v, b, r′, k′, λ′)-design for the design given in Exercise 1 of Section 17.5.
    b) In general, how are the parameters r′, k′, λ′ of the complementary design related to the parameters v, b, r, k, λ of the original design?

Appendix 1
Exponential and Logarithmic Functions

Throughout the study of mathematics and computer science, one confronts exponential and logarithmic functions. The function concept is introduced in Section 5.2, and in part (d) of Exercise 15 for that section we find the function $f: R \to R$, where $f(x) = e^x$ for $x \in R$. This is an example of an exponential function. Then in Example 5.61 we come across the function $f: R \to R^+$, where $f(x) = e^x$, this time in conjunction with a logarithmic function, denoted $\ln x$, where $x \in R^+$. Later, in Example 5.73 of this same chapter, another logarithmic function, namely $\log_2 n$ for $n \in Z^+$, appears in the analysis of an algorithm. And since these types of functions occur in later chapters as well, we now provide this appendix as a review of some of the fundamental properties of these two kinds of functions.

Let us start with the idea of positive integer exponents. For instance, we know that the expression $3^7$ indicates the multiplication of seven 3's, that is,

      $3^7 = 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 = 2187$.

In this example, the number 3 is called the base of $3^7$; the number 7 is the exponent, or power. Generally, when the exponent is a positive integer, the base (call it b) can be any real number (including 0). In dealing with an exponent that is a negative integer, we use the following definition.

Definition A1.1   For every nonzero real number b and every $n \in Z^+$, we have $b^{-n} = 1/b^n$.

EXAMPLE A1.1
From Definition A1.1 we see that
   a) $3^{-7} = 1/3^7 = 1/2187$
   b) $(1/2)^{-6} = 1/(1/2)^6 = 1/(1/64) = 64$
   c) $(-3/5)^{-5} = 1/(-3/5)^5 = 1/(-243/3125) = -3125/243$
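The three evaluations in Example A1.1 can be reproduced with exact rational arithmetic. The following is a minimal sketch, assuming Python's standard `fractions` module; the helper name `negative_power` is hypothetical and is used only for illustration.

```python
from fractions import Fraction


def negative_power(b, n):
    """Return b**(-n) as 1/(b**n), following Definition A1.1.

    Assumes b is a nonzero integer or Fraction and n is a positive integer.
    """
    if b == 0:
        raise ZeroDivisionError("the base b must be nonzero")
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return 1 / Fraction(b) ** n


print(negative_power(3, 7))                # part (a)
print(negative_power(Fraction(1, 2), 6))   # part (b)
print(negative_power(Fraction(-3, 5), 5))  # part (c)
```

Working with `Fraction` rather than floating-point numbers keeps each result exact, so the values can be compared directly with the fractions in the example.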

Finally, when our exponent is the integer 0 we define $b^0 = 1$, for any nonzero* real number b.

The preceding ideas can be summarized in the following, where we use the idea of a recursive definition (introduced in Section 2 of Chapter 4) in the first part:

For all $b \in R$,
   1) $b^1 = b$, and $b^n = b \cdot b^{n-1}$, for $n \in Z^+$ where n > 1;
   2) if $b \ne 0$ and $n \in Z^+$, then $b^{-n} = 1/b^n$; and
   3) if $b \ne 0$, then $b^0 = 1$.

*The expression $0^0$ is called an indeterminate form since its value may be different in different situations. This idea is studied in calculus and is covered in conjunction with L'Hospital's Rule.

In order to proceed from integer exponents to those that are rational numbers, we recall from earlier work in algebra that if $q \in Z^+$, where q > 1, and b is any nonnegative real number, then the expression $b^{1/q}$ denotes the qth root of b. Hence $b^{1/q}$ is the real number a where $a^q = b$. For example,

      $32^{1/5} = 2$ because $2^5 = 32$, and $(1/8)^{1/3} = 1/2$ because $(1/2)^3 = 1/8$.

But when we are confronted with the equations $2^2 = 4$ and $(-2)^2 = 4$, we must ask ourselves what we shall mean here by $4^{1/2}$. The convention that is followed names the positive root as the one represented by $4^{1/2}$, so $4^{1/2} = 2$, not $-2$ or $\pm 2$. Likewise, $9^{1/2} = 3$, $16^{1/2} = 4$, and for all $r \in R$, $(r^2)^{1/2} = |r|$, the absolute value of r, not just plain r. Also, though $2^4 = (-2)^4 = (2i)^4 = (-2i)^4 = 16$, when the expression $16^{1/4}$ is encountered it denotes the positive fourth root, namely, 2.

When b is a negative real number and q is an odd positive integer, our earlier definition of $b^{1/q}$ continues to make sense. We find, for example, that $(-8)^{1/3} = -2$ since $(-2)^3 = -8$ and no other cube of a real number results in $-8$. However, for the case where q = 2, the expression $(-4)^{1/2}$ denotes a complex number that is not real, and so we shall avoid such situations here.

Finally, without getting into a detailed discussion on the development of irrational numbers, we shall agree that real, but irrational, numbers such as $2^{1/2} = \sqrt{2}$ and $(-5)^{1/3} = \sqrt[3]{-5}$ do exist and, in general, for $q \in Z^+$ and $r \in R$, the following real numbers also exist:

      $r^{1/q} = \sqrt[q]{r}$, for r > 0;      $r^{1/q} = \sqrt[q]{r}$, for r < 0 and q odd.

And now that we have settled this issue of exponents (or powers) of the form 1/q, where q is a positive integer greater than 1, we pass to the following definition.

Definition A1.2   Let $b \in R$ and let $p, q \in Z^+$. Then
   1) $b^{p/q} = (b^{1/q})^p$, for b > 0;
   2) $b^{-p/q} = (b^{1/q})^{-p} = 1/[(b^{1/q})^p]$, for b > 0;
   3) $b^{p/q} = (b^{1/q})^p$, for b < 0 and q odd; and
   4) $b^{-p/q} = (b^{1/q})^{-p} = 1/[(b^{1/q})^p]$, for b < 0 and q odd.

EXAMPLE A1.2
This definition is illustrated in the following example.
   a) $8^{2/3} = (8^{1/3})^2 = 2^2 = 4$ $(= 64^{1/3} = (8^2)^{1/3})$
   b) $81^{-3/4} = (81^{1/4})^{-3} = 3^{-3} = 1/3^3 = 1/27$ $(= (81^{-3})^{1/4})$
   c) $(-1/32)^{3/5} = [(-1/32)^{1/5}]^3 = (-1/2)^3 = -1/8$
   d) $(-1024)^{-2/5} = [(-1024)^{1/5}]^{-2} = (-4)^{-2} = 1/(-4)^2 = 1/16$ $(= 1/(-1024)^{2/5})$.
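The cases in Definition A1.2 can be sketched in code. In Python, writing `(-8) ** (1/3)` directly returns a complex number rather than the real cube root, so the sketch below handles negative bases with odd q separately. The function names `real_root` and `rational_power` are hypothetical, and floating-point arithmetic makes the results approximate.

```python
import math


def real_root(b, q):
    """Real qth root of b, defined for b >= 0, or for b < 0 when q is odd."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    if b >= 0:
        return b ** (1.0 / q)
    if q % 2 == 1:
        # Odd root of a negative number: negate the root of |b|.
        return -((-b) ** (1.0 / q))
    raise ValueError("an even root of a negative number is not real")


def rational_power(b, p, q):
    """Compute b**(p/q) as (b**(1/q))**p; p may be a negative integer."""
    return real_root(b, q) ** p


print(rational_power(8, 2, 3))       # part (a), approximately 4
print(rational_power(81, -3, 4))     # part (b), approximately 1/27
print(rational_power(-1 / 32, 3, 5)) # part (c), approximately -1/8
print(rational_power(-1024, -2, 5))  # part (d), approximately 1/16
```

Raising the root to the power p, rather than raising b to a floating-point exponent, mirrors the order of operations used in the definition.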

The last result observed in part (a) of the preceding example suggests the following, which is true in general:

      $b^{p/q} = (b^p)^{1/q}$,   b > 0,   $p, q \in Z^+$.

The other parts of Definition A1.2 can also be extended as

      $b^{-p/q} = (b^{-p})^{1/q} = (1/b^p)^{1/q} = 1/b^{p/q}$,   b > 0,   $p, q \in Z^+$.
      $b^{p/q} = (b^p)^{1/q}$,   b < 0,   $p, q \in Z^+$, q odd.
      $b^{-p/q} = (b^{-p})^{1/q} = (1/b^p)^{1/q} = 1/b^{p/q}$,   b < 0,   $p, q \in Z^+$, q odd.

EXAMPLE A1.3
Using 2 as our base, we know from Definitions A1.1 and A1.2 that

      $2^{-3} = 1/8$,  $2^{-2} = 1/4$,  $2^{-1} = 1/2$,  $2^0 = 1$,  $2^1 = 2$,  $2^2 = 4$,  $2^3 = 8$

and that

      $2^{-3/2} = (2^{1/2})^{-3} = (\sqrt{2})^{-3} = (1/\sqrt{2})^3 = 1/(2\sqrt{2}) = 0.3535534$
      $2^{3/2} = (\sqrt{2})^3 = 2\sqrt{2} = 2.8284271$ $(= (2^3)^{1/2} = \sqrt{8})$.

However, how do we deal with something like $2^{\sqrt{3}}$, where now an irrational power confronts us? Using the fact that $\sqrt{3} = 1.7320508\ldots$, we can evaluate the successive rational powers:

      $2^1 = 2$
      $2^{1.7} = 2^{17/10} = (2^{17})^{1/10} = (131072)^{1/10} = 3.2490096$
      $2^{1.73} = 3.3172782$
      $2^{1.732} = 3.3218801$
      $2^{1.7320} = 3.3218801$
      $2^{1.73205} = 3.3219952$

With the assistance of a hand-held calculator or a computer one finds that to seven decimal places $2^{\sqrt{3}}$ is given as 3.3219971. If we want to be more precise, we can say that the real number $2^{\sqrt{3}}$ is the limit of the sequence $2^1, 2^{1.7}, 2^{1.73}, 2^{1.732}, 2^{1.7320}, 2^{1.73205}, \ldots$ (One studies such ideas in calculus and introductory analysis.)

In a similar way one deals with the expression $b^r$, where $b \in R^+$ and $r \in R$.
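The sequence of rational powers approaching $2^{\sqrt{3}}$ can be generated programmatically. This is a minimal sketch in floating-point arithmetic; the function name `truncated_powers` is hypothetical, and the exponent is truncated (not rounded) to a growing number of decimal places, as in the list above.

```python
import math


def truncated_powers(base, exponent, max_digits):
    """Return pairs (r, base**r), where r is `exponent` truncated
    to 0, 1, ..., max_digits decimal places (exponent assumed positive)."""
    rows = []
    for digits in range(max_digits + 1):
        scale = 10 ** digits
        r = math.floor(exponent * scale) / scale
        rows.append((r, base ** r))
    return rows


for r, value in truncated_powers(2, math.sqrt(3), 5):
    print(f"2^{r} = {value:.7f}")
print(f"2^sqrt(3) = {2 ** math.sqrt(3):.7f}")
```

Because the base exceeds 1, each truncation of the exponent gives a value no larger than the next, so the printed values increase toward the limit.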

Using the results we have now learned about exponents, we state the following properties, but we do not prove any of them.

THEOREM A1.1   The Properties of Exponents. For all $a, b \in R^+$ and all $x, y \in R$,
   1) $(b^x)(b^y) = b^x \cdot b^y = b^{x+y}$,
   2) $(b^x)/(b^y) = b^x/b^y = b^{x-y}$,
   3) $(b^x)^y = b^{xy} = b^{yx} = (b^y)^x$, and
   4) $(ab)^x = (a^x)(b^x) = a^x \cdot b^x$.

EXAMPLE A1.4
The properties in Theorem A1.1 are illustrated in the following.
   1) $3^{5/2} \cdot 3^{3/2} = 3^{[(5/2)+(3/2)]} = 3^{8/2} = 3^4 = 81$
   2) $(7^{3/5})/(7^{13/5}) = 7^{[(3/5)-(13/5)]} = 7^{-10/5} = 7^{-2} = 1/7^2 = 1/49$
   3) $[(\sqrt{2})^3]^2 = (\sqrt{2})^6 = (2^{1/2})^6 = 2^{(1/2)6} = 2^3 = 8$
   4) $(3\sqrt{5})^4 = 3^4(\sqrt{5})^4 = (81)(25) = 2025$
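The four properties of Theorem A1.1 can be spot-checked numerically. The sketch below is not a proof; it only compares both sides of each identity in floating-point arithmetic for chosen positive bases. The function name `check_exponent_laws` is hypothetical.

```python
import math


def check_exponent_laws(a, b, x, y):
    """Return True when the four exponent properties hold (up to
    floating-point tolerance) for positive a, b and real x, y."""
    return all([
        math.isclose(b**x * b**y, b**(x + y)),    # property 1
        math.isclose(b**x / b**y, b**(x - y)),    # property 2
        math.isclose((b**x)**y, b**(x * y)),      # property 3
        math.isclose((a * b)**x, a**x * b**x),    # property 4
    ])


print(check_exponent_laws(3, 7, 2.5, 1.5))
print(3**2.5 * 3**1.5)          # compare with part (1) of the example
print((3 * math.sqrt(5))**4)    # compare with part (4) of the example
```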

We have now finished with the preliminaries needed to define an exponential function.

Definition A1.3   For a fixed positive real number b, the function $f: R \to R^+$ defined by $f(x) = b^x$ is called the exponential function for base b. [Sometimes we denote $b^x$ by $\exp_b(x)$.]

EXAMPLE A1.5
a) In Fig. A1.1 we find the graphs of four functions:

      $f_1: R \to R$,  $f_1(x) = x^2$        $f_2: R \to R^+$,  $f_2(x) = 2^x$
      $f_3: R \to R$,  $f_3(x) = x^3$        $f_4: R \to R^+$,  $f_4(x) = 3^x$

The functions $f_1$ and $f_3$ are polynomial functions, not exponential functions. Hence, when we examine the exponential functions $f_2$ and $f_4$ we realize that there is a distinct difference between the expressions $x^2$ (for $f_1$) and $2^x$ (for $f_2$), and between the expressions $x^3$ (for $f_3$) and $3^x$ (for $f_4$). The exponential functions $f_2$ and $f_4$ are such that

   1) $f_2(x) > 0$ and $f_4(x) > 0$, for all $x \in R$; in particular, $f_2(x) > 1$ and $f_4(x) > 1$, for all x > 0, while $0 < f_2(x) < 1$ and $0 < f_4(x) < 1$, for all x < 0.
   2) for all $x, y \in R$, $x < y \Rightarrow f_2(x) < f_2(y)$ [and $f_4(x) < f_4(y)$]. (This is true for every exponential function where the base b > 1. That is, when b > 1 and x < y, then $b^x < b^y$.)
   3) if $v, w \in R$ and $f_2(v) = f_2(w)$, then v = w. [This property is also true whenever we are dealing with an exponential function $f(x) = b^x$, for b > 1. So for $v, w \in R$ and b > 1, $b^v = b^w \Rightarrow v = w$.]

[Figure A1.1: graphs of $f_1$, $f_2$, $f_3$, and $f_4$]        [Figure A1.2: graph of $f_5$]

b) The graph of the function $f_5: R \to R^+$, defined by $f_5(x) = (1/2)^x = 2^{-x}$, is given in Fig. A1.2. This graph demonstrates the following properties, which are true for all exponential functions $f: R \to R^+$, where $f(x) = b^x$ for 0 < b < 1.

   1) Here $f_5(x) > 0$ for all $x \in R$, but now we find $f_5(x) > 1$ for x < 0 and $f_5(x) < 1$ when x > 0.
   2) If $x, y \in R$ with x < y, then $f_5(x) > f_5(y)$.
   3) For $x, y \in R$, if $f_5(x) = f_5(y)$, then x = y.

c) When one speaks of the exponential function the reference is to the function $f: R \to R^+$, where $f(x) = e^x$ for the irrational number $e \approx 2.71828$. This function is shown as $f_6$ in Fig. A1.3, where we have used the approximations $e^2 \approx 7.38906$ and $e^3 \approx 20.08554$. The function $f_7$ (also in Fig. A1.3) is the exponential function where $f_7(x) = e^{-x}$.

[Figure A1.3: graphs of $f_6$ and $f_7$]

From property (3) in parts (a) and (b) of Example A1.5 we learned that for all $b \in R^+$ and all $x, y \in R$, if $b \ne 1$ and $b^x = b^y$ then x = y. This observation helps us to solve the following exponential equation.

EXAMPLE A1.6
For which real number(s) n is it true that $(1/2)^{-6n^2} = (1/8)^{-(10n+4)/3}$?
This equation can be written as $2^{6n^2} = 8^{(10n+4)/3}$ because $(1/2)^{-6n^2} = [(1/2)^{-1}]^{6n^2} = 2^{6n^2}$ and $(1/8)^{-(10n+4)/3} = [(1/8)^{-1}]^{(10n+4)/3} = 8^{(10n+4)/3}$. Then

      $2^{6n^2} = 8^{(10n+4)/3} \Rightarrow 2^{6n^2} = (2^3)^{(10n+4)/3} \Rightarrow 2^{6n^2} = 2^{10n+4}$
      $\Rightarrow 6n^2 = 10n + 4 \Rightarrow 3n^2 = 5n + 2$
      $\Rightarrow 3n^2 - 5n - 2 = (3n + 1)(n - 2) = 0 \Rightarrow n = -1/3$ or $n = 2$.
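The solution of Example A1.6 can be checked in two ways: by solving the quadratic $3n^2 - 5n - 2 = 0$ with the quadratic formula, and by substituting the roots back into $6n^2 = 10n + 4$ with exact rational arithmetic. This is a minimal sketch; the helper names `solve_quadratic` and `satisfies_exponent_equation` are hypothetical.

```python
import math
from fractions import Fraction


def solve_quadratic(a, b, c):
    """Real roots of a*n**2 + b*n + c = 0, in increasing order.

    Assumes a is nonzero and the discriminant is nonnegative.
    """
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


def satisfies_exponent_equation(n):
    """Exact check that the exponents agree: 6n^2 = 10n + 4."""
    n = Fraction(n)
    return 6 * n**2 == 10 * n + 4


print(solve_quadratic(3, -5, -2))
print(satisfies_exponent_equation(Fraction(-1, 3)))
print(satisfies_exponent_equation(2))
```

Since $2^u = 2^w$ forces u = w, checking equality of the exponents is enough to confirm that both values satisfy the original equation.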

Now that we have examined the exponential function, we shall turn our attention to a second type
                   of function that goes hand-in-hand with the exponential function. This is the logarithm or logarithmic
                   function. However, before we introduce this function, we shall review some of the fundamental
                   properties of logarithms. First we consider the precise relationship between exponents and logarithms,
                   as described in the following definition.

Definition A1.4   Let b denote a fixed positive real number other than 1. If $x \in R^+$, we write $\log_b x$ to designate the logarithm of x to the base b (or the logarithm to the base b of x), which is the (unique) real number y that satisfies $b^y = x$.

This idea can be restated as follows: $\log_b x$ is the exponent (or power) to which we raise the base b in order to obtain x. Hence,

      $y = \log_b x$ if and only if $x = b^y$.

EXAMPLE A1.7
The following results are obtained from the preceding definition:
   a) Since $2^3 = 8$, we have $\log_2 8 = 3$.
   b) One finds that $\log_3(1/81) = -4$ because $3^{-4} = 1/(3^4) = 1/81$.
   c) For all $b \in R^+$, where $b \ne 1$, it follows that
        i) $\log_b b = 1$ because $b^1 = b$,
       ii) $\log_b b^2 = 2$ because $b^2 = b^2$, and
      iii) $\log_b(1/b) = -1$ because $b^{-1} = 1/b$.
   d) Since $\sqrt{7} = 7^{1/2}$, it follows that $\log_7 \sqrt{7} = 1/2$.

EXAMPLE A1.8
Suppose that $b, x \in R^+$ where b is fixed and different from 1. If $\log_b x = 6$, what is $\log_{b^2} x$?
We know that $\log_b x = 6 \Leftrightarrow b^6 = x$, so $x = (b^2)^3$. And $x = (b^2)^3 \Leftrightarrow \log_{b^2} x = 3$. (In a similar manner one also finds that $\log_{b^3} x = 2$ and $\log_{b^6} x = 1$.)

In conjunction with properties (1), (2), and (3) for exponents, as found in Theorem A1.1, the
                             following properties correspond for logarithms.

THEOREM A1.2   Let $b, r, s \in R^+$ where b is fixed and other than 1. Then
   1) $\log_b(rs) = \log_b r + \log_b s$,
   2) $\log_b(r/s) = \log_b r - \log_b s$, and
   3) $\log_b(r^s) = s \log_b r$.

Proof: We shall prove part (1) and request a proof for part (2) in the exercises at the end of this appendix. For part (3) we shall only request (in the exercises) the proof for the case where s is a nonzero integer, but we shall accept (without proof) and use the general statement given here.

Suppose that $x = \log_b r$ and $y = \log_b s$. Then, because $x = \log_b r \Leftrightarrow b^x = r$ and $y = \log_b s \Leftrightarrow b^y = s$, it follows from part (1) of Theorem A1.1 that $rs = (b^x)(b^y) = b^{x+y}$. Since $rs = b^{x+y} \Leftrightarrow \log_b(rs) = x + y$, we have shown that

      $\log_b(rs) = x + y = \log_b r + \log_b s$.

In our next example we find how the three results in Theorem A1.2 can be used to calculate
                             logarithms.

EXAMPLE A1.9
Before the advent of computers and hand-held calculators, logarithms were used to assist in calculating products, quotients, and powers and in extracting roots. Very often the base for these logarithms was 10 and tables of these numbers were available for working with logarithms. [Logarithms were invented by the Scottish mathematician John Napier (1550-1617). Navigators and astronomers used them in the seventeenth century to reduce the time it took to perform multiplication and division.]

For example, since $\log_{10} 10 = 1$ and $\log_{10} 100 = 2$, one finds that $1 < \log_{10} 31 < 2$. In fact, $\log_{10} 31 = 1.4914$. Likewise, we have $2 < \log_{10} 137 = 2.1367 < 3$. From Theorem A1.2 it then follows that

   1) $\log_{10} 4247 = \log_{10}(31 \cdot 137) = \log_{10} 31 + \log_{10} 137 = 1.4914 + 2.1367 = 3.6281$,
   2) $\log_{10}(137/31) = \log_{10} 137 - \log_{10} 31 = 2.1367 - 1.4914 = 0.6453$, and
   3) $\log_{10} \sqrt[3]{137} = \log_{10} 137^{1/3} = (1/3) \log_{10} 137 = (1/3)(2.1367) = 0.7122$.
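The table-based calculation in Example A1.9 can be imitated in code: round each logarithm to four decimal places, combine the rounded values as in Theorem A1.2, and raise 10 to the result to recover an approximation of the product, quotient, or root. This is a minimal sketch using Python's `math.log10`; the helper name `table_log10` is hypothetical.

```python
import math


def table_log10(x):
    """Base-10 logarithm rounded to four places, as in a printed table."""
    return round(math.log10(x), 4)


log31 = table_log10(31)
log137 = table_log10(137)

product_log = log31 + log137      # log of 31 * 137
quotient_log = log137 - log31     # log of 137 / 31
cube_root_log = log137 / 3        # log of the cube root of 137

print(log31, log137)
print(10 ** product_log, 31 * 137)
print(10 ** quotient_log, 137 / 31)
print(10 ** cube_root_log, 137 ** (1 / 3))
```

The recovered values differ slightly from the exact ones because the table entries carry only four decimal places.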

In calculus we find use for logarithms to the base $e \approx 2.71828$, and these so-called natural logarithms are usually denoted by $\ln x$, for $x \in R^+$. When dealing with the analysis of algorithms in computer science, logarithms to the base 2 often prove to be useful. But this does not mean we need to be overly concerned about dealing with logarithms in several different bases. Many hand-held calculators provide logarithms to the base 10 and the base e. And we'll find in our next result that if we can obtain logarithms in one base, we can use these to obtain logarithms in any other base.

THEOREM A1.3   The Base-Changing Formula. Let $a, b \in R^+$ where neither a nor b is 1. For all $x \in R^+$,

      $\log_a x = \dfrac{\log_b x}{\log_b a}$.

Proof: Let $c = \log_b x$ and $d = \log_a x$. Then $b^c = x = a^d$ and $\log_b x = \log_b a^d = d \log_b a = (\log_a x)(\log_b a)$. Consequently, $\log_a x = \log_b x / \log_b a$.

EXAMPLE A1.10
From a table or hand-held calculator one finds that $\log_e 2 = \ln 2 = 0.6931$ and $\log_e 10 = \ln 10 = 2.3026$. Therefore, by virtue of Theorem A1.3, $\log_2 10 = \ln 10/\ln 2 = 2.3026/0.6931 = 3.3222$.
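The base-changing formula translates directly into code: a logarithm in any admissible base can be computed from natural logarithms alone. The sketch below uses `math.log` (the natural logarithm) from the Python standard library; the function name `log_base` is hypothetical.

```python
import math


def log_base(x, a):
    """Logarithm of x to the base a, via log_a x = ln x / ln a.

    Requires x > 0 and a > 0 with a != 1.
    """
    if x <= 0 or a <= 0 or a == 1:
        raise ValueError("need x > 0, a > 0, and a != 1")
    return math.log(x) / math.log(a)


print(log_base(10, 2))       # compare with the value in Example A1.10
print(log_base(8, 2))        # compare with part (a) of Example A1.7
print(log_base(1 / 81, 3))   # compare with part (b) of Example A1.7
```

The value printed for base 2 differs from 3.3222 in the fourth decimal place because the example divides logarithms that were already rounded to four places.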

EXAMPLE A1.11
A special formula results from Theorem A1.3 when x = b. In this case we find that

      $\log_a b = \dfrac{\log_b b}{\log_b a} = \dfrac{1}{\log_b a}$.

Having reviewed the necessary preliminaries, it is time to define the logarithmic function.

Definition A1.5   Let $b \ne 1$ be a fixed positive real number. The function $g: R^+ \to R$ defined by $g(x) = \log_b x$ is called the logarithmic function to the base b.

EXAMPLE A1.12
a) Consider the logarithmic functions

      $g_1: R^+ \to R$,  $g_1(x) = \log_2 x$        $g_2: R^+ \to R$,  $g_2(x) = \log_3 x$.

[Figure A1.4: graphs of $g_1$ and $g_2$]

The graphs of these functions are shown in Fig. A1.4. These functions are such that

   1) $g_1(x) > 0$ and $g_2(x) > 0$ for all x > 1, and $g_1(x) < 0$ and $g_2(x) < 0$ for all x < 1. (This is true for every logarithmic function $\log_b x$ where b > 1.)
   2) for all $x, y \in R^+$, $x < y \Rightarrow g_1(x) < g_1(y)$ [and $g_2(x) < g_2(y)$]. (Again this is true for all logarithmic functions $\log_b x$ where b > 1.)
   3) if $u, v \in R^+$ and $g_1(u) = g_1(v)$, then u = v. (In fact, for b > 1, we have $\log_b u = \log_b v \Rightarrow u = v$ because $w = \log_b u \Leftrightarrow u = b^w$, and $w = \log_b v \Leftrightarrow v = b^w$.)

b) The graph in Fig. A1.5 is for the function $g_3: R^+ \to R$ defined by $g_3(x) = \log_{1/2} x$. This graph illustrates the following properties, which are true for those logarithmic functions $\log_b x$ where 0 < b < 1.

[Figure A1.5: graph of $g_3$]

   1) Here $g_3(x) > 0$ for all x < 1, while $g_3(x) < 0$ for all x > 1.
   2) For all $x, y \in R^+$, if x < y then $g_3(x) > g_3(y)$.
   3) If $u, v \in R^+$ and $g_3(u) = g_3(v)$, then u = v. [The proof here is the same as that given in section (3) of part (a).]

c) In part (a) of Fig. A1.6 we have the graphs of the functions $f: R \to R^+$, where $f(x) = 2^x$, and $g: R^+ \to R$, where $g(x) = \log_2 x$. These graphs are symmetric (to each other) in the line y = x; that is, if one were to fold the figure along the line y = x, then the graphs of f and g would coincide. Here we also observe how the points on one graph correspond with the points on the other. For instance, the point (2, 4) on the graph of f corresponds with the point (4, 2) on the graph of g. In general, each point $(x, 2^x)$ on the graph of f corresponds with the point $(2^x, x\ (= \log_2 2^x))$ on the graph of g, and $(x, \log_2 x)$ on the graph of g corresponds with $(\log_2 x, x\ (= 2^{\log_2 x}))$ on the graph of f.

Figure A1.6

d) The graphs of the functions

      $h: R \to R^+$,  $h(x) = (1/2)^x$        $k: R^+ \to R$,  $k(x) = \log_{1/2} x$

are shown in part (b) of Fig. A1.6. As in part (c) of this example these functions are also symmetric in the line y = x. Here each point $(x, (1/2)^x)$ on the graph of h corresponds with the point $((1/2)^x, x\ (= \log_{1/2}(1/2)^x))$ on the graph of k, and $(x, \log_{1/2} x)$ on the graph of k corresponds with $(\log_{1/2} x, x\ (= (1/2)^{\log_{1/2} x}))$ on the graph of h. (These two graphs intersect on the line y = x where x = 0.6412.)
e) The reader may now want to examine, or reexamine, the graphs of the functions $y = e^x$ and $y = \ln x$ shown in Fig. 5.10 of Section 5.6. In that section, the relationship of symmetry of functions in the line y = x [mentioned above in parts (c) and (d)] is studied in conjunction with the ideas of function composition and the inverse of a function.
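The correspondence between points described in part (d) says that h and k undo one another, and the point where the two graphs meet the line y = x is a solution of $(1/2)^x = x$. Both facts can be checked numerically. The sketch below is an illustration only: the function name `intersection_with_diagonal` is hypothetical, and bisection is just one convenient way to locate the crossing point.

```python
import math


def h(x):
    """h(x) = (1/2)**x, defined for all real x."""
    return 0.5 ** x


def k(x):
    """k(x) = log base 1/2 of x, defined for x > 0."""
    return math.log(x) / math.log(0.5)


def intersection_with_diagonal(lo=0.0, hi=1.0, tol=1e-10):
    """Bisection on h(x) - x, which is positive at x = 0 and negative at x = 1."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(k(h(3.0)), h(k(0.25)))          # each composition returns its input
print(intersection_with_diagonal())   # compare with x = 0.6412 in part (d)
```

Because h is decreasing, h(x) - x changes sign exactly once, so the bisection interval always contains the unique crossing point.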

EXERCISES A.1

1. Write each of the following in exponential form, for $x, y \in R^+$.
    a) $\sqrt{xy^3}$        b) Bix ye)        c) S/8x9y-S
2. Evaluate each of the following.
    a) $125^{-4/3}$        b) 0.0277/°        c) (4/3)(1/8)
3. Determine each of the following.
    a) (5°)(5'°"4)        b) 78/5        c) $(5^{1/2})(20^{1/2})$
4. In each of the following find the real number(s) x for which the equation is valid.
    a) 530° _— gixt2        b) 4r-1 = (1/2)!        c) (1/25)! = (1/125)"
5. Write each of the following exponential equations as a logarithmic equation.
    a) $2^7 = 128$        b) $125^{1/3} = 5$
    c) $10^{-4} = 1/10{,}000$        d) 2° = b
6. Find each of the following logarithms.
    a) $\log_{10} 100$        b) $\log_{10}(1/1000)$
    c) $\log_2 2048$        d) log,(1/64)
    e) log, 8        f) log, 2
    g) logi, 1        h) log,, 9
7. Solve for x in each of the following.
    a) $\log_x 243 = 5$        b) log, x = -3
    c) $\log_{10} 1000 = x$        d) $\log_x 32 = 5/2$
8. Prove part (2) of Theorem A1.2.
9. Let $b, r \in R^+$ where b is fixed and different from 1.
    a) For all $n \in Z^+$, prove that $\log_b r^n = n \log_b r$.
    b) Prove that $\log_b r^{-n} = (-n) \log_b r$ for all $n \in Z^+$.
10. Approximate each of the following on the basis that (to four decimal places) $\log_2 5 = 2.3219$ and $\log_2 7 = 2.8074$.
    a) $\log_2 10$        b) $\log_2 100$
    c) $\log_2(7/5)$        d) $\log_2 175$
11. Given that (to four decimal places) ln 2 = 0.6931, ln 3 = 1.0986, and ln 5 = 1.6094, approximate each of the following.
    a) log, 3        b) log, 2        c) log; 5
12. Determine the value of x in each of the following.
    a) $\log_{10} 2 + \log_{10} 5 = \log_{10} x$
    b) log, 3 + log, x = log, 7 - log, 5
13. Solve for x in each of the following.
    a) $\log_{10} x + \log_{10} 6 = 1$
    b) $\ln x - \ln(x - 1) = \ln 3$
    c) log, ($x^2 + 4x + 4$) - log,($2x - 5$) = 2
14. Determine the value of x if
      log, x = (1/3)[log, 3 - log, 5] + (2/3) log, 6 + log, 17.
15. Let b be a fixed positive real number other than 1. If $a, c \in R^+$, prove that $a^{\log_b c} = c^{\log_b a}$.
Appendix 2
Matrices, Matrix Operations, and Determinants

Starting in Chapter 7, and then in several subsequent chapters, certain kinds of matrices have been introduced. Historically, these mathematical structures were developed and studied in the nineteenth century by the English mathematician Arthur Cayley (1821-1895) and his (English-born) American coworker James Joseph Sylvester (1814-1897). Introduced in 1858, Cayley's work in matrix algebra provides another instance where research in abstract mathematics later proved to be of importance in many applied areas, for example, in quantum theory in physics and data analysis in psychology and sociology.

For those readers who may not have studied anything about matrices in earlier coursework or who simply wish to review the matrix algebra we use in this text, the material in this appendix should prove to be helpful. (We shall not prove all of the results in general here but state many of them in conjunction with a given example. For a more rigorous development the reader should consult one of the references at the end of this appendix.)

First and foremost, we start with the following.

Definition A2.1   For $m, n \in Z^+$, an m × n matrix is a rectangular array of mn numbers arranged in m (horizontal) rows and n (vertical) columns.

An m × n matrix A is denoted by $A = (a_{ij})_{m \times n}$, where 1 ≤ i ≤ m and 1 ≤ j ≤ n, and the number $a_{ij}$ is called the (i, j)-entry (that is, the entry that appears in the ith row and jth column of A). An m × 1 matrix is often called a column matrix (or column vector); a 1 × n matrix is referred to as a row matrix (or row vector). When m = n the matrix is called square.
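One simple way to model Definition A2.1 in code is to store a matrix as a list of rows. The sketch below is an illustration under that assumption; the helper names `shape`, `entry`, and `is_square` are hypothetical, and `entry` uses the 1-based indices of the definition rather than Python's 0-based indices. The sample matrix is a 3 × 2 matrix chosen for illustration.

```python
def shape(matrix):
    """Return (m, n): the number of rows and columns of a nonempty matrix."""
    return len(matrix), len(matrix[0])


def entry(matrix, i, j):
    """Return the (i, j)-entry, with 1 <= i <= m and 1 <= j <= n."""
    m, n = shape(matrix)
    if not (1 <= i <= m and 1 <= j <= n):
        raise IndexError("need 1 <= i <= m and 1 <= j <= n")
    return matrix[i - 1][j - 1]


def is_square(matrix):
    """A matrix is square when it has as many rows as columns."""
    m, n = shape(matrix)
    return m == n


A = [[1, 2],
     [0, 3],
     [-5, 4]]

print(shape(A))        # rows and columns of A
print(entry(A, 3, 1))  # the (3, 1)-entry of A
print(is_square(A))
```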

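EXAMPLE A2.1   Let

\[
A = (a_{ij})_{3 \times 2} = \begin{bmatrix} 1 & 2 \\ 0 & 3 \\ -5 & 4 \end{bmatrix}, \qquad
B = (b_{ij})_{2 \times 4} = \begin{bmatrix} 1 & 2 & 0 & 3 \\ b_{21} & 2 & -1 & 7 \end{bmatrix}, \qquad \text{and} \qquad C = (c_{ij})_{2 \times 2}.
\]

[The entry $b_{21}$ and the entries of $C$ are illegible in the source display.]

                      Here $A$ is a $3 \times 2$ matrix where $a_{11} = 1$, $a_{12} = 2$, $a_{21} = 0$, $a_{22} = 3$, $a_{31} = -5$, and $a_{32} = 4$. The
                   matrix $B$ has two rows and four columns, where, for instance, one finds the entries $b_{13} = 0$ and
                   $b_{24} = 7$. In the $2 \times 2$ square matrix $C$ we see that the entries in a matrix may be rational numbers and
                   even irrational numbers.
                       (Note: Although the entries in a matrix may even be complex numbers, in this appendix we shall
                   deal only with matrices where each entry is a real number.)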
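A minimal Python sketch of Definition A2.1 may help fix the notation. It assumes, purely for illustration, that a matrix is stored as a list of rows; the helper names `shape` and `entry` are hypothetical and not part of the text. The matrix `A` uses the entries listed in Example A2.1.

```python
# Assumed representation: a matrix is a list of rows (each row a list of numbers).
A = [[1, 2],
     [0, 3],
     [-5, 4]]

def shape(M):
    """Return (m, n): the number of rows and the number of columns of M."""
    return len(M), len(M[0])

def entry(M, i, j):
    """Return the (i, j)-entry of M, using the text's 1-based row/column indices."""
    return M[i - 1][j - 1]

print(shape(A))        # (3, 2)
print(entry(A, 3, 1))  # -5
```

The only design choice here is the index shift: the text counts rows and columns from 1, whereas Python lists count from 0.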

As with other mathematical structures, once the structure is defined one needs to decide when two
                   such structures are the same. The method for that decision is now addressed.

A-12          Appendix 2 Matrices, Matrix Operations, and Determinants

Definition A2.2         Let $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{m \times n}$ be two $m \times n$ matrices. We say that $A$ and $B$ are equal, and we
                               write $A = B$, when $a_{ij} = b_{ij}$ for all $1 \le i \le m$ and all $1 \le j \le n$.

   EXAMPLE A2.2                In Definition A2.2 we learned that two matrices are equal when they have the same number of rows
                               and the same number of columns and when their corresponding entries are equal. As a result, if $A$ and $B$ are the matrices

                               [display of $A$ and $B$, whose entries include the unknowns $w$, $x$, $y$, $z$; illegible in the source]

                               then for $A$ and $B$ to be equal we must have $w = -7$, $x = 4$, $y = 2$, $z = 3$.
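The equality test of Definition A2.2 can be sketched in a few lines of Python. This is only an illustration under the assumption that matrices are stored as lists of rows; the function name `equal` and the sample matrices are hypothetical.

```python
def equal(A, B):
    """Definition A2.2: same number of rows, same number of columns,
    and equal corresponding entries."""
    if len(A) != len(B):
        return False
    if any(len(r) != len(s) for r, s in zip(A, B)):
        return False
    return all(a == b for r, s in zip(A, B) for a, b in zip(r, s))

print(equal([[1, 2], [3, 4]], [[1, 2], [3, 4]]))  # True
print(equal([[1, 2], [3, 4]], [[1, 2], [3, 5]]))  # False (one entry differs)
print(equal([[1, 2]], [[1], [2]]))                # False (sizes differ)
```

Note that the size check comes first: two matrices of different sizes are never equal, whatever their entries.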

Thinking back to our first encounters with arithmetic, after we learned how to count, we then
                               started to combine integers by using addition, and then multiplication. Along the same lines we now
                               consider how we may combine matrices.

Definition A2.3         If $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{m \times n}$ are two $m \times n$ matrices, their sum, denoted $A + B$, is the $m \times n$
                               matrix $C = (c_{ij})_{m \times n}$, where $c_{ij} = a_{ij} + b_{ij}$, for all $1 \le i \le m$, $1 \le j \le n$.

From Definition A2.3 we see that we can only add two matrices of the same size (that is, matrices that
                               have the same number of rows and the same number of columns). Furthermore, the addition of two
                               matrices is carried out by adding their corresponding entries.

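   EXAMPLE A2.3                Consider the matrices

\[
A = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 0 & 6 \\ 1 & 1 & 3 \end{bmatrix}, \qquad
B = \begin{bmatrix} 2 & -1 & 6 \\ 3 & 1 & 7 \\ 4 & 2 & 2 \end{bmatrix}, \qquad \text{and} \qquad
C = \begin{bmatrix} 1 & -1 \\ 3 & -4 \\ -7 & 6 \end{bmatrix}.
\]

                               Here we find that

\[
A + B = \begin{bmatrix} 1+2 & 3+(-1) & 4+6 \\ 2+3 & 0+1 & 6+7 \\ 1+4 & 1+2 & 3+2 \end{bmatrix}
      = \begin{bmatrix} 3 & 2 & 10 \\ 5 & 1 & 13 \\ 5 & 3 & 5 \end{bmatrix}.
\]

                               In fact, we also have

\[
B + A = \begin{bmatrix} 3 & 2 & 10 \\ 5 & 1 & 13 \\ 5 & 3 & 5 \end{bmatrix},
\]

                               which illustrates the following general result.
                                   For any two $m \times n$ matrices $E$ and $F$, $E + F = F + E$. Hence the addition of matrices is an
                               example of a commutative (binary) operation.
                                   We cannot determine either of the sums $A + C$ or $B + C$ because each of $A$, $B$ has three columns
                               while $C$ has only two. However, we can find the sum

\[
C + C = \begin{bmatrix} 1 & -1 \\ 3 & -4 \\ -7 & 6 \end{bmatrix} + \begin{bmatrix} 1 & -1 \\ 3 & -4 \\ -7 & 6 \end{bmatrix}
      = \begin{bmatrix} 2 & -2 \\ 6 & -8 \\ -14 & 12 \end{bmatrix}.
\]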
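The entrywise addition of Definition A2.3 can be checked on the matrices of Example A2.3 with a short Python sketch. The list-of-rows representation and the function name `add` are assumptions made for illustration only.

```python
def add(A, B):
    """Entrywise sum of two matrices of the same size (Definition A2.3)."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

A = [[1, 3, 4], [2, 0, 6], [1, 1, 3]]
B = [[2, -1, 6], [3, 1, 7], [4, 2, 2]]
C = [[1, -1], [3, -4], [-7, 6]]

print(add(A, B))               # [[3, 2, 10], [5, 1, 13], [5, 3, 5]]
print(add(A, B) == add(B, A))  # True: addition is commutative
print(add(C, C))               # [[2, -2], [6, -8], [-14, 12]]
```

Calling `add(A, C)` raises `ValueError`, mirroring the remark that $A + C$ cannot be formed because the sizes differ.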

In the last part of Example A2.3, we see that we could have obtained the result C + C by simply
                               multiplying each entry of C by the number 2. This leads us to the general idea we now state as follows.

Definition A2.4         If $A = (a_{ij})_{m \times n}$ and $r \in \mathbf{R}$, the scalar product $rA$ is the $m \times n$ matrix where the $(i, j)$-entry is $r a_{ij}$,
                               for all $1 \le i \le m$, $1 \le j \le n$.
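EXAMPLE A2.4              a) If $A = \begin{bmatrix} 1 & 6 & 4 \\ 0 & -1 & -3 \end{bmatrix}$, then

\[
3A = 3\begin{bmatrix} 1 & 6 & 4 \\ 0 & -1 & -3 \end{bmatrix}
   = \begin{bmatrix} 3 \cdot 1 & 3 \cdot 6 & 3 \cdot 4 \\ 3 \cdot 0 & 3 \cdot (-1) & 3 \cdot (-3) \end{bmatrix}
   = \begin{bmatrix} 3 & 18 & 12 \\ 0 & -3 & -9 \end{bmatrix}.
\]

                          b) For $B = \begin{bmatrix} -2 & 0 & 2 \\ -5 & 1 & 7 \end{bmatrix}$, we find that $3B = \begin{bmatrix} -6 & 0 & 6 \\ -15 & 3 & 21 \end{bmatrix}$, and

\[
\begin{aligned}
3(A + B) &= 3\left( \begin{bmatrix} 1 & 6 & 4 \\ 0 & -1 & -3 \end{bmatrix} + \begin{bmatrix} -2 & 0 & 2 \\ -5 & 1 & 7 \end{bmatrix} \right)
          = 3\begin{bmatrix} -1 & 6 & 6 \\ -5 & 0 & 4 \end{bmatrix}
          = \begin{bmatrix} -3 & 18 & 18 \\ -15 & 0 & 12 \end{bmatrix} \\
         &= \begin{bmatrix} 3 & 18 & 12 \\ 0 & -3 & -9 \end{bmatrix} + \begin{bmatrix} -6 & 0 & 6 \\ -15 & 3 & 21 \end{bmatrix}
          = 3A + 3B.
\end{aligned}
\]

                          c) The result in part (b) may be generalized as follows: For any two $m \times n$ matrices $E$, $F$ and any
                             $r \in \mathbf{R}$, $r(E + F) = rE + rF$. This principle is called the Distributive Law of Scalar Multiplication
                             over Matrix Addition.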
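As a quick numerical check of the distributive law $r(E + F) = rE + rF$, here is a small Python sketch. The function names `add` and `scale`, and the list-of-rows representation, are illustrative assumptions; the matrices `A` and `B` are the values read from Example A2.4 as best they can be recovered, so treat them as sample data.

```python
def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

def scale(r, A):
    """Scalar product rA: multiply every entry of A by r (Definition A2.4)."""
    return [[r * a for a in row] for row in A]

A = [[1, 6, 4], [0, -1, -3]]
B = [[-2, 0, 2], [-5, 1, 7]]

lhs = scale(3, add(A, B))            # 3(A + B)
rhs = add(scale(3, A), scale(3, B))  # 3A + 3B

print(lhs)         # [[-3, 18, 18], [-15, 0, 12]]
print(lhs == rhs)  # True
```

A single numerical check does not prove the law, of course; it only illustrates the entrywise identity $r(e_{ij} + f_{ij}) = r e_{ij} + r f_{ij}$ on which the general result rests.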

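EXAMPLE A2.5         a) Let $A = (a_{ij})_{3 \times 2}$ represent an arbitrary $3 \times 2$ matrix, and let $Z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$. Then

\[
A + Z = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
      = \begin{bmatrix} a_{11} + 0 & a_{12} + 0 \\ a_{21} + 0 & a_{22} + 0 \\ a_{31} + 0 & a_{32} + 0 \end{bmatrix}
      = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = A.
\]

                            We say that $Z$ is the additive identity (or zero element) for all $3 \times 2$ matrices.

                         b) When $A = \begin{bmatrix} 1 & 1 \\ 2 & -3 \\ -4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} -1 & -1 \\ -2 & 3 \\ 4 & -5 \end{bmatrix}$, it follows that

\[
A + B = \begin{bmatrix} 1 + (-1) & 1 + (-1) \\ 2 + (-2) & (-3) + 3 \\ (-4) + 4 & 5 + (-5) \end{bmatrix}
      = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}.
\]

                            Consequently, we call $B = (-1)A$ the additive inverse of $A$, and also write $B = -A$.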
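The roles of the zero matrix and the additive inverse in Example A2.5 can be sketched in Python as follows. The helpers `zeros`, `add`, and `neg` are hypothetical names introduced only for this illustration, and matrices are again assumed to be lists of rows.

```python
def zeros(m, n):
    """The m x n zero matrix Z."""
    return [[0] * n for _ in range(m)]

def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

def neg(A):
    """The additive inverse -A = (-1)A."""
    return [[-a for a in row] for row in A]

A = [[1, 1], [2, -3], [-4, 5]]

print(add(A, zeros(3, 2)) == A)            # True: Z is the additive identity
print(neg(A))                              # [[-1, -1], [-2, 3], [4, -5]]
print(add(A, neg(A)) == zeros(3, 2))       # True: A + (-A) = Z
```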

Hopefully, what we have done so far has proved to be somewhat interesting. But what makes the
                      study of matrices truly interesting and useful is the operation of matrix multiplication. If one tries
                      to define this operation like the componentwise operation of matrix addition, the result is of little
                      interest. Instead, matrix multiplication rests upon a row-and-column multiplication and summation
                      where, for example,

\[
\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
  = a_1 b_1 + a_2 b_2 + a_3 b_3 = \sum_{i=1}^{3} a_i b_i.
\]

Hence, in one particular case, we have

\[
\begin{bmatrix} -1 & 4 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 7 \end{bmatrix}
  = (-1) \cdot 2 + 4 \cdot 1 + 3 \cdot 7 = -2 + 4 + 21 = 23.
\]

In general, if $a = (a_i)_{1 \le i \le n}$ is a $1 \times n$ row vector and $b = (b_i)_{1 \le i \le n}$ is an $n \times 1$ column vector,
                               then $ab = \sum_{i=1}^{n} a_i b_i$. This result, which is a real number, is called the scalar product of the vectors
                               (or matrices) $a$ and $b$. This idea is the key we need for the following definition.
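The scalar product $ab = \sum_{i=1}^{n} a_i b_i$ translates directly into code. The sketch below assumes, for illustration, that both the row vector and the column vector are passed as flat Python lists; the name `scalar_product` is hypothetical.

```python
def scalar_product(a, b):
    """Sum of a_i * b_i for a 1 x n row vector a and an n x 1 column vector b,
    both given here as flat lists of length n."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of entries")
    return sum(x * y for x, y in zip(a, b))

# The particular case (-1)*2 + 4*1 + 3*7 worked out in the text.
print(scalar_product([-1, 4, 3], [2, 1, 7]))  # 23
```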

Definition A2.5         Given the matrices $A = (a_{ij})_{m \times n}$ and $B = (b_{jk})_{n \times p}$, the (matrix) product $AB$ is the matrix
                               $C = (c_{ik})_{m \times p}$, where

\[
c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk} = \sum_{j=1}^{n} a_{ij} b_{jk}, \qquad \text{for all } 1 \le i \le m,\ 1 \le k \le p.
\]

Hence the entry $c_{ik}$ in the $i$th row and $k$th column of the $m \times p$ matrix $C$ is obtained from the scalar
                               product of the $i$th row (vector) of $A$ and the $k$th column (vector) of $B$.

The following demonstrates the result given by Definition A2.5.

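\[
AB =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1k} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2k} & \cdots & b_{2p} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nk} & \cdots & b_{np}
\end{bmatrix}
=
\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1k} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2k} & \cdots & c_{2p} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
c_{i1} & c_{i2} & \cdots & c_{ik} & \cdots & c_{ip} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
c_{m1} & c_{m2} & \cdots & c_{mk} & \cdots & c_{mp}
\end{bmatrix}
= C
\]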
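Definition A2.5 can be turned into a short Python sketch in which each entry $c_{ik}$ is computed as the scalar product of row $i$ of $A$ with column $k$ of $B$. The function name `matmul`, the list-of-rows representation, and the two small sample matrices are assumptions chosen only for illustration.

```python
def matmul(A, B):
    """Matrix product per Definition A2.5: c_ik = sum over j of a_ij * b_jk."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]

# Hypothetical 2 x 2 sample matrices.
E = [[1, 2], [3, 4]]
F = [[0, 1], [1, 0]]
print(matmul(E, F))  # [[2, 1], [4, 3]]
```

The triple nesting (over $i$, $k$, and $j$) mirrors the definition directly; it is written for clarity rather than speed.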

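       EXAMPLE A2.6              a) Consider the matrices

\[
A = (a_{ij})_{2 \times 3} = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 0 & 4 \end{bmatrix}
\qquad \text{and} \qquad
B = (b_{jk})_{3 \times 3} = \begin{bmatrix} 1 & 2 & 7 \\ 1 & 3 & 3 \\ 0 & 1 & 1 \end{bmatrix}.
\]

                                     Then $AB = C = (c_{ik})_{2 \times 3} = \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \end{bmatrix}$, where each entry is the scalar product of a row of $A$ and a column of $B$:

\[
\begin{aligned}
c_{11} &= 1 \cdot 1 + 2 \cdot 1 + 1 \cdot 0 = 3  & \text{(row 1 of } A \text{, column 1 of } B\text{)} \\
c_{12} &= 1 \cdot 2 + 2 \cdot 3 + 1 \cdot 1 = 9  & \text{(row 1 of } A \text{, column 2 of } B\text{)} \\
c_{13} &= 1 \cdot 7 + 2 \cdot 3 + 1 \cdot 1 = 14 & \text{(row 1 of } A \text{, column 3 of } B\text{)} \\
c_{21} &= 3 \cdot 1 + 0 \cdot 1 + 4 \cdot 0 = 3  & \text{(row 2 of } A \text{, column 1 of } B\text{)} \\
c_{22} &= 3 \cdot 2 + 0 \cdot 3 + 4 \cdot 1 = 10 & \text{(row 2 of } A \text{, column 2 of } B\text{)} \\
c_{23} &= 3 \cdot 7 + 0 \cdot 3 + 4 \cdot 1 = 25 & \text{(row 2 of } A \text{, column 3 of } B\text{)}
\end{aligned}
\]

Consequently,

\[
AB = C = \begin{bmatrix} 3 & 9 & 14 \\ 3 & 10 & 25 \end{bmatrix}.
\]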
               b) With $A$ and $B$ as in part (a) let us try to form the matrix product

\[
BA = \begin{bmatrix} 1 & 2 & 7 \\ 1 & 3 & 3 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 3 & 0 & 4 \end{bmatrix}.
\]

                  To find the entry in the first row and first column of $BA$ we want to form the scalar product

\[
\begin{bmatrix} 1 & 2 & 7 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = 1 \cdot 1 + 2 \cdot 3 + 7 \cdot\ ?
\]

                  Unfortunately, we do not have enough entries in the first column of $A$, and so we cannot form
                  either this scalar product or the matrix product $BA$.
                      Now we may find ourselves wondering why we could form the product $AB$ but couldn't
                  form the product $BA$. Considering the product $BA$ once again, we see that the difficulty hinges
                  on the fact that the first column of $A$ did not have the same number of entries as the first row
                  of $B$. The number of entries in the first row of $B$ is 3, which is the number of columns in $B$.
                  The number of entries in the first column of $A$ is 2, which is the number of rows in $A$. These
                  observations lead us to the following general result.
                      If $C$ is an $m \times n$ matrix and $D$ is a $p \times q$ matrix, then the product $CD$ can be formed when
                  $n = p$, that is, when the number of columns in $C$ (the first matrix) equals the number of rows
                  in $D$ (the second matrix). And when $n = p$ the resulting product $CD$ has $m$ rows and $q$ columns.
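The size rule just stated can be sketched in Python as a check that runs before any multiplication is attempted. The names `product_shape` and `matmul` are hypothetical, and the matrices are the values of $A$ and $B$ read from Example A2.6 (treat them as sample data).

```python
def product_shape(shape_c, shape_d):
    """Return the size (m, q) of CD for C of size (m, n) and D of size (p, q),
    or None when n != p and the product cannot be formed."""
    m, n = shape_c
    p, q = shape_d
    return (m, q) if n == p else None

def matmul(A, B):
    """Row-by-column product; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 1], [3, 0, 4]]             # 2 x 3
B = [[1, 2, 7], [1, 3, 3], [0, 1, 1]]  # 3 x 3

print(product_shape((2, 3), (3, 3)))  # (2, 3): AB can be formed
print(product_shape((3, 3), (2, 3)))  # None:   BA cannot be formed
print(matmul(A, B))                   # [[3, 9, 14], [3, 10, 25]]
```

Here `zip(*B)` iterates over the columns of `B`, so each inner `sum` is exactly one row-times-column scalar product.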

Let us examine matrix multiplication a little further.

EXAMPLE A2.7   a) [The display giving the matrices $A$ and $B$ and the computed products $AB$ and $BA$ is illegible in the source.]
                  Consequently, even though it is possible to form both matrix products $AB$ and $BA$, we do not
                  have $AB = BA$. In fact, these products are not even of the same size.
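               b) For $A = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}$ one finds that $AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and $BA = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}$.

So here $AB$ and $BA$ are of the same size, but $AB \ne BA$.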
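               c) Finally, consider the matrices

\[
A = \begin{bmatrix} 1 & 1 & 3 \\ 4 & -1 & 5 \end{bmatrix}, \qquad
B = \begin{bmatrix} 2 & 1 \\ 0 & 1 \\ -3 & -1 \end{bmatrix}, \qquad \text{and} \qquad
C = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}.
\]

Here we find that

\[
AB = \begin{bmatrix} 1 & 1 & 3 \\ 4 & -1 & 5 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 0 & 1 \\ -3 & -1 \end{bmatrix}
   = \begin{bmatrix} -7 & -1 \\ -7 & -2 \end{bmatrix}
\qquad \text{and} \qquad
(AB)C = \begin{bmatrix} -7 & -1 \\ -7 & -2 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}
      = \begin{bmatrix} -10 & -10 \\ -13 & -6 \end{bmatrix},
\]

                                        while

\[
BC = \begin{bmatrix} 2 & 1 \\ 0 & 1 \\ -3 & -1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}
   = \begin{bmatrix} 5 & 0 \\ 3 & -4 \\ -6 & -2 \end{bmatrix}
\qquad \text{and} \qquad
A(BC) = \begin{bmatrix} 1 & 1 & 3 \\ 4 & -1 & 5 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 3 & -4 \\ -6 & -2 \end{bmatrix}
      = \begin{bmatrix} -10 & -10 \\ -13 & -6 \end{bmatrix}.
\]

Hence, $(AB)C = A(BC)$.
                                           In general, if $m, n, p, q \in \mathbf{Z}^+$ and $A = (a_{ij})_{m \times n}$, $B = (b_{jk})_{n \times p}$, and $C = (c_{kl})_{p \times q}$, then

\[
(AB)C = A(BC),
\]

so matrix multiplication is associative (when it can be performed).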

From the results in parts (a) and (b) of Example A2.7 we learn two important facts:

1) The operation of matrix multiplication is not commutative in general.
                                    2) It is possible to find two nonzero matrices $C = (c_{ij})_{m \times n}$ ($c_{ij} \ne 0$ for some $1 \le i \le m$, $1 \le j \le n$)
                                       and $D = (d_{jk})_{n \times p}$ ($d_{jk} \ne 0$ for some $1 \le j \le n$, $1 \le k \le p$), where $CD = Z = (0)_{m \times p}$.

In short, matrix multiplication does not necessarily behave like the multiplication of real numbers.
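These departures from ordinary arithmetic are easy to observe numerically. In the Python sketch below, the function `matmul` and all of the matrices are illustrative choices (assumptions made for this sketch, not values asserted by the text): `A` and `B` are nonzero $2 \times 2$ matrices whose product in one order is the zero matrix, and `P`, `Q`, `R` are compatible matrices used to spot-check associativity.

```python
def matmul(A, B):
    """Row-by-column product; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Illustrative nonzero matrices with a zero product in one order.
A = [[-1, 1], [1, -1]]
B = [[1, 2], [1, 2]]
print(matmul(A, B))  # [[0, 0], [0, 0]]   AB is the zero matrix
print(matmul(B, A))  # [[1, -1], [1, -1]] BA is not, so AB != BA

# Spot check of associativity on illustrative 2x3, 3x2, and 2x2 matrices.
P = [[1, 1, 3], [4, -1, 5]]
Q = [[2, 1], [0, 1], [-3, -1]]
R = [[1, 2], [3, -4]]
print(matmul(matmul(P, Q), R) == matmul(P, matmul(Q, R)))  # True
```

The contrast is the point: the order of the factors matters, but the grouping of the factors does not.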

Now that we've made some comparisons between matrix multiplication and the multiplication of
                              real numbers, let us pursue a few more.

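       EXAMPLE A2.8              a) When we consider square matrices (in particular, $2 \times 2$ matrices) we learn that

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}
 = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.
\]

Consequently, the matrix $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is called the multiplicative identity for all $2 \times 2$ matrices.
In general, for a fixed positive integer $n > 1$, the matrix

\[
I_n = (\delta_{ij})_{n \times n}, \qquad \text{where} \qquad \delta_{ij} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \ne j \end{cases}
\]

is the multiplicative identity for all $n \times n$ matrices.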
                                b) Returning to the real numbers let us recall that for each $x \in \mathbf{R}$, if $x \ne 0$, then there exists $y \in \mathbf{R}$
                                   where $xy = yx = 1$. This real number $y$ is termed the multiplicative inverse of $x$ and is often
                                   designated by $x^{-1}$.
                                      We would like to know if there is a similar situation for square matrices, and we shall
                                   concentrate on $2 \times 2$ matrices.
                                      If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, where $a$, $b$, $c$, $d$ are fixed real numbers, can we find a matrix $B = \begin{bmatrix} w & x \\ y & z \end{bmatrix}$
                                   so that $AB = BA = I_2$? (Here $w$, $x$, $y$, $z$ are unknown real numbers and our objective is to
                                   determine the values of these four numbers in terms of the given real numbers $a$, $b$, $c$, $d$.)
                                      Forming the product $AB$ we find that

\[
AB = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} w & x \\ y & z \end{bmatrix}
   = \begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \end{bmatrix}.
\]

For $AB$ to equal $I_2$, that is, for

\[
\begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\]

it follows from the definition of equality of matrices that

\[
\begin{array}{ll}
(1)\quad aw + by = 1 \qquad & (3)\quad ax + bz = 0 \\
(2)\quad cw + dy = 0 \qquad & (4)\quad cx + dz = 1.
\end{array}
\]

Focusing on Eqs. (1) and (2), if we multiply Eq. (1) by $d$ and Eq. (2) by $b$, we find that

\[
(1)'\quad adw + bdy = d \qquad\qquad (2)'\quad bcw + bdy = 0.
\]

Subtracting Eq. (2)$'$ from Eq. (1)$'$, we learn that $adw - bcw = (ad - bc)w = d$, so
     $w = d/(ad - bc)$, if $ad - bc \ne 0$. Similar calculations yield $x = -b/(ad - bc)$, $y = -c/(ad - bc)$,
     $z = a/(ad - bc)$, and these formulas are also valid as long as $ad - bc \ne 0$.
         [Note: (1) The real number $ad - bc$ is called the determinant of the matrix $A$. (2) Although
     we determined the values for $w$, $x$, $y$, and $z$ from the equation $AB = I_2$, it can be shown that
     the same solutions result when we deal with the equation $BA = I_2$.]

o[e }-[ 7
  c) Using the results in part (b), let

Then with ad— bc = 1-1—2-0=1                         (0),         it follows that w = 1/1 = 1, x = —2/1                 = —2,
     y = —0/1 = 0,z= 1/1 = 1, and

fo i]lo a}-Lo t}-[o T]lo af
     Under these circumstances we write A~! =                      )      ~   |

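  d) Consider the matrix $A_1 = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$, where the determinant of $A_1 = 3 \cdot 2 - 1 \cdot 1 = 5$ $(\neq 0)$.
     Here we find that

\[
A_1^{-1} = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}^{-1}
= \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}
= \frac{1}{5}\begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}.
\]

  e) From parts (b), (c), and (d), we can say that if $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then

\[
A^{-1} = \frac{1}{\det(A)}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix},
\]

     when $\det(A)$ = determinant of $A$ = $ad - bc \neq 0$.

  f) For the matrix $A_2 = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$ one finds that the determinant of $A_2 = 1 \cdot 6 - 2 \cdot 3 = 0$, so in
     this case there is no multiplicative inverse; that is, $A_2^{-1}$ does not exist.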
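
The formula in part (e) can be tried out with a short Python sketch. This is only an illustration, not part of the text: the function names `inverse_2x2` and `multiply_2x2` are hypothetical, matrices are assumed to be lists of rows with integer entries, and `fractions.Fraction` is used so that entries such as 2/5 stay exact.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Return the inverse of [[a, b], [c, d]], or None when ad - bc = 0."""
    (a, b), (c, d) = m
    det = a * d - b * c              # the determinant ad - bc
    if det == 0:
        return None                  # no multiplicative inverse exists
    # A^{-1} = (1/det(A)) * [[d, -b], [-c, a]], kept exact with Fraction
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def multiply_2x2(p, q):
    """Ordinary matrix product of two 2 x 2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A1 = [[3, 1], [1, 2]]                # the matrix of part (d)
A1_inv = inverse_2x2(A1)
A2_inv = inverse_2x2([[1, 2], [3, 6]])   # part (f): determinant is 0
```

Multiplying `A1` by `A1_inv` in either order should give the identity matrix, in line with the note in part (b), while `A2_inv` is `None` because the determinant of the matrix in part (f) is 0.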

At this point we have developed some fundamental ideas about matrices, and the reader may be
wondering how one might use these mathematical structures. Therefore we return one more time to
the real numbers and some of the ideas we encountered in elementary algebra.
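   When the equation $2x = 3$ is solved, the following list of equations may be written:

\[
\begin{aligned}
2x &= 3 &\qquad (1) \\
\left(\tfrac{1}{2}\right)(2x) &= \left(\tfrac{1}{2}\right)(3) &\qquad (2) \\
\left[\left(\tfrac{1}{2}\right) \cdot 2\right]x &= 3/2 &\qquad (3) \\
1 \cdot x &= 3/2 &\qquad (4) \\
x &= 3/2 &\qquad (5)
\end{aligned}
\]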

And in solving this equation the real number $1/2$ $(= 2^{-1})$, introduced in Eq. (2), is what we need to
"get the unknown, $x$, by itself" as we progress through steps (3) and (4) and get to step (5). So, in
general, if we start with the fixed real numbers $a$, $b$, where $a \neq 0$, then the equation $ax = b$ has the
solution $x = a^{-1}b$.
    Now let us consider the system of linear equations:

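\[
\begin{aligned}
3x + y &= 3 \\
x + 2y &= 7,
\end{aligned}
\qquad (*)
\]

which can be represented in matrix form as

\[
\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ 7 \end{bmatrix}.
\]

[This way of representing a system of linear equations is helpful in understanding the reason behind
the definition of matrix multiplication. For the left-hand side of each equation at (*) is the scalar
product of a row from the matrix $\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$ with the column matrix $\begin{bmatrix} x \\ y \end{bmatrix}$.] If we let

\[
A = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}, \qquad X = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad B = \begin{bmatrix} 3 \\ 7 \end{bmatrix},
\]

then we are seeking a solution for the (matrix) equation $AX = B$. Could the solution here be $X = A^{-1}B$,
considering that it was $x = a^{-1}b$ for the earlier equation $ax = b$?
    Since the determinant of $A = 3 \cdot 2 - 1 \cdot 1 = 5 \neq 0$, from part (e) of Example A2.8 we know that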

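\[
A^{-1} = \frac{1}{5}\begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}.
\]

Then we find that

\[
\begin{aligned}
\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ 7 \end{bmatrix}
&\Rightarrow
\begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}\left(\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}\right)
= \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}\begin{bmatrix} 3 \\ 7 \end{bmatrix} \\
&\Rightarrow
\left(\begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}\right)\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix} \\
&\Rightarrow
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix} \\
&\Rightarrow
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix}.
\end{aligned}
\]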

From Definition A2.2 it then follows from the solution $\begin{bmatrix} x \\ y \end{bmatrix} = X = A^{-1}B = \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix}$ that
$x = -1/5$ and $y = 18/5$.

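In general, if $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ and $B = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$ with $a_{11}, a_{12}, a_{21}, a_{22}, b_1, b_2 \in \mathbf{R}$ and $\det(A) = a_{11}a_{22} - a_{12}a_{21} \neq 0$, then the solution of the system of linear equations,

\[
\begin{aligned}
a_{11}x + a_{12}y &= b_1 \\
a_{21}x + a_{22}y &= b_2,
\end{aligned}
\]

is given by

\[
X = \begin{bmatrix} x \\ y \end{bmatrix} = A^{-1}B
= \frac{1}{\det(A)}\begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
= \begin{bmatrix} (1/\det(A))(a_{22}b_1 - a_{12}b_2) \\ (1/\det(A))(-a_{21}b_1 + a_{11}b_2) \end{bmatrix}.
\]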
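
As a quick check of this closed form, here is a minimal Python sketch. It is an illustration only: the function name `solve_2x2` is hypothetical, the coefficients are assumed to be integers, and `fractions.Fraction` keeps the results exact.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 when det(A) != 0."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("det(A) = 0, so A has no multiplicative inverse")
    # The two components of X = A^{-1} B, written out entry by entry
    x = Fraction(a22 * b1 - a12 * b2, det)
    y = Fraction(-a21 * b1 + a11 * b2, det)
    return x, y

# The system (*): 3x + y = 3, x + 2y = 7
x, y = solve_2x2(3, 1, 1, 2, 3, 7)
```

For the system (*) this yields $x = -1/5$ and $y = 18/5$, and substituting these values back into both equations confirms the solution.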
Furthermore, although we cannot prove our next result, the following is true for $n \in \mathbf{Z}^+$, $n \geq 2$.
    If $A = (a_{ij})_{n \times n}$ is a real matrix (which has a multiplicative inverse), and $B = (b_i)_{1 \leq i \leq n}$, $X = (x_i)_{1 \leq i \leq n}$ are $n \times 1$ column matrices (like those defined earlier for $n = 2$), then the resulting system
of linear equations

\[
AX = B
\]

has the solution

\[
X = A^{-1}B.
\]

Now although we shall not deal with the inverses of any matrices larger than $2 \times 2$, we close this
appendix with some further results on larger determinants.
   We already know that for $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ the determinant of $A = \det(A) = ad - bc$. The $\det(A)$
is generally denoted by $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$. In order to deal with the determinants of larger matrices we need
the following idea.

Definition A2.6   Let $A = (a_{ij})_{n \times n}$ with $n \geq 3$. For all $1 \leq i \leq n$ and $1 \leq j \leq n$, the minor associated with $a_{ij}$ is the
                   $(n - 1) \times (n - 1)$ determinant obtained from matrix $A$ after we delete the $i$th row and $j$th column
                   of $A$.

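EXAMPLE A2.9         a) For $A = \begin{bmatrix} 1 & 0 & 2 \\ 3 & 4 & 6 \\ -1 & 3 & 7 \end{bmatrix}$, we find that

1) the minor associated with 0 is obtained from $A$ by deleting its first row and second column:

\[
\begin{bmatrix} 1 & 0 & 2 \\ 3 & 4 & 6 \\ -1 & 3 & 7 \end{bmatrix}
\quad \text{leads us to} \quad
\begin{vmatrix} 3 & 6 \\ -1 & 7 \end{vmatrix}; \text{ and}
\]

2) for $a_{23} = 6$ the minor is $\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ -1 & 3 \end{vmatrix}$.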

b) Given the 4 X 4 matrix

4     -3 2
                                                                                             6          9   4                      0
                         the minor associated with 3 is the $3 \times 3$ determinant

\[
\begin{vmatrix} 2 & 0 & 6 \\ -3 & -2 & 5 \\ 9 & 4 & 0 \end{vmatrix},
\]

obtained from the matrix $B$ by deleting the second row and first column of $B$ (and replacing
the matrix brackets by the vertical bars for determinants).
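
The row-and-column deletion in Definition A2.6 can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names `minor_submatrix` and `det_2x2` are hypothetical, matrices are lists of rows, and rows and columns are numbered from 1 as in the text.

```python
def det_2x2(m):
    """Determinant ad - bc of the 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def minor_submatrix(m, i, j):
    """Delete row i and column j of m (both numbered from 1)."""
    return [[entry for col, entry in enumerate(row, start=1) if col != j]
            for r, row in enumerate(m, start=1) if r != i]

# The 3 x 3 matrix A of part (a)
A = [[1, 0, 2], [3, 4, 6], [-1, 3, 7]]
sub_12 = minor_submatrix(A, 1, 2)   # rows (3, 6) and (-1, 7)
sub_23 = minor_submatrix(A, 2, 3)   # rows (1, 0) and (-1, 3)
minor_12 = det_2x2(sub_12)          # 3*7 - 6*(-1)
minor_23 = det_2x2(sub_23)          # 1*3 - 0*(-1)
```

The minor itself is the determinant of the remaining array, so the two minors in part (a) evaluate to 27 and 3, respectively.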

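Given a matrix $A = (a_{ij})_{3 \times 3}$, for all $1 \leq i \leq 3$, $1 \leq j \leq 3$, we shall let $M_{ij}$ denote the minor
associated with $a_{ij}$. Then

\[
\begin{aligned}
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
&= a_{11}(-1)^{1+1}M_{11} + a_{12}(-1)^{1+2}M_{12} + a_{13}(-1)^{1+3}M_{13} \\
&= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
 - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
 + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix},
\end{aligned}
\]

and we say that we are evaluating $\det(A)$ by using an expansion by minors.
    In this way we reduce the problem to $2 \times 2$ determinants that we know how to evaluate. Let us
examine a particular example.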

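EXAMPLE A2.10   a)

\[
\begin{aligned}
\begin{vmatrix} 2 & 4 & -7 \\ 3 & 8 & 2 \\ 5 & 6 & 0 \end{vmatrix}
&= 2(-1)^{1+1}\begin{vmatrix} 8 & 2 \\ 6 & 0 \end{vmatrix}
 + 4(-1)^{1+2}\begin{vmatrix} 3 & 2 \\ 5 & 0 \end{vmatrix}
 + (-7)(-1)^{1+3}\begin{vmatrix} 3 & 8 \\ 5 & 6 \end{vmatrix} \\
&= 2(8 \cdot 0 - 2 \cdot 6) - 4(3 \cdot 0 - 2 \cdot 5) - 7(3 \cdot 6 - 8 \cdot 5) \\
&= 2(-12) - 4(-10) - 7(-22) = 170.
\end{aligned}
\]

[Note: In this expansion by minors we find a sum that uses each entry $a_{1j}$, for $1 \leq j \leq 3$, in the
first row of the determinant, and each such entry is multiplied by two terms:

1) $(-1)^{1+j}$, where the exponent $1 + j$ is the sum of the row number and column number for
   $a_{1j}$; and
2) its associated minor $M_{1j}$.]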

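b) The reader may be wondering what is so special about the first row of a determinant. For suppose
   we expand the determinant in part (a) by the third column. The resulting expansion is

\[
\begin{aligned}
\sum_{i=1}^{3} a_{i3}(-1)^{i+3}M_{i3}
&= (-7)(-1)^{1+3}\begin{vmatrix} 3 & 8 \\ 5 & 6 \end{vmatrix}
 + 2(-1)^{2+3}\begin{vmatrix} 2 & 4 \\ 5 & 6 \end{vmatrix}
 + 0(-1)^{3+3}\begin{vmatrix} 2 & 4 \\ 3 & 8 \end{vmatrix} \\
&= (-7)(3 \cdot 6 - 8 \cdot 5) - 2(2 \cdot 6 - 4 \cdot 5) + 0(2 \cdot 8 - 4 \cdot 3) \\
&= (-7)(-22) - 2(-8) = 170.
\end{aligned}
\]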

c) What has happened in parts (a) and (b) is not just a mere coincidence. In general, for any $3 \times 3$
   matrix $A$, the determinant of $A$ can be evaluated by expanding along any one (fixed) row or
   down any one (fixed) column. And this method extends to larger square matrices; that is, for
   $n \in \mathbf{Z}^+$ where $n \geq 4$, an $n \times n$ determinant can be expanded, along any one of its $n$ rows or
   down any one of its $n$ columns, into $n$ summands each of which involves an $(n - 1) \times (n - 1)$
   determinant.
       If $A = (a_{ij})_{n \times n}$, where $n \geq 3$, then

\[
\begin{aligned}
\det(A) &= \sum_{j=1}^{n} a_{ij}(-1)^{i+j}M_{ij} \qquad \text{[expansion across the (fixed) $i$th row]} \\
&= \sum_{i=1}^{n} a_{ij}(-1)^{i+j}M_{ij} \qquad \text{[expansion down the (fixed) $j$th column].}
\end{aligned}
\]

d) From part (c) we now realize that if $A = (a_{ij})_{n \times n}$, for any $n \geq 3$, has a row or column
   where every entry is 0, then the determinant of $A$ is 0.
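
The expansions in parts (a) through (d) can be sketched recursively in Python. This is an illustration only, not an efficient algorithm: the names `submatrix`, `det_by_row`, and `det_by_column` are hypothetical, matrices are lists of rows, and indices are 0-based here (the sign $(-1)^{i+j}$ is unaffected, since shifting both indices by 1 changes the exponent by 2).

```python
def submatrix(m, i, j):
    """Delete row i and column j (0-based) from the square matrix m."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def det_by_row(m, row=0):
    """Expand det(m) across the given (fixed) row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    if n == 2:
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # Each summand: entry * sign * minor, the minor expanded across its first row
    return sum(m[row][j] * (-1) ** (row + j) * det_by_row(submatrix(m, row, j))
               for j in range(n))

def det_by_column(m, col=0):
    """Expand det(m) down the given (fixed) column."""
    n = len(m)
    if n <= 2:
        return det_by_row(m)
    return sum(m[i][col] * (-1) ** (i + col) * det_by_row(submatrix(m, i, col))
               for i in range(n))

# The determinant of part (a), with first row 2, 4, -7
M = [[2, 4, -7], [3, 8, 2], [5, 6, 0]]
row_values = [det_by_row(M, r) for r in range(3)]
column_values = [det_by_column(M, c) for c in range(3)]
zero_row = det_by_row([[1, 2, 3], [0, 0, 0], [4, 5, 6]])
```

Every entry of `row_values` and `column_values` equals 170, in agreement with parts (a) through (c), and `zero_row` is 0, as part (d) predicts.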
REFERENCES

The ideas presented in this appendix (and its corresponding exercises) should provide a sufficient
background for what is needed in the way of matrices and determinants in this text. For the reader
who would like to learn more about this area of mathematics, any one of the following should serve
as a good starting point.

1. Anton, Howard, and Rorres, Chris. Elementary Linear Algebra with Applications. New York: Wiley, 1987.
2. Lay, David C. Linear Algebra and Its Applications, 3rd ed. Boston, Mass.: Addison-Wesley, 2003.
3. Strang, Gilbert. Linear Algebra and Its Applications, 3rd ed. San Diego, Calif.: Harcourt Brace Jovanovich, 1988.

-1 4
                                                            EXERCISES A.2                                                                  4, Let         A=                    1 2],     B=                   7            4 |     and       C=
                                                                                                                                                                                a                    13                      5
     tet            a=|                                                  3}         e-[i                    I          | and   C=
                                                        0                3                              1   2      4                        5       3               a | Show that (a) AB+ AC = A(B +C);

|    oo!
     5
         a) A+B
               4
                            3        | Find each of the following.

b) (A+ B)4+C
                                                                                                                                         and (b) BA+CA=(B+4+C)A.
                                                                                                                                              [In general, ifA isan m X n matrix and B, C aren X pma-
                                                                                                                                         trices, then AB + AC = A(B+C). Forn X p matrices B, C
          ec) B+C                                                                        d) A+(B+C)                                      and a p X q matrix A, it follows that BA +CA=(B+C)A.
         e) 2A                                                                           f) 2A4+3B                                       These two results are called the Distributive Laws for Matrix
                                                                                                                                         Multiplication over Matrix Addition.]
         g) 2C +3C                                                                       h) SC (= (2+. 3)C)
                                                                                                                                          5. Find the multiplicative inverse of each of the following
          i) 2B —4C (= 2B + (-4)C)                                                                                                       matrices if the multiplicative inverse exists.

of i                                             » LT o|
         j) A+2B-3C                                                                                                                                       1        2                                       0        1
         k) 2(3B)                                                                            1) (2-3)B

olsa}                                            ef a
    2. Solve for a, b, c, d if                                                                                                                            —3               1                                   7            -3

fe ale[s S]-2[s 3]                                                                                                     6. Solve each of the following matrix equations for the 2 x 2
    3. Perform the following matrix multiplications.                                                                                     matrix A.

a)[1                   3        7]
                                                        —2
                                                          0                                                                                     e[2 s}e=[i 3]
         b) ||                      ) 3 |
                                                          2
                                                             1
                                                            2 5
                                                                             4                                                                  mE sde-[p t]-Li 7]
                                                            3.               6                                                                                         1        1             _{|-1                2                    .
                                                                                                                                         7. it a=|                              >| and      e=| 7)                  a            determine     the
           )            1           —2              2                6                                                                   following.
         “lo                             3116                        8
                                                                                                                                                a) A!                                   b) Bo!                          c) AB
                   r 1                   1        —-l                               3            0      4
                                                                                                                                                d) (AB)"!                               e) B-'A7!
         d)}         2              -2              3                            —-1             0      6
                   | 4                0           —5                                7            7      2                                 8. Evaluate the following 2 < 2 determinants:

r1               0         0                  a               bee                                                                  1        2                                       5           10
         e)}         0               1        O                  de                          f                                                  Ml3            4                                 D3                     4
                   | 0              0         3                  g  h                        i                                                            5        2                                       5            10
                   ri               0 0                          ah                     e¢                                                      Olis               4 |                           qd) | 15               20
         f);         0              0         3                  de                                                                       9. Solve the following systems of linear equations by using
                   | Oo             1         O                  gohii                                                                   matrices:

a) 3x —2y =5                                                           b) 5x + 3y = 35                                         b) State    a general                 result    suggested   by   the   answers   in
             4x —3y =6                                                        3x —2y =2                                            part (a).
                                                                       b                   .                                   15. a) Evaluate each of the following 3 X 3 determinants.
10. Leta, b,c, d€
                R with                                         | é           | = 7. Determine the value
                                                                 c    ad                                                                        1            2           1
of each of the following.                                                                                                               i)     |O       -1           -!I
                 3a           3b                                           3a              b                                                   2             3          0
    DV)                                                               b) | 3c              d |                                                  5                2           1
                     a         b                                                   3a      3b                                          ii)      0        -1              -!l
    V1 30                     3a                                       Ml          3.      3a |                                                10                3          0
11. Let A be a 2X2 matrix                                              with            det(A) = 31.               What    is
                                                                                                                                                    5            2             5
det(2A)? What is det(SA)?
                                                                                                                                      iii)      0        -1              —5
12. Expand each of the following determinants across the spec-                                                                                 10                3         0
ified row as well as down the specified column.
                                                                                                                                                                                               ab   e¢
                 1       0              -2
                                                                                                                                   b) Leta, b,c,d,e, fig, hieR                             If|d   e    f       |}=17,
    a)/}3                1              —1           |; row 2 andcolumn 3                                                                                                                      gh i
         4                |                2                                                                                       evaluate
                 1       1                2                                                                                                 3a           b           ¢
    b) | 2               3              —4           |; row    1 and column 2                                                           ) | 3d            e          f
                 Q       5                 7                                                                                                3g           ho          i
13. Expand each of the following determinants across any row                                                                                 3a            be
or down any column.                                                                                                                    ii) | 9d          3e  6f
        1                  0               2                    4   7         0                          1   2      -4                       3g           hh  2
    a)
     |6                  —2                1}            bb}    4   2         O            c)/0               1       0
       4                                                                                                                                     2a          2b               2c
                           3               2                    3.6           2                          3   3        2
                                                                                                                                      iii) | 3d          3¢e             3f
14, a) Evaluate each of the following 3 X 3 determinants.                                                                                      Se        Sh               Si
                                   1            2                                 ada                                          16. Let A = (4,))nx, and B = (b,,)nx, be two matrices. When
            i)                     1            3                    li)          |b e b                                       the matrix product AB is formed, as defined in Definition A2.5,
                                   1            4                                 c  f ¢                                       how many multiplications (of entries) are performed? How
                                                                                                                               many additions (of entry-products) are performed?
                                   2                 4                            de                 f
          iii)                     3.           -l                   iv)          |a     bie
                                   2                 4                            ab            ec
Appendix 3
Countable and Uncountable Sets

In Example 3.2 of Section 3.1 we informally mention the ideas of what we feel are a finite set and
an infinite set. This final appendix will deal with these issues in a more rigorous manner and will
help us attach some meaning to $|A|$ (the size, or cardinality, of a set $A$) when $A$ is an infinite set. To
develop these notions more precisely let us recall the following concept that was first introduced in
Section 5.6.

Definition A3.1   For any nonempty sets $A$, $B$ the function $f: A \to B$ is called a one-to-one correspondence if $f$ is
                   both one-to-one and onto.

EXAMPLE A3.1       Let $A = \mathbf{Z}^+$ and $B = 2\mathbf{Z}^+ = \{2k \mid k \in \mathbf{Z}^+\} = \{2, 4, 6, \ldots\}$. The function $f: A \to B$, defined by
                   $f(x) = 2x$, is a one-to-one correspondence:

1) For $a_1, a_2 \in A$, we have $f(a_1) = f(a_2) \Rightarrow 2a_1 = 2a_2 \Rightarrow a_1 = a_2$, so $f$ is one-to-one.
2) If $b \in B$, then $b = 2a$ for some (unique) $a \in A$, and $f(a) = 2a = b$, making $f$ onto.

The result in Example A3.1 now leads us to consider the following.

Definition A3.2   If $A$, $B$ are two nonempty sets, we say that $A$ has the same size, or cardinality, as $B$ and we write
                   $A \sim B$, if there exists a one-to-one correspondence $f: A \to B$.

From Example A3.1 we see that $\mathbf{Z}^+$ has the same size as $2\mathbf{Z}^+$, even though it seems that $2\mathbf{Z}^+$ has
fewer elements than $\mathbf{Z}^+$; after all, we do know that $2\mathbf{Z}^+ \subset \mathbf{Z}^+$.
    If we define $g: B \to A$ (for $B = 2\mathbf{Z}^+$ and $A = \mathbf{Z}^+$) by $g(2k) = k$, then

1) $g(2k_1) = g(2k_2) \Rightarrow k_1 = k_2 \Rightarrow 2k_1 = 2k_2$, establishing that $g$ is one-to-one; and
2) for each $k \in A$, we have $2k \in B$ with $g(2k) = k$, so $g$ is also an onto function.

Consequently, $g$ is a one-to-one correspondence and $B \sim A$.
    So at least in the case of $A = \mathbf{Z}^+$ and $B = 2\mathbf{Z}^+$ we find that $A \sim B$ and $B \sim A$ (even though
$B \subset A$). But, in reality, what has happened in this one situation holds true in general. For the function
$g$ just defined is actually the function $f^{-1}$ for $f$ in Example A3.1. And we learned in Theorem 5.8
that a function is invertible if and only if it is both one-to-one and onto. Consequently, whenever there
are two nonempty sets $A$, $B$ with $A \sim B$ then it follows from Theorem 5.8 that $B \sim A$, so we can say
that $A$ and $B$ have the same cardinality and denote this by $|A| = |B|$. (Note: It does not necessarily
follow that $A = B$.)
                       Let us consider another example.


EXAMPLE A3.2       For $B = 2\mathbf{Z}^+ = \{2k \mid k \in \mathbf{Z}^+\}$ and $C = 3\mathbf{Z}^+ = \{3k \mid k \in \mathbf{Z}^+\}$, the function $h: B \to C$ defined by
                   $h(2k) = 3k$ establishes a one-to-one correspondence between $B$ and $C$. Therefore we have $B \sim C$
                   (and $C \sim B$, and $|B| = |C|$). Furthermore, using the function $f: A \to B$ that was defined in Example
                   A3.1, where $A = \mathbf{Z}^+$, by virtue of Theorem 5.5 we know that $h \circ f: A \to C$ is also a one-to-one
                   correspondence. So $A \sim C$ (and $C \sim A$, and $|A| = |C|$).

What we have learned up to this point can be summarized as part of the following result.

THEOREM A3.1   For all nonempty sets $A$, $B$, $C$,
a) $A \sim A$;
b) if $A \sim B$, then $B \sim A$; and
c) if $A \sim B$ and $B \sim C$, then $A \sim C$.

Proof:
a) Given any nonempty set $A$, it follows that $A \sim A$ because the identity function $1_A: A \to A$ is a one-to-one correspondence.
b) If $A \sim B$, then there exists a one-to-one correspondence $f: A \to B$. But then $f^{-1}: B \to A$ is also a one-to-one correspondence and we have $B \sim A$.
c) When $A \sim B$ and $B \sim C$ there exist one-to-one correspondences $f: A \to B$ and $g: B \to C$. Since $g \circ f: A \to C$ is also a one-to-one correspondence, it follows that $A \sim C$.

We shall now use the ideas developed so far in order to define what we shall mean by a finite set
                              and by an infinite set.

Definition A3.3   Any set $A$ is called a finite set if $A = \emptyset$ or if $A \sim \{1, 2, 3, \ldots, n\}$ for some $n \in \mathbf{Z}^+$. When $A = \emptyset$ we say that $A$ has no elements and write $|A| = 0$. When $A \sim \{1, 2, 3, \ldots, n\}$, the set $A$ is said to have $n$ elements and we write $|A| = n$. When a set $A$ is not finite, it is called infinite.

From this definition we see that if $A$ is a nonempty finite set then there is a one-to-one correspondence $g: \{1, 2, 3, \ldots, n\} \to A$ for some $n \in \mathbf{Z}^+$. This function $g$ provides a listing of the elements of $A$ as $g(1), g(2), \ldots, g(n)$: a listing where we can count (or account for) a first element, a second element, and so on, up to an $n$th (last) element.
Also, when $A$ is an infinite set we see that there is no $n \in \mathbf{Z}^+$ for which we can find a one-to-one correspondence $f: A \to \{1, 2, 3, \ldots, n\}$. But if $A$, $B$ are both infinite sets, can we conclude automatically that $|A| = |B|$, that is, that there is a one-to-one correspondence between $A$ and $B$? This is the question we shall answer, in the negative, as we continue our discussion. For now we introduce the following concept.

Definition A3.4   A set $A$ is called countable (or denumerable) if (1) $A$ is finite or (2) $A \sim \mathbf{Z}^+$.

We have seen that $2\mathbf{Z}^+ \sim \mathbf{Z}^+$ and $3\mathbf{Z}^+ \sim \mathbf{Z}^+$, and since $\mathbf{Z}^+ \sim \mathbf{Z}^+$, it follows that the sets $\mathbf{Z}^+$, $2\mathbf{Z}^+$, and $3\mathbf{Z}^+$ are all countable sets. In fact, for all $k \in \mathbf{Z}$, $k \neq 0$, the function $f: \mathbf{Z}^+ \to k\mathbf{Z}^+$, defined by $f(x) = kx$, is a one-to-one correspondence, so $k\mathbf{Z}^+$ is countable (and $|k\mathbf{Z}^+| = |\mathbf{Z}^+|$). Consequently, the set of all negative integers, that is, $(-1)\mathbf{Z}^+$, is a countable set.
Furthermore, whenever $A$ is infinite and $A \sim \mathbf{Z}^+$, we also have $\mathbf{Z}^+ \sim A$, so there is a one-to-one correspondence $f: \mathbf{Z}^+ \to A$ which provides a listing of the elements of $A$, namely $f(1), f(2), f(3), \ldots$, and in this way we can count (but never finish counting) the elements in $A$.

Finally, as noted above, whenever $A \sim \mathbf{Z}^+$ we have $\mathbf{Z}^+ \sim A$. Consequently, a given set $A$ can be shown to be countably infinite (that is, both infinite and countable) by finding either a one-to-one correspondence $f: A \to \mathbf{Z}^+$ or a one-to-one correspondence $g: \mathbf{Z}^+ \to A$.

EXAMPLE A3.3   Since $\mathbf{Z}^+$, $(-1)\mathbf{Z}^+$, and $\{0\}$ are all countable, is $\mathbf{Z} = \mathbf{Z}^+ \cup (-1)\mathbf{Z}^+ \cup \{0\}$ countable?
Consider the function $f: \mathbf{Z}^+ \to \mathbf{Z}$ defined by

$$f(x) = \begin{cases} x/2, & \text{for } x \text{ even} \\ -(x-1)/2, & \text{for } x \text{ odd.} \end{cases}$$

Here we find, for example, that

$$f(4) = 4/2 = 2 \quad \text{and} \quad f(3) = -(3-1)/2 = -2/2 = -1.$$

We claim that $f$ is a one-to-one correspondence where $f(2\mathbf{Z}^+) = \mathbf{Z}^+$ and $f(\mathbf{Z}^+ - 2\mathbf{Z}^+) = (-1)\mathbf{Z}^+ \cup \{0\}$. For suppose that $a, b \in \mathbf{Z}^+$ with $f(a) = f(b)$.
1) If $a$, $b$ are both even, then $f(a) = f(b) \Rightarrow a/2 = b/2 \Rightarrow a = b$.
2) If $a$, $b$ are both odd, then $f(a) = f(b) \Rightarrow -(a-1)/2 = -(b-1)/2 \Rightarrow a - 1 = b - 1 \Rightarrow a = b$.
3) If $a$ is even and $b$ odd, then $f(a) = f(b) \Rightarrow a/2 = -(b-1)/2 \Rightarrow a = -b + 1 \Rightarrow a - 1 = -b$, with $a - 1 \geq 1$ and $-b < 0$. Hence this case cannot occur, nor can the case where $a$ is odd and $b$ even.

Consequently, the function $f$ is at least one-to-one.
Furthermore, for all $y \in \mathbf{Z}$,

1) if $y = 0$, then $f(1) = 0$;
2) if $y > 0$, then $2y \in \mathbf{Z}^+$ and $f(2y) = 2y/2 = y$; and
3) if $y < 0$, then $-2y + 1 \in \mathbf{Z}^+$ and $f(-2y + 1) = -[(-2y + 1) - 1]/2 = -(-2y)/2 = y$.
So $f$ is also an onto function and $f: \mathbf{Z}^+ \to \mathbf{Z}$ is a one-to-one correspondence. Hence $\mathbf{Z}$ is countable.
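
The correspondence in Example A3.3 can be checked on a finite range. The following is a minimal Python sketch, not part of the original text; the names `f` and `f_inverse` are illustrative, and the inverse is simply read off from the three cases of the onto argument.

```python
def f(x: int) -> int:
    """Example A3.3: even x goes to x/2, odd x goes to -(x - 1)/2."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    if x % 2 == 0:
        return x // 2
    return -((x - 1) // 2)


def f_inverse(y: int) -> int:
    """Preimage of y, taken from the onto argument: 2y for y > 0, else -2y + 1."""
    if y > 0:
        return 2 * y
    return -2 * y + 1


# The listing f(1), f(2), f(3), ... begins 0, 1, -1, 2, -2, 3, -3.
first_values = [f(x) for x in range(1, 8)]
```

A finite check like this cannot replace the proof; it only confirms that the two formulas undo each other on the values tried.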

Although all of our examples of countably infinite sets have been subsets of Z, other countably
                   infinite sets are possible.

EXAMPLE A3.4   Let $A = \{1, 1/2, 1/3, 1/4, \ldots\} = \{1/n \mid n \in \mathbf{Z}^+\}$. The function $f: \mathbf{Z}^+ \to A$ defined by $f(n) = 1/n$ establishes a one-to-one correspondence between $\mathbf{Z}^+$ and $A$. Hence $|\mathbf{Z}^+| = |A|$ and $A$ is countable.

In order to take our development on countable sets one step further we now introduce the following
                   definition.

Definition A3.5   For $n \in \mathbf{Z}^+$, a finite sequence of $n$ terms is a function $f$ whose domain is $\{1, 2, 3, \ldots, n\}$. Such a sequence is usually written as an ordered set $\{x_1, x_2, x_3, \ldots, x_n\}$, where $x_i = f(i)$ for all $1 \leq i \leq n$.
An infinite sequence is a function $g$ having $\mathbf{Z}^+$ as its domain. This type of sequence is generally denoted by the ordered set $\{x_i\}_{i \in \mathbf{Z}^+}$ or $\{x_1, x_2, x_3, \ldots\}$, where $x_i = g(i)$ for all $i \in \mathbf{Z}^+$.

EXAMPLE A3.5
a) The set $\{1, 1/2, 1/4, 1/8, 1/16\}$ can be thought of as a finite sequence, given by the function $f: A \to \mathbf{Q}$ where $A = \{1, 2, 3, 4, 5\}$ and $f(n) = 2^{-n+1}$.
b) The set $A$ in Example A3.4 can also be expressed as $\{1/n\}_{n \in \mathbf{Z}^+}$, an infinite sequence given by the function $g: \mathbf{Z}^+ \to \mathbf{Q}^+$, where $g(n) = 1/n$ for each $n \in \mathbf{Z}^+$.
c) The terms in a sequence need not be distinct. For instance, let $f: \mathbf{Z}^+ \to \mathbf{Z}$, where $x_n = f(n) = (-1)^{n+1}$, for each positive integer $n$. Then $\{x_n\}_{n \in \mathbf{Z}^+} = \{x_1, x_2, x_3, x_4, x_5, \ldots\} = \{1, -1, 1, -1, 1, \ldots\}$, but the range of $f$ is only the two-element set $\{1, -1\}$.

Our next result ties together the concepts introduced in Definitions A3.4 and A3.5.

THEOREM A3.2   If $A$ is a nonempty countable set, then $A$ can be written as a sequence of distinct elements.
Proof: There are two cases to consider.
1) If $A$ is finite, then $A \sim \{1, 2, 3, \ldots, n\}$ (and $\{1, 2, 3, \ldots, n\} \sim A$) for some $n \in \mathbf{Z}^+$. Hence there is a one-to-one correspondence $f: \{1, 2, 3, \ldots, n\} \to A$. Define $a_i = f(i)$ for each $1 \leq i \leq n$. Then, since $f$ is one-to-one and onto, $\{a_1, a_2, a_3, \ldots, a_n\}$ is a sequence of the $n$ distinct elements of $A$.
2) For $A$ infinite there is a one-to-one correspondence $g: \mathbf{Z}^+ \to A$. Define $a_i = g(i)$ for all $i \in \mathbf{Z}^+$. Since $g$ is one-to-one, the elements of the infinite sequence $\{a_1, a_2, a_3, \ldots\}$ are distinct; $\{a_1, a_2, a_3, \ldots\} = A$ because $g$ is onto.

Before moving forward let us retrace some of our steps and recall that $\mathbf{Z}^+$ is countable, as are the subsets $2\mathbf{Z}^+$ and $3\mathbf{Z}^+$ (of $\mathbf{Z}^+$). This suggests that perhaps every subset of a countable set is itself countable. To deal with this possibility we introduce the next two ideas.

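Definition A3.6
1) The infinite sequence $\{a_1, a_2, a_3, \ldots\} = \{a_i\}_{i \in \mathbf{Z}^+}$ is a subsequence of $\mathbf{Z}^+ = \{1, 2, 3, \ldots\}$ if for all $i \in \mathbf{Z}^+$, $a_i \in \mathbf{Z}^+$ and $a_i < a_{i+1}$.
2) Let $\{x_n\}_{n \in \mathbf{Z}^+}$ and $\{y_n\}_{n \in \mathbf{Z}^+}$ be two infinite sequences. We say that $\{y_n\}_{n \in \mathbf{Z}^+}$ is a subsequence of $\{x_n\}_{n \in \mathbf{Z}^+}$ if there exists a subsequence $\{a_i\}_{i \in \mathbf{Z}^+}$ of $\mathbf{Z}^+$ where for each $k \in \mathbf{Z}^+$ we have $y_k = x_{a_k}$.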

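EXAMPLE A3.6
a) $\{1, 3, 5, 7, \ldots\}$ is a subsequence of $\mathbf{Z}^+$, as is $\{1, 2, 4, 7, 11, 16, \ldots\}$. The first subsequence can be given by the function $f: \mathbf{Z}^+ \to \mathbf{Z}^+$ where $a_n = f(n) = 2n - 1$. The second subsequence can be generated recursively by

1) $c_1 = h(1) = 1$; and
2) $c_{n+1} = h(n+1) = h(n) + n = c_n + n$, for $n \geq 1$.

b) Let $\{x_n\}_{n \in \mathbf{Z}^+}$ and $\{y_n\}_{n \in \mathbf{Z}^+}$ be two sequences where for each $n \in \mathbf{Z}^+$, $x_n = f(n) = (-1)^n + (1/n)$ and $y_n = g(n) = 1 + (1/(2n))$. So $\{x_n\}_{n \in \mathbf{Z}^+} = \{0, 3/2, -2/3, 5/4, -4/5, 7/6, -6/7, 9/8, \ldots\}$ and $\{y_n\}_{n \in \mathbf{Z}^+} = \{3/2, 5/4, 7/6, 9/8, \ldots\}$, and $y_n = x_{2n}$ for all $n \in \mathbf{Z}^+$. For the subsequence $\{a_k\}_{k \in \mathbf{Z}^+}$ (of $\mathbf{Z}^+$) where $a_k = 2k$ for each $k \in \mathbf{Z}^+$, we find that $y_n = x_{2n} = x_{a_n}$ for each $n \in \mathbf{Z}^+$, and this shows us that $\{y_n\}_{n \in \mathbf{Z}^+}$ is a subsequence of $\{x_n\}_{n \in \mathbf{Z}^+}$.
c) For $n \in \mathbf{Z}^+$ let $x_n = 1/n$ and let $y_n = 1/(3n)$. Then $\{x_n\}_{n \in \mathbf{Z}^+} = \{1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, \ldots\}$ and $\{y_n\}_{n \in \mathbf{Z}^+} = \{1/3, 1/6, 1/9, \ldots\}$. Now consider the subsequence $\{a_k\}_{k \in \mathbf{Z}^+}$ (of $\mathbf{Z}^+$) where $a_k = 3k$ for each $k \in \mathbf{Z}^+$. Then for all $n \in \mathbf{Z}^+$, $y_n = 1/(3n) = x_{3n} = x_{a_n}$, so $\{y_n\}_{n \in \mathbf{Z}^+}$ is a subsequence of $\{x_n\}_{n \in \mathbf{Z}^+}$.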

And now we turn to the following result for countable sets and their subsets.

THEOREM A3.3   If $S$ is an infinite countable set and $A \subseteq S$, then $A$ is countable.
Proof: If $A$ is finite, then from Definition A3.4 we know that $A$ is countable. So assume from this point on that $A$ is infinite. Since $S$ is countable, we can invoke Theorem A3.2 in order to list the elements of $S$ as an infinite sequence of distinct terms, so we write $S = \{s_1, s_2, s_3, \ldots\}$. Now define a subsequence $\{a_n\}_{n \in \mathbf{Z}^+}$ of $\mathbf{Z}^+$ as follows:

$a_1 = \min\{n \mid n \in \mathbf{Z}^+ \text{ and } s_n \in A\}$
$a_2 = \min\{n \mid n \in \mathbf{Z}^+,\ n > a_1 \text{ and } s_n \in A\}$
$a_3 = \min\{n \mid n \in \mathbf{Z}^+,\ n > a_2 \text{ and } s_n \in A\}$

In general, once $a_1, a_2, a_3, \ldots, a_k$ have been selected, we define $a_{k+1} = \min\{n \mid n \in \mathbf{Z}^+,\ n > a_k \text{ and } s_n \in A\}$. Consider the "function" $F: \mathbf{Z}^+ \to A$ given by $F(n) = s_{a_n}$. If $m, n \in \mathbf{Z}^+$, we find that $m = n \Rightarrow a_m = a_n \Rightarrow s_{a_m} = s_{a_n} \Rightarrow F(m) = F(n)$, so there is no doubt that $F$ is a function. To complete the proof that $A$ is countable we need to show that $F$ is a one-to-one correspondence.
Suppose that $m, n \in \mathbf{Z}^+$ with $F(m) = F(n)$. Then $F(m) = F(n) \Rightarrow s_{a_m} = s_{a_n} \Rightarrow a_m = a_n$ because the elements of the sequence $S = \{s_1, s_2, s_3, \ldots\}$ are distinct. Furthermore, $a_m = a_n \Rightarrow m = n$ because the elements in the subsequence $\{a_n\}_{n \in \mathbf{Z}^+}$ of $\mathbf{Z}^+$ are also distinct. Consequently, this function $F$ is one-to-one.
Now let $b \in A$. Since $A \subseteq S = \{s_1, s_2, s_3, \ldots\}$ we can write $b = s_m$ for some $m \in \mathbf{Z}^+$. If $m = a_1$, then $F(1) = s_{a_1} = s_m = b$. If $m \neq a_1$, then since $a_1 < a_2 < a_3 < \cdots$, there is a smallest $r \in \mathbf{Z}^+$ such that $a_{r-1} < m \leq a_r$. From the definition of the subsequence $\{a_n\}_{n \in \mathbf{Z}^+}$ we know that $a_r = \min\{t \mid t \in \mathbf{Z}^+,\ t > a_{r-1} \text{ and } s_t \in A\}$, and since $m > a_{r-1}$ and $s_m \in A$, we have $a_r \leq m$. Now $a_r \leq m$ and $m \leq a_r \Rightarrow a_r = m$, and so $F(r) = s_{a_r} = s_m = b$. Consequently, the function $F$ is also onto.
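
The index-selection step in the proof of Theorem A3.3 can be illustrated as follows. This is a minimal Python sketch under stated assumptions, not the book's own construction: the listing `s` of $\mathbf{Z}$ (borrowed from Example A3.3), the membership test `is_multiple_of_3`, and the helper names are all hypothetical choices for illustration. The generator never terminates on its own, and it would search forever for a next index if $A$ were finite, so it is only meaningful for infinite $A$.

```python
from itertools import count, islice
from typing import Callable, Iterator


def subsequence_indices(s: Callable[[int], int],
                        in_A: Callable[[int], bool]) -> Iterator[int]:
    """Yield a_1 < a_2 < ..., where each a_k is the least unused n with s(n) in A."""
    for n in count(1):
        if in_A(s(n)):
            yield n


def F(s: Callable[[int], int], in_A: Callable[[int], bool], n: int) -> int:
    """F(n) = s_{a_n}: the n-th element of A in the order inherited from S."""
    a_n = next(islice(subsequence_indices(s, in_A), n - 1, None))
    return s(a_n)


def s(n: int) -> int:
    """Illustrative listing of S = Z as 0, 1, -1, 2, -2, ... (Example A3.3)."""
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)


def is_multiple_of_3(x: int) -> bool:
    """Illustrative infinite subset A of S: the integer multiples of 3."""
    return x % 3 == 0


first_indices = list(islice(subsequence_indices(s, is_multiple_of_3), 5))
first_elements = [F(s, is_multiple_of_3, n) for n in range(1, 6)]
```

Scanning the indices in increasing order and keeping those whose term lies in $A$ produces exactly the successive minima $a_1, a_2, a_3, \ldots$ defined in the proof.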

From Theorem A3.3 we deduce that a given infinite set $T$ is countable if and only if $T$ has the same cardinality as a subset of $\mathbf{Z}^+$. So if there is a one-to-one function $f: T \to \mathbf{Z}^+$ (not necessarily a one-to-one correspondence), then this is enough to tell us that $T$ is countable, for $T \sim f(T)$ (or $|T| = |f(T)|$) and $f(T)$ is countable.

Up to this point every infinite set we have examined has turned out to be countable. Could it be
               that all infinite sets are countable— and that for all infinite sets A, B we have |A| = |B|? The next
               result settles this issue.

THEOREM A3.4   The set $(0, 1] = \{x \mid x \in \mathbf{R} \text{ and } 0 < x \leq 1\}$ is not a countable set.
Proof: If $(0, 1]$ were countable, then (by Theorem A3.2) we could write this set as a sequence of distinct terms: $(0, 1] = \{r_1, r_2, r_3, \ldots\}$. To avoid two representations we agree to write real numbers in $(0, 1]$ such as 0.5 as 0.499..., so no element in $(0, 1]$ is represented by a decimal expansion that terminates. Writing such decimal expansions for $r_1, r_2, r_3, \ldots$, we get

$r_1 = 0.a_{11}a_{12}a_{13}a_{14}\cdots$
$r_2 = 0.a_{21}a_{22}a_{23}a_{24}\cdots$
$r_3 = 0.a_{31}a_{32}a_{33}a_{34}\cdots$
$\vdots$
$r_n = 0.a_{n1}a_{n2}a_{n3}a_{n4}\cdots$
$\vdots$

where $a_{ij} \in \{0, 1, 2, 3, \ldots, 8, 9\}$ for all $i, j \in \mathbf{Z}^+$.
Now consider the real number $r = 0.b_1b_2b_3\cdots$, where for each $k \in \mathbf{Z}^+$,

$$b_k = \begin{cases} 3, & \text{if } a_{kk} \neq 3 \\ 7, & \text{if } a_{kk} = 3. \end{cases}$$

Then $r \in (0, 1]$, but for every $k \in \mathbf{Z}^+$ we have $r \neq r_k$, so $r \notin \{r_1, r_2, r_3, r_4, \ldots\}$. This contradicts our assumption that $(0, 1] = \{r_1, r_2, r_3, r_4, \ldots\}$.
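
The digit rule in the proof of Theorem A3.4 can be tried on a finite table of digits. The sketch below is only an illustration in Python: the function name `diagonal_digits` and the sample rows are made up, and a finite table can show only that the constructed number differs from row $k$ in its $k$th digit, not the full uncountability argument.

```python
def diagonal_digits(rows: list[str]) -> str:
    """Build b_1 b_2 ... b_n with b_k = 7 if the k-th digit of row k is 3, else 3.

    rows[k] holds the digits after the decimal point of r_{k+1};
    each rows[k] must contain at least k + 1 digits.
    """
    return "".join("7" if row[k] == "3" else "3" for k, row in enumerate(rows))


# Hypothetical listing: r_1 = 0.4999..., r_2 = 0.3333..., r_3 = 0.1415..., r_4 = 0.7373...
rows = ["4999", "3333", "1415", "7373"]
r_digits = diagonal_digits(rows)

# By construction the new expansion disagrees with row k at position k.
differs_on_diagonal = all(r_digits[k] != rows[k][k] for k in range(len(rows)))
```

Because every $b_k$ is 3 or 7, the constructed expansion never terminates and never ends in repeating 9s, which is why the two-representations issue mentioned in the proof does not arise for $r$.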

The technique employed in this proof (of Theorem A3.4) is generally known as Cantor's Diagonal
                         Construction in honor of the (Russian-born) German mathematician Georg Cantor (1845-1918), who
                         introduced the idea in December of 1873.

When a set is not countable it is termed uncountable. So $(0, 1]$ is uncountable. When a set $A$ is uncountable, $\mathbf{Z}^+$ and $A$ do not have the same size, or cardinality, so $\mathbf{Z}^+ \not\sim A$ and the cardinality of $A$ is greater than that of $\mathbf{Z}^+$, that is, $|A| > |\mathbf{Z}^+|$, even though both $A$ and $\mathbf{Z}^+$ are infinite sets.
The following corollary provides another example of an uncountable set.

COROLLARY A3.1   The set $\mathbf{R}$ (of all real numbers) is an uncountable set.
Proof: If $\mathbf{R}$ were countable, then by Theorem A3.3 the subset $(0, 1]$ of $\mathbf{R}$ would be countable, contradicting Theorem A3.4.

Before continuing with anything new let us say a few more words about this notion of an uncountable set.

1) First and foremost we realize that Corollary A3.1 is a special case of the general result: For all sets $A$, $B$, if $A$ is uncountable and $A \subseteq B$, then $B$ is uncountable.
2) Unlike the result in Theorem A3.3, we do not find in general that nonempty subsets of uncountable sets are uncountable. We may even have an infinite subset $A$ of an uncountable set $B$ where $A$ is countable; for instance, let $A = \mathbf{Z}$ and $B = \mathbf{R}$.
3) Following Theorem A3.3 we remarked that whenever we had a set $A$ and could find a one-to-one function $f: A \to \mathbf{Z}^+$, then the set $A$ had to be countable. We cannot reverse the roles of $A$ and $\mathbf{Z}^+$ for the function $f$. If there is a one-to-one function $g: \mathbf{Z}^+ \to A$, the set $A$ could be uncountable. Just consider $g: \mathbf{Z}^+ \to \mathbf{R}$ where $g(x) = x$ for each $x \in \mathbf{Z}^+$.
4) Consider the points in the Cartesian plane on the unit circle $x^2 + (y - 1)^2 = 1$. How large is this set $S = \{(x, y) \mid x, y \in \mathbf{R} \text{ and } x^2 + (y - 1)^2 = 1\}$; that is, is $S$ countable or uncountable?
In Fig. A3.1 we have a unit circle (in the plane) centered at $C(0, 1)$. This circle is tangent to the real number line (or $x$-axis) at the point where $x = 0$. The point $P$, on the circumference, has coordinates $(0, 2)$.

Figure A.3.1

Let $(x, y)$ be any point on the circumference of the unit circle, other than the point $P(0, 2)$. For example, point $Q$ is one such point, and $R$ is another. Draw the line determined by $P$ and $Q$. This line intersects the $x$-axis at $Q'$. Likewise the line determined by $P$ and $R$ intersects the $x$-axis at $R'$. Conversely, consider the points on the $x$-axis, except for the point where $x = 0$. Two such points are $T'$ and $U'$. The line through $P$ and $T'$ intersects the unit circle at $T$. Point $U$ is the point of intersection (on $S$) determined by the line through $P$ and $U'$. Finally, correspond $P$ with $P'$ (on the $x$-axis where $x = 0$). In this way we obtain a one-to-one correspondence between the elements of $S$ and the set $\mathbf{R}$. Hence $|S| = |\mathbf{R}|$, so $S$ is another uncountable set.

Summarizing what we now know about |Z| and |R| — namely, that |Z| < |R| —we now want
                      to determine whether |Q| = |Z| or |Q| = |R| or, perhaps, |Z| < |Q| < |R|. In accomplishing this we
                      shall prove something more general; to do so we start with the following.

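THEOREM A3.5   The set $\mathbf{Z}^+ \times \mathbf{Z}^+$ is countable.
Proof: Define the function $f: \mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}^+$ by $f(a, b) = 2^a 3^b$. The result will follow if we can show that $f$ is one-to-one. For $(m, n), (u, v) \in \mathbf{Z}^+ \times \mathbf{Z}^+$, $f(m, n) = f(u, v) \Rightarrow 2^m 3^n = 2^u 3^v \Rightarrow m = u,\ n = v$, by the Fundamental Theorem of Arithmetic. Consequently, $f$ is one-to-one and $\mathbf{Z}^+ \times \mathbf{Z}^+$ is countable.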
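
The one-to-one function used in Theorem A3.5 can be explored directly. The following Python sketch is illustrative only; `encode` and `decode` are hypothetical names, and `decode` recovers the exponents by repeated division, which is one simple way to see the uniqueness guaranteed by the Fundamental Theorem of Arithmetic.

```python
def encode(a: int, b: int) -> int:
    """f(a, b) = 2^a * 3^b for positive integers a, b."""
    if a < 1 or b < 1:
        raise ValueError("a and b must be positive integers")
    return 2**a * 3**b


def decode(n: int) -> tuple[int, int]:
    """Recover (a, b) from n = 2^a * 3^b by stripping factors of 2, then of 3."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    a = b = 0
    while n % 2 == 0:
        n //= 2
        a += 1
    while n % 3 == 0:
        n //= 3
        b += 1
    if n != 1 or a == 0 or b == 0:
        raise ValueError("n is not in the range of f")
    return a, b


# Distinct pairs give distinct codes on a finite grid.
codes = {encode(a, b) for a in range(1, 21) for b in range(1, 21)}
```

Note that $f$ is not onto (5, for example, is not of the form $2^a 3^b$), which is why the theorem needs only that $f$ is one-to-one together with Theorem A3.3.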

Before any statements can be made about the size, or cardinality, of $\mathbf{Q}$, we first need to consider the subset $\mathbf{Q} \cap (0, 1] = \{s \mid s \in \mathbf{Q} \text{ and } 0 < s \leq 1\}$ of $\mathbf{Q}$.

THEOREM A3.6   The set $\mathbf{Q} \cap (0, 1]$ is countable.
Proof: First we must agree that each $s$ in $\mathbf{Q} \cap (0, 1]$ will be written in the (unique) form $p/q$, where $p, q \in \mathbf{Z}^+$ have no common divisor other than 1. Now define $f: \mathbf{Q} \cap (0, 1] \to \mathbf{Z}^+ \times \mathbf{Z}^+$ by $f(p/q) = (p, q)$, and let $K$ = range $f$. For $p/q, u/v \in \mathbf{Q} \cap (0, 1]$, we find that $f(p/q) = f(u/v) \Rightarrow (p, q) = (u, v) \Rightarrow p = u$ and $q = v \Rightarrow p/q = u/v$, so $f$ is a one-to-one function. Consequently, $\mathbf{Q} \cap (0, 1] \sim K$, a subset of the countable set $\mathbf{Z}^+ \times \mathbf{Z}^+$. From Theorem A3.3 it now follows that the set $\mathbf{Q} \cap (0, 1]$ is countable.

As we continue in our efforts to determine |Q| we shall need the next two definitions and theorem.

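Definition A3.7   Let $\mathcal{F}$ be any collection of sets from a universe $\mathcal{U}$. The union of all the sets in $\mathcal{F}$, written $\bigcup_{A \in \mathcal{F}} A$, is defined as $\{x \mid x \in \mathcal{U} \text{ and } x \in A, \text{ for some } A \in \mathcal{F}\}$.
When $\mathcal{F}$ is a countable collection, that is, $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$, we may write $\bigcup_{A \in \mathcal{F}} A = \bigcup_{n=1}^{\infty} A_n = \bigcup_{n \in \mathbf{Z}^+} A_n$.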

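EXAMPLE A3.7   In each of the following the universe $\mathcal{U}$ is $\mathbf{R}$.
a) For each $n \in \mathbf{Z}^+$ let $A_n = [n - 1, n)$. Then, for example, $A_1 = [0, 1)$, $A_2 = [1, 2)$, and $A_3 = [2, 3)$. For $\mathcal{F} = \{A_1, A_2, A_3, \ldots\} = \{A_i \mid i \in \mathbf{Z}^+\}$ we find that $\bigcup_{A \in \mathcal{F}} A = \bigcup_{n=1}^{\infty} A_n = \bigcup_{n \in \mathbf{Z}^+} A_n = [0, +\infty)$.
b) Given any $q \in \mathbf{Q}^+$ let $A_q = (q - 1/2, q + 1/2)$. Here, for instance, $A_{1/2} = (0, 1)$, $A_4 = (7/2, 9/2)$, and $A_{11/3} = (19/6, 25/6)$. If $\mathcal{F} = \{A_q \mid q \in \mathbf{Q}^+\}$, then $\bigcup_{A \in \mathcal{F}} A = \bigcup_{q \in \mathbf{Q}^+} A_q = (-1/2, +\infty)$.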

Definition A3.8   Let $\mathcal{F}$ be a collection of sets each taken from a universe $\mathcal{U}$. The collection $\mathcal{F}$ is called a disjoint collection if for all $A$, $B$ in $\mathcal{F}$, when $A \neq B$ then $A \cap B = \emptyset$.

EXAMPLE A3.8   When we reexamine the two collections in Example A3.7 we find that the collection in part (a) is the only disjoint collection.

The concepts of a countable set and a disjoint collection of sets now come together in our next
                         result.

THEOREM A3.7   Let $\mathcal{F}$ be a countable disjoint collection of sets, each of which is countable. Then $\bigcup_{A \in \mathcal{F}} A$ is also a countable set.
Proof: Since $\mathcal{F}$ is a countable disjoint collection, we may write $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$, where $A_i \cap A_j = \emptyset$ for all $i, j \in \mathbf{Z}^+$, when $i \neq j$. Furthermore, for each $n \in \mathbf{Z}^+$, $A_n$ is countable and can be expressed as $\{a_{n1}, a_{n2}, a_{n3}, \ldots\}$, a sequence of distinct terms. In order to show that $\bigcup_{A \in \mathcal{F}} A$ is countable, consider each $x \in \bigcup_{A \in \mathcal{F}} A$.
Since $\bigcup_{A \in \mathcal{F}} A = \bigcup_{n=1}^{\infty} A_n$, we have $x \in A_n$ for some (fixed) $n \in \mathbf{Z}^+$, and this $n$ is unique because $\mathcal{F}$ is a disjoint collection. In addition, $x \in A_n \Rightarrow x = a_{nk}$ for some $k \in \mathbf{Z}^+$ (where $k$ is fixed and unique). Now define $f: \bigcup_{A \in \mathcal{F}} A \to \mathbf{Z}^+ \times \mathbf{Z}^+$ by $f(x) = f(a_{nk}) = (n, k)$. From Theorem A3.5 we know that $\mathbf{Z}^+ \times \mathbf{Z}^+$ is countable, so the range of $f$ is countable. Consequently, the result will be established once we show that $f$ is one-to-one. This readily follows, for if $x = a_{nk}$, $y = a_{pq} \in \bigcup_{A \in \mathcal{F}} A$ with $f(x) = f(y)$, then $f(a_{nk}) = f(a_{pq}) \Rightarrow (n, k) = (p, q) \Rightarrow n = p,\ k = q \Rightarrow a_{nk} = a_{pq} \Rightarrow x = y$.

Note that the proof of Theorem A3.7 is valid if $\mathcal{F}$ is finite (and $\infty$ is replaced by $|\mathcal{F}|$) or if one or more of the sets $A_i$, $i \in \mathbf{Z}^+$, is finite.

As a result of Theorem A3.7 we can now deal with the cardinality of $\mathbf{Q}$.

THEOREM A3.8   The set $\mathbf{Q}$ (of all rational numbers) is countable.
Proof: We start by recalling that $A_0 = \mathbf{Q} \cap (0, 1]$ is countable, from Theorem A3.6. Now for each nonzero integer $n$, let $A_n = \mathbf{Q} \cap (n, n + 1]$ and define $f_n: A_n \to A_0$ by $f_n(q) = q - n$. Then $f_n(q_1) = f_n(q_2) \Rightarrow q_1 - n = q_2 - n \Rightarrow q_1 = q_2$, so $f_n$ is one-to-one. Consequently, $A_n \sim f_n(A_n) \subseteq A_0$, and by Theorem A3.3 we have $A_n$ countable. In addition, for all $m, n \in \mathbf{Z}$, $m \neq n \Rightarrow A_m \cap A_n = \emptyset$. From Example A3.3 we know that $\mathbf{Z}$ is countable, so $\mathcal{F} = \{A_0, A_1, A_{-1}, A_2, A_{-2}, \ldots\}$ is a countable disjoint collection of countable sets. Therefore, by virtue of Theorem A3.7, it follows that $\bigcup_{A \in \mathcal{F}} A = \bigcup_{n \in \mathbf{Z}} A_n = \mathbf{Q}$ is countable.
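
The decomposition of $\mathbf{Q}$ into the blocks $A_n = \mathbf{Q} \cap (n, n + 1]$ and the shift $f_n(q) = q - n$ can be sketched with exact rational arithmetic. The code below is a hedged illustration in Python using the standard `fractions` module; the function names `block_index` and `shift_to_unit` are assumptions made for this example.

```python
import math
from fractions import Fraction


def block_index(q: Fraction) -> int:
    """Return the unique integer n with n < q <= n + 1, so that q lies in A_n."""
    return math.ceil(q) - 1


def shift_to_unit(q: Fraction) -> Fraction:
    """Apply f_n(q) = q - n for n = block_index(q); the result lies in (0, 1]."""
    return q - block_index(q)


samples = [Fraction(7, 2), Fraction(3), Fraction(0), Fraction(-5, 2), Fraction(1, 3)]
blocks = [block_index(q) for q in samples]
shifted = [shift_to_unit(q) for q in samples]
```

Each rational falls in exactly one block, which reflects the disjointness $A_m \cap A_n = \emptyset$ for $m \neq n$ used in the proof.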

So now we know that $\mathbf{Z}^+$, $\mathbf{Z}$, and $\mathbf{Q}$ are all infinite and $\mathbf{Z}^+ \sim \mathbf{Z} \sim \mathbf{Q}$, while $\mathbf{R}$ is infinite and $\mathbf{R} \not\sim \mathbf{Z}^+$. Recall that any infinite set $A$, where $A \sim \mathbf{Z}^+$, is said to be countably infinite, and we shall now denote the cardinality of such a set $A$ by writing $|A| = \aleph_0$, using the Hebrew letter aleph, with the subscripted 0, to designate the first level of infinity. The cardinality of $\mathbf{R}$ is greater than $\aleph_0$ and is usually denoted by $c$, for the continuum.

In our next theorem we shall improve upon the result in Theorem A3.7. The following lemma
                         helps with the improvement.

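LEMMA A3.1   Let $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$ be any countable collection of sets (from a universe $\mathcal{U}$). Let $\mathcal{G} = \{B_1, B_2, B_3, \ldots\}$ be the countable collection of sets where $B_1 = A_1$ and $B_n = A_n - \bigcup_{k=1}^{n-1} A_k$ for $n \geq 2$. Then $\mathcal{G}$ is a countable disjoint collection and $\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$.
Proof: First we establish that the countable collection $\mathcal{G}$ is disjoint. To do so we must show that for all $i, j \in \mathbf{Z}^+$, where $i \neq j$, we have $B_i \cap B_j = \emptyset$. If not, let $i < j$ with $B_i \cap B_j \neq \emptyset$. For $x \in B_i \cap B_j$ we find that $x \in B_j = A_j - \bigcup_{k=1}^{j-1} A_k \Rightarrow x \notin A_i$, because $1 \leq i \leq j - 1$. But it also happens that $x \in B_i = A_i - \bigcup_{k=1}^{i-1} A_k \Rightarrow x \in A_i$ because $A_i - \bigcup_{k=1}^{i-1} A_k \subseteq A_i$. (Note: $\bigcup_{k=1}^{i-1} A_k = \emptyset$ when $i = 1$.) The contradiction, $x \notin A_i$ and $x \in A_i$, tells us that $B_i \cap B_j = \emptyset$ for all $i, j \in \mathbf{Z}^+$, where $i \neq j$. So $\mathcal{G}$ is a disjoint countable collection of sets.
For the second part, namely that $\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$, start with $x \in \bigcup_{k=1}^{\infty} A_k$. Then $x \in A_n$ for some $n \in \mathbf{Z}^+$, and let $m$ denote the smallest such $n$. If $m = 1$, then $x \in A_1 = B_1 \subseteq \bigcup_{k=1}^{\infty} B_k$. If $m > 1$, then $x \notin A_j$ for all $1 \leq j \leq m - 1$, and so $x \in A_m - \bigcup_{k=1}^{m-1} A_k = B_m \subseteq \bigcup_{k=1}^{\infty} B_k$. In either case $x \in \bigcup_{k=1}^{\infty} B_k$ and $\bigcup_{k=1}^{\infty} A_k \subseteq \bigcup_{k=1}^{\infty} B_k$. For the opposite inclusion we find that $y \in \bigcup_{k=1}^{\infty} B_k \Rightarrow y \in B_n$, for some (unique) $n \in \mathbf{Z}^+ \Rightarrow y \in A_n$, for this same $n \in \mathbf{Z}^+$, because $B_1 = A_1$ and $B_i = A_i - \bigcup_{k=1}^{i-1} A_k \subseteq A_i$, for all $i \geq 2$. Then $y \in A_n \Rightarrow y \in \bigcup_{k=1}^{\infty} A_k$, so $\bigcup_{k=1}^{\infty} B_k \subseteq \bigcup_{k=1}^{\infty} A_k$. Consequently, $\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$.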
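
The construction of the sets $B_n$ in Lemma A3.1 is easy to try on a finite collection. The following Python sketch is an illustration only; the function name `disjointify` and the sample sets are assumptions, and the running union `seen` plays the role of $\bigcup_{k=1}^{n-1} A_k$.

```python
def disjointify(sets: list[set[int]]) -> list[set[int]]:
    """Return [B_1, B_2, ...] with B_1 = A_1 and B_n = A_n - (A_1 ∪ ... ∪ A_{n-1})."""
    seen: set[int] = set()        # union of the sets processed so far
    result: list[set[int]] = []
    for A in sets:
        result.append(A - seen)   # keep only the elements not met earlier
        seen |= A
    return result


A_sets = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5}]
B_sets = disjointify(A_sets)

same_union = set().union(*A_sets) == set().union(*B_sets)
pairwise_disjoint = all(
    B_sets[i].isdisjoint(B_sets[j])
    for i in range(len(B_sets))
    for j in range(i + 1, len(B_sets))
)
```

In this example the last set $B_4$ comes out empty, which shows that some of the $B_n$ may be empty even when every $A_n$ is nonempty.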

As in the case of Theorem A3.7, the proof of Lemma A3.1 is valid if $\mathcal{F}$ is finite (and $\infty$ is then replaced by $|\mathcal{F}|$).
From Lemma A3.1 we learn that the hypothesis of Theorem A3.7 can be weakened: the countable collection $\mathcal{F}$ need not be disjoint. This is formally established as follows.

THEOREM A3.9   The union of any countable collection of countable sets is countable.
Proof: If $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$ is a countable collection of countable sets, construct the countable collection $\mathcal{G} = \{B_1, B_2, B_3, \ldots\}$ as in Lemma A3.1. For each $k \in \mathbf{Z}^+$, $B_k \subseteq A_k$, so by Theorem A3.3 each $B_k$ is countable. Lemma A3.1 tells us that $\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$, and from Theorem A3.7 we know that $\bigcup_{k=1}^{\infty} B_k$ is countable. Hence $\bigcup_{A \in \mathcal{F}} A = \bigcup_{k=1}^{\infty} A_k$ is countable.

Once again, should $\mathcal{F}$ be finite, the proof of Theorem A3.9 remains valid (upon replacing each occurrence of $\infty$ by $|\mathcal{F}|$).

Following Theorem A3.8 we mentioned that $|\mathbf{Z}^+| = \aleph_0$ and $|\mathbf{R}| = c$, where $\aleph_0 < c$. Although there is still a great deal more that can be said about infinite sets, we shall close this appendix by showing that these are not the only infinite cardinal numbers. In fact, there are infinitely many infinite cardinal numbers.

THEOREM A3.10   If $A$ is any set, then $|A| < |\mathcal{P}(A)|$.
Proof: If $A = \emptyset$, then $|A| = 0$ and $|\mathcal{P}(A)| = |\mathcal{P}(\emptyset)| = |\{\emptyset\}| = 1$, so the result is true in this case. If $A \neq \emptyset$, let $f: A \to \mathcal{P}(A)$ be defined by $f(a) = \{a\}$ for each $a \in A$. The function $f$ is a one-to-one function and it follows that $|A| = |f(A)| \leq |\mathcal{P}(A)|$. To show that $|A| \neq |\mathcal{P}(A)|$ we must prove that no function $g: A \to \mathcal{P}(A)$ can be onto. So let $g: A \to \mathcal{P}(A)$ and consider $B = \{a \mid a \in A \text{ and } a \notin g(a)\}$. Remember that $g(a) \subseteq A$ and that $B \subseteq A$. With $B \in \mathcal{P}(A)$, if $g$ is to be an onto function there must exist $a' \in A$ such that $g(a') = B$. Now do we have $a' \in g(a')$ or $a' \notin g(a')$? Exactly one of these two results must be true.
If $a' \in g(a') = B$, then from the definition of $B$ we have $a' \notin g(a')$, and the contradiction: $a' \in g(a')$ and $a' \notin g(a')$. On the other hand, when $a' \notin g(a')$ then $a' \in B$, but $B = g(a')$. Once again we get the same contradiction.
Therefore, there is no $a' \in A$ with $g(a') = B$, so $g$ cannot be onto, and hence $|A| < |\mathcal{P}(A)|$.
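
For a small finite set the argument of Theorem A3.10 can be verified exhaustively. The Python sketch below is only an illustration under that finiteness assumption; the function names are hypothetical. It enumerates every function $g: A \to \mathcal{P}(A)$ and checks that the set $B = \{a \mid a \in A \text{ and } a \notin g(a)\}$ is never in the range of $g$.

```python
from itertools import combinations, product


def power_set(A: set[int]) -> list[frozenset[int]]:
    """All subsets of A, as frozensets so they can be compared and hashed."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(A: set[int], g: dict[int, frozenset[int]]) -> frozenset[int]:
    """B = {a in A : a not in g(a)}, the set used in the proof."""
    return frozenset(a for a in A if a not in g[a])


def no_map_is_onto(A: set[int]) -> bool:
    """Check every g: A -> P(A) and confirm that B is missed by each one."""
    items = sorted(A)
    subsets = power_set(A)
    for images in product(subsets, repeat=len(items)):
        g = dict(zip(items, images))
        if diagonal_set(A, g) in images:
            return False
    return True


example_g = {1: frozenset({1, 2}), 2: frozenset(), 3: frozenset({1})}
example_B = diagonal_set({1, 2, 3}, example_g)
```

For $A = \{1, 2, 3\}$ this loops over $8^3 = 512$ functions, so the brute-force check is feasible only for very small sets; the proof itself covers every set, finite or infinite.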

As a consequence of Theorem A3.10 we find that there is no largest infinite cardinal number. For if $A$ is any infinite set, then $|A| < |\mathcal{P}(A)| < |\mathcal{P}(\mathcal{P}(A))| < \cdots$. However, there is a smallest infinite cardinal number. As we mentioned earlier, this is $\aleph_0$.

REFERENCES

Since there is still more that can be said about countable and uncountable sets, the interested reader
                                 may want to examine one of the following for further information.
                                    1. Enderton, Herbert B. Elements of Set Theory. New York: Academic Press, 1977.
                                    2. Halmos, Paul R. Naive Set Theory. New York: Van Nostrand, 1960.
                                    3. Henle, James M. An Outline of Set Theory. New York: Springer-Verlag, 1986.

EXERCISES A.3

1. Determine whether each of the following statements is true or false. For parts (d)-(g) provide a counterexample if the statement is false.
   a) The set $\mathbf{Q}^+$ is countable.
   b) The set $\mathbf{R}^+$ is countable.
   c) There is a one-to-one correspondence between the sets $\mathbf{N}$ and $2\mathbf{Z} = \{2k \mid k \in \mathbf{Z}\}$.
   d) If $A$, $B$ are countable sets, then $A \cup B$ is countable.
   e) If $A$, $B$ are uncountable sets, then $A \cap B$ is uncountable.
   f) If $A$, $B$ are countable sets, then $A - B$ is countable.
   g) If $A$, $B$ are uncountable sets, then $A - B$ is uncountable.
2. a) Let $A = \{n^2 \mid n \in \mathbf{Z}^+\}$. Find a one-to-one correspondence between $\mathbf{Z}^+$ and $A$.
   b) Find a one-to-one correspondence between $\mathbf{Z}^+$ and $\{2, 6, 10, 14, \ldots\}$.
3. Let $A$, $B$ be sets with $A$ uncountable. If $A \subseteq B$, prove that $B$ is uncountable.
4. Let $I = \{r \in \mathbf{R} \mid r \text{ is irrational}\} = \mathbf{R} - \mathbf{Q}$. Is $I$ countable or uncountable? Prove your assertion.
5. If $S$, $T$ are infinite and countable, prove that $S \times T$ is countable.
6. Prove that $\mathbf{Z}^+ \times \mathbf{Z}^+ \times \mathbf{Z}^+ = \{(a, b, c) \mid a, b, c \in \mathbf{Z}^+\}$ is countable.
7. Prove that the set of all real solutions of the quadratic equations $ax^2 + bx + c = 0$, where $a, b, c \in \mathbf{Z}$, $a \neq 0$, is a countable set.
8. Determine a one-to-one correspondence between the open interval $(0, 1)$ and the open intervals (a) $(0, 3)$; (b) $(2, 7)$; and (c) $(a, b)$, where $a, b \in \mathbf{R}$ and $a < b$.
                       Solutions

Chapter 1
               Fundamental Principles of Counting

Sections 1.1 and 1.2-p. 11
                                  1. a) 13   b) 40   c) The rule of sum in part (a); the rule of product in part (b)
                                  3. a) 288      b) 24
                                  5. 2 × 2 × 1 × 10 × 10 × 2 = 800 different license plates
                                  7. 2^9     9. a) (14)(12) = 168   b) (14)(12)(6)(18) = 18,144   c) 73,156,608
                                11. a) 12 + 2 = 14      b) 14 × 14 = 196      c) 182
                                13. a) P(8, 8) = 8!      b) 7!    6!      15. 4! = 24
                                17. Class A: (2^7 − 2)(2^24 − 2) = 2,113,928,964
                                       Class B: 2^14(2^16 − 2) = 1,073,709,056
                                       Class C: 2^12(2^8 − 2) = 1,040,384
                                19. a) 7! = 5040      b) (4!)(3!) = 144      c) (5!)(3!) = 720      d) 288
                                21. a) 12!/(3! 2! 2! 2!)      b) 2[11!/(3! 2! 2! 2!)]      c) [7!/(2! 2!)][6!/(3! 2!)]
                                23. 12!/(4! 3! 2! 3!) = 277,200           25. a) n = 10      b) n = 5      c) n = 5
                                27. a) (10!)/(2! 7!) = 360         b) 360
                                    c) Let x, y, and z be any real numbers and let m, n, and p be any nonnegative integers.
                                    The number of paths from (x, y, z) to (x + m, y + n, z + p), as described in part (a), is
                                    (m + n + p)!/(m! n! p!).
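
The formula in answer 27(c) can be checked on small cases. The following Python sketch assumes, since the exercise statement is not reproduced here, that each move of a path increases exactly one coordinate by 1. It counts such paths recursively and compares the count with (m + n + p)!/(m! n! p!). The function names are illustrative.

```python
from functools import lru_cache
from math import factorial

def count_paths(m, n, p):
    """Count unit-step paths from (0, 0, 0) to (m, n, p) by recursion on the last step."""
    @lru_cache(maxsize=None)
    def paths(i, j, k):
        if i < 0 or j < 0 or k < 0:
            return 0
        if i == 0 and j == 0 and k == 0:
            return 1
        # The last step arrived along exactly one of the three axes.
        return paths(i - 1, j, k) + paths(i, j - 1, k) + paths(i, j, k - 1)
    return paths(m, n, p)

def multinomial_paths(m, n, p):
    """Closed form (m + n + p)! / (m! n! p!)."""
    return factorial(m + n + p) // (factorial(m) * factorial(n) * factorial(p))
```

With step counts 2, 7, and 1 both functions return 360, which agrees with the value 10!/(2! 7!) in part (a).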
                                29. a) 576      b) The rule of product
                                31. a) 9 × 9 × 8 × 7 × 6 × 5 = 136,080                      b) 9 × 10^5
                                             (i) (a) 68,880       (b) 450,000
                                            (ii) (a) 28,560       (b) 180,000
                                           (iii) (a) 33,600       (b) 225,000
                                33. a)     2!°      b) 3°           35. a) 6!       b) 2(5!) = 240
                                37. (]§)9! 5! = 348,713, 164,800

Section 1.3-p. 24               1. C(6, 2) = 6!/(2! 4!) = 15. The selections of size 2 are ab, ac, ad, ae, af, bc, bd, be, bf, cd, ce, cf,
                                    de, df, and ef.
                                 3. a) C(10, 4) = 10!/(4! 6!) = 210      b) C(12, 7) = 12!/(7! 5!) = 792
                                    c) C(14, 12) = 91      d) C(15, 10) = 3003
                                 5. a) P(5, 3) = 60
                                    b) a, f, m     a, f, r     a, f, t     a, m, r     a, m, t
                                       a, r, t     f, m, r     f, m, t     f, r, t     m, r, t
                                       a) (75) = 125,970            b) (P)(2) = 44,100              e) D0%_, (10!89,)
                                                                                                                   (27)
                                      d)     ee   (°°)   12” ,)     e)     yes    C22)

                                 9. a) C(8, 2) = 28      b) 70      c) C(8, 6) = 28      d) 37
                                11. a) 120      b) 56      c) 100
                                13.

15.
                                        (;) x)=
                                       a) (3) =105 — b) (%)
                                                         = 2300; (9);                     ) = 12,650                      |
                                17.    ® Diag              8 Diaepee=DLicnewe                                   a Lig
                                19.    (8) + CIVG) + () = 220,                   (2) + (2) + (PG) + (VG) = 705
                                       21° (Yeo G))

21,      (3)                     (5) -n—n(n—4),n>4
                          23.      a) (§)  —b) (F)23)_—e)- (2) (2%)(-3)
                          25.      a) 4!/(1! 1! 2!) = 12      b) 12      c) [4!/(1! 1! 2!)](2)(−1)(−1)^2 = −24
                                   d) −216      e) [8!/(3! 2! 1! 2!)](2^3)(−1)^2(3)(−2)^2 = 161,280
                          27.      a)         2°            b)     2!°     c)    3!°      d)    4            e)    4!°
                                          m+n\                           (m+n)!          (m+n)!              _                       (m+n)!
                          29. n(                   m       ) —"           int           oman                                 PG te pond@             dD!
                                                                   ome
                                                                  ~ (+ DT              iim           1)!           mann!
                                                                                                                                m+n
                          31. Consider the expansions of (a) [(1 + x) − x]^n; (b) [(2 + x) − (x + 1)]^n; and
                                   (c) [(2 + x) − x]^n.
                          33.                                                                    1

Section 1.4—p. 34          La)                                   DQ             9()             3@                       Sa           wD
                           7a) (8)                               b GH)          o &)           #1                 e& (8)        9 G)-©
                           9n=7                                  Ia) (f)               by (#) +3(2) +38) + @)
                          13.      a) (7)                   by)          2. G23)               5. G8)(24—                     «7. a) (18)            5?
                          19. (7?)      21. 24,310=)0"_,i         [forn = (3)]
                          23. a) Place one of the m identical objects into each of the n distinct containers. This leaves m − n
                              identical objects to be placed into the n distinct containers, resulting in
                              C(n + (m − n) − 1, m − n) = C(m − 1, m − n) = C(m − 1, n − 1) distributions.
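
The count in answer 23(a), C(m − 1, n − 1) distributions of m identical objects into n distinct containers with no container left empty, can be confirmed by brute force for small m and n. The Python sketch below enumerates the container counts directly; the function name is illustrative and the enumeration is only practical for small inputs.

```python
from itertools import product
from math import comb

def count_nonempty_distributions(m, n):
    """Count n-tuples of positive integers (one entry per container) that sum to m."""
    return sum(1 for counts in product(range(1, m + 1), repeat=n)
               if sum(counts) == m)

# Compare the enumeration with C(m - 1, n - 1) for small cases.
checks = [(m, n, count_nonempty_distributions(m, n), comb(m - 1, n - 1))
          for m in range(1, 8) for n in range(1, 5)]
```

Each entry of `checks` pairs the enumerated count with the binomial coefficient; for example, m = 7 and n = 3 give 15 in both columns.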
                          25. a) 2°   _b) 24
                          27. a) Cty-')=4                                  b)    10      c)     48           d)    CT        DCB      YF    Oty       CTE!)   = 96

e) 180    f) 420

Section 1.5-p. 40           1. C(2n, n) − C(2n, n − 1) = (2n)!/(n! n!) − (2n)!/[(n − 1)! (n + 1)!]
                                    = (2n)! (n + 1)/[(n + 1)! n!] − (2n)! n/[n! (n + 1)!]
                                    = (2n)! [(n + 1) − n]/[(n + 1)! n!] = [1/(n + 1)] (2n)!/(n! n!) = [1/(n + 1)] C(2n, n)
                                 3. a) 5 (= b_3); 14 (= b_4)
                                   b) For n > 0 there are b_n (= [1/(n + 1)] C(2n, n)) such paths from (0, 0) to (n, n).
                                   c) For n > 0 the first move is U and the last is R.
                                 5. Using the results in the third column of Table 1.10 we have:

                                               111000          110010          101010
                                               1 2 3           1 2 5           1 3 5
                                               4 5 6           3 4 6           2 4 6

                                 7. There are b_5 (= 42) ways.
                                 9. (i) When n = 4 there are 14 (= b_4) such diagrams.
                                   (ii) For each n > 0, there are b_n different drawings of n semicircles on and above a horizontal
                                   line, with no two semicircles intersecting. Consider, for instance, the diagram in part (f) of
                                   Fig. 1.10. Going from left to right, write 1 the first time you encounter a semicircle and write 0
                                   the second time that semicircle is encountered. Here we get the list 110100. The list 110010
                                   corresponds with the drawing in part (g). This correspondence shows that the number of such
                                   drawings for n semicircles is the same as the number of lists of n 1's and n 0's where, as the list
                                   is read from left to right, the number of 0's never exceeds the number of 1's.
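
The lists described in this answer can be enumerated for small n and compared with the Catalan number b_n = [1/(n + 1)] C(2n, n). The following Python sketch does this by brute force; the function names are illustrative, and the approach is meant only as a check on small cases.

```python
from itertools import combinations
from math import comb

def zeros_never_lead(bits):
    """True if, reading left to right, the number of 0's never exceeds the number of 1's."""
    ones = zeros = 0
    for b in bits:
        if b == 1:
            ones += 1
        else:
            zeros += 1
        if zeros > ones:
            return False
    return True

def count_lists(n):
    """Count lists of n 1's and n 0's that satisfy the prefix condition."""
    total = 0
    for one_positions in combinations(range(2 * n), n):
        chosen = set(one_positions)
        bits = [1 if i in chosen else 0 for i in range(2 * n)]
        if zeros_never_lead(bits):
            total += 1
    return total

def catalan(n):
    """Closed form b_n = C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

For n = 3, 4, 5 both `count_lists` and `catalan` give 5, 14, and 42, the values b_3, b_4, and b_5 quoted in these answers.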

                                11. (1/7) C(12, 6) (6!)(6!) = (1/7)(12!) = 68,428,800

Supplementary              1.      ()G) + G)@ + ()@)
Exercises-p. 43            3. Select any four of these twelve points (on the circumference). As seen in the figure, these points
                                   determine a pair of chords that intersect. Consequently, the largest number of points of
                                   intersection for all possible chords is C(12, 4) = 495.

a) 10%                         =
                                                         --- 4)(12)
                                                 b) (10)(11)                                 34/9!) (25 (3)
                                7. a) C(12, 8)        b) P(12, 8)           9. a) 12       b) 49
                               11. (1/11)[11!/(5! 3! 3!)]
                              3.9 OH+OO+O                             GH+OO+O                      GH+OO+O-
                                   b OG) +GG)                    Gi and Gi) ()(G) + GG)
                              15. a) 2(4)+ (@) = 343 ~~ b) [2(7) — 9] + (2)— 1] = 1200
                              17. a) (5)(Q!) — b) (3)(8!)
                              9. a) (7b) 20) + OA
                              21. 0= (1+ (—1)" = (t) — (7) +) — G) +--+ CDG), 80
                                   (+ G+
                                       @ t=                    O+Q+@)+--
                              23. a) P(20, 12) = 20!/8!           b) (7) (12!)
                              25. a) (1) + (3) +--+ 09) + G9) =                   Veo (ae)       BY Vo Cx")
                                   c)    n=2k+1,k>0: 0%, FUT)
                                         n= 2k k= TK, MF)
                              27. a) (oP') = a)
                                              = (021)
                                  b) ra Gi =Co dt Cp pte + G2) = 2
                              29. a) 11!/(7!49) by) [IL/(7! 49] — F41/(2! 2 1141/3! 1)
                                   ¢) [11/71 4D] + [10!/(6! 3! 1]+ [9!/(S! 2! 2!) + (81/4! 1139]+ (71/03! 49] [in part (a)]
                                        (L111/(7! 49] + [101/(6! 3! 1] + [91/(S! 2! 2] + (81/4! 1139] + (71/8! 491}
                                                      — [{[4!/(2! 29] + B/C! 1! 1] + [2!/2']} & {[41/G! 1D] + (31/2! 1D1}]
                                                         [in part (b)]
                              31. (3)(8) =540          33. ($)(12)(11)
                                                                  (10) 9) = 178,200

Chapter 2
                    Fundamentals of Logic

Section 2.1-p. 54               1. The sentences in parts (a), (c), (d), and (f) are statements. The other two sentences are not.
                                3. a) 0      b) 0      c) 1      d) 0
                                5. a) If triangle ABC is equilateral, then it is isosceles.
                                   b) If triangle ABC is not isosceles, then it is not equilateral.
                                   d) Triangle ABC is isosceles, but it is not equilateral.
                                7. a) If Darci practices her serve daily, then she will have a good chance of winning the tennis
                                   tournament.
                                   b) If you do not fix my air conditioner, then I shall not pay the rent.
                                   c) If Mary is to be allowed on Larry’s motorcycle, then she must wear her helmet.
                                9. Statements (a), (e), (f), and (h) are tautologies.
                              11. a) 2^5 = 32      b) 2^n      13. p: 0; r: 0; s: 0
                              15. a) m = 3, n = 6      b) m = 3, n = 9      c) m = 18, n = 9      d) m = 4, n = 9
                                   e) m = 4, n = 9
                              17. Dawn

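Section 2.2-p. 66        1.   a)   (i)
                                   p | q | r | q ∧ r | p → (q ∧ r) | p → q | p → r | (p → q) ∧ (p → r)
                                   0 | 0 | 0 |   0   |      1      |   1   |   1   |         1
                                   0 | 0 | 1 |   0   |      1      |   1   |   1   |         1
                                   0 | 1 | 0 |   0   |      1      |   1   |   1   |         1
                                   0 | 1 | 1 |   1   |      1      |   1   |   1   |         1
                                   1 | 0 | 0 |   0   |      0      |   0   |   0   |         0
                                   1 | 0 | 1 |   0   |      0      |   0   |   1   |         0
                                   1 | 1 | 0 |   0   |      0      |   1   |   0   |         0
                                   1 | 1 | 1 |   1   |      1      |   1   |   1   |         1

                                   (iii)
                                   p | q | r | q ∨ r | p → (q ∨ r) | p → q | ¬r → (p → q)
                                   0 | 0 | 0 |   0   |      1      |   1   |      1
                                   0 | 0 | 1 |   1   |      1      |   1   |      1
                                   0 | 1 | 0 |   1   |      1      |   1   |      1
                                   0 | 1 | 1 |   1   |      1      |   1   |      1
                                   1 | 0 | 0 |   0   |      0      |   0   |      0
                                   1 | 0 | 1 |   1   |      1      |   0   |      1
                                   1 | 1 | 0 |   1   |      1      |   1   |      1
                                   1 | 1 | 1 |   1   |      1      |   1   |      1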
                              b)   [p → (q ∨ r)] ⟺ [¬r → (p → q)]          From part (iii) of part (a)
                                                 ⟺ [¬r → (¬p ∨ q)]         By the 2nd Substitution Rule,
                                                                             and (p → q) ⟺ (¬p ∨ q)
                                                 ⟺ [¬(¬p ∨ q) → ¬¬r]       By the 1st Substitution Rule,
                                                                             and (s → t) ⟺ (¬t → ¬s) for any
                                                                             primitive statements s, t
                                                 ⟺ [(¬¬p ∧ ¬q) → r]        By DeMorgan’s Law, Double Negation,
                                                                             and the 2nd Substitution Rule
                                                 ⟺ [(p ∧ ¬q) → r]          By Double Negation and the
                                                                             2nd Substitution Rule
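
The chain of equivalences in part (b) can also be confirmed mechanically by comparing truth values over all eight assignments. The following Python sketch is such a check; the helper names `implies` and `equivalent` are illustrative, not notation from the text.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def equivalent(f, g, nvars):
    """True if f and g agree on every assignment of truth values to nvars variables."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=nvars))

first = lambda p, q, r: implies(p, q or r)            # p -> (q v r)
middle = lambda p, q, r: implies(not r, implies(p, q))  # ~r -> (p -> q)
last = lambda p, q, r: implies(p and not q, r)        # (p ^ ~q) -> r
```

Calling `equivalent(first, last, 3)` returns True, while comparing `first` with the plain implication p → q returns False, since the two differ when p and r are true and q is false.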
                         3. a) For any primitive statement s, s ∨ ¬s ⟺ T₀. Replace each occurrence of s by p ∨ (q ∧ r),
                            and the result follows by the 1st Substitution Rule.
                            b) For any primitive statements s, t, we have (s → t) ⟺ (¬t → ¬s). Replace each
                            occurrence of s by p ∨ q, and each occurrence of t by r, and the result is a consequence of the
                            1st Substitution Rule.
                         5. a) Kelsey placed her studies before her interest in cheerleading, but she (still) did not get a
                            good education.
                            b) Norma is not doing her mathematics homework or Karen is not practicing her piano lesson.
                            c) Harold did pass his C++ course and he did finish his data structures project, but he did not
                            graduate at the end of the semester.

                         7. a)
                                         p | q | (¬p ∨ q) ∧ (p ∧ (p ∧ q)) | p ∧ q
                                         0 | 0 |            0             |   0
                                         0 | 1 |            0             |   0
                                         1 | 0 |            0             |   0
                                         1 | 1 |            1             |   1

                            b) (¬p ∧ q) ∨ (p ∨ (p ∨ q)) ⟺ p ∨ q
                         9. a) If 0 + 0 = 0, then 1 + 1 = 1. (FALSE)
                            Contrapositive: If 1 + 1 ≠ 1, then 0 + 0 ≠ 0. (FALSE)
                            Converse: If 1 + 1 = 1, then 0 + 0 = 0. (TRUE)
                            Inverse: If 0 + 0 ≠ 0, then 1 + 1 ≠ 1. (TRUE)
                            b) If −1 < 3 and 3 + 7 = 10, then sin(3π/2) = −1. (TRUE)
                            Converse: If sin(3π/2) = −1, then −1 < 3 and 3 + 7 = 10. (TRUE)
                            Inverse: If −1 ≥ 3 or 3 + 7 ≠ 10, then sin(3π/2) ≠ −1. (TRUE)
                            Contrapositive: If sin(3π/2) ≠ −1, then −1 ≥ 3 or 3 + 7 ≠ 10. (TRUE)
                    11. a) (q → r) ∨ ¬p      b) (¬q ∨ r) ∨ ¬p
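                    13.
                          p | q | r | [(p ↔ q) ∧ (q ↔ r) ∧ (r ↔ p)] | [(p → q) ∧ (q → r) ∧ (r → p)]
                          0 | 0 | 0 |               1               |               1
                          0 | 0 | 1 |               0               |               0
                          0 | 1 | 0 |               0               |               0
                          0 | 1 | 1 |               0               |               0
                          1 | 0 | 0 |               0               |               0
                          1 | 0 | 1 |               0               |               0
                          1 | 1 | 0 |               0               |               0
                          1 | 1 | 1 |               1               |               1
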
                    15.   a) (p ↑ p)      b) (p ↑ p) ↑ (q ↑ q)      c) (p ↑ q) ↑ (p ↑ q)      d) p ↑ (q ↑ q)
                          e) (r ↑ s) ↑ (r ↑ s), where r stands for p ↑ (q ↑ q) and s for q ↑ (p ↑ p)
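
Expressions of this kind rewrite the usual connectives in terms of NAND (↑) alone. The Python sketch below checks five standard NAND identities over all truth assignments: ¬p as p ↑ p, p ∨ q as (p ↑ p) ↑ (q ↑ q), p ∧ q as (p ↑ q) ↑ (p ↑ q), p → q as p ↑ (q ↑ q), and p ↔ q as (r ↑ s) ↑ (r ↑ s) with r and s as in part (e). Pairing these identities with parts (a) through (e) is an assumption, since the exercise statement is not reproduced here; the function names are illustrative.

```python
from itertools import product

def nand(a, b):
    """Truth value of a NAND b."""
    return not (a and b)

def nand_identities_hold():
    """Check each NAND expression against the connective it is meant to express."""
    for p, q in product([False, True], repeat=2):
        r = nand(p, nand(q, q))   # behaves as p -> q
        s = nand(q, nand(p, p))   # behaves as q -> p
        checks = [
            nand(p, p) == (not p),
            nand(nand(p, p), nand(q, q)) == (p or q),
            nand(nand(p, q), nand(p, q)) == (p and q),
            nand(p, nand(q, q)) == ((not p) or q),
            nand(nand(r, s), nand(r, s)) == (p == q),
        ]
        if not all(checks):
            return False
    return True
```

Because there are only four assignments to p and q, the check is exhaustive.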

                    17.
                                 p | q | ¬(p ↓ q) | ¬p ↑ ¬q | ¬(p ↑ q) | ¬p ↓ ¬q
                                 0 | 0 |    0     |    0    |    0     |    0
                                 0 | 1 |    1     |    1    |    0     |    0
                                 1 | 0 |    1     |    1    |    0     |    0
                                 1 | 1 |    1     |    1    |    1     |    1

                    19. a)   p ∨ [p ∧ (p ∨ q)]                                Reasons
                             ⟺ p ∨ p                                          Absorption Law
                             ⟺ p                                              Idempotent Law of ∨
                        c)   [(¬p ∨ ¬q) → (p ∧ q ∧ r)]                        Reasons
                             ⟺ ¬(¬p ∨ ¬q) ∨ (p ∧ q ∧ r)                       s → t ⟺ ¬s ∨ t
                             ⟺ (¬¬p ∧ ¬¬q) ∨ (p ∧ q ∧ r)                      DeMorgan’s Laws
                             ⟺ (p ∧ q) ∨ (p ∧ q ∧ r)                          Law of Double Negation
                             ⟺ p ∧ q                                          Absorption Law

Section 2.3-p. 84    1.   a)
                                          p | q | r | p → q | p ∨ q | (p ∨ q) → r
                                          0 | 0 | 0 |   1   |   0   |      1
                                          0 | 0 | 1 |   1   |   0   |      1
                                          0 | 1 | 0 |   1   |   1   |      0
                                          0 | 1 | 1 |   1   |   1   |      1
                                          1 | 0 | 0 |   0   |   1   |      0
                                          1 | 0 | 1 |   0   |   1   |      1
                                          1 | 1 | 0 |   1   |   1   |      0
                                          1 | 1 | 1 |   1   |   1   |      1

                          The validity of the argument follows from the results in the last row. (The first seven rows may
                          be ignored.)

                    c)
                                  p | q | r | q ∨ r | p ∨ (q ∨ r) | ¬q | p ∨ r
                                  0 | 0 | 0 |   0   |      0      | 1  |   0
                                  0 | 0 | 1 |   1   |      1      | 1  |   1
                                  0 | 1 | 0 |   1   |      1      | 0  |   0
                                  0 | 1 | 1 |   1   |      1      | 0  |   1
                                  1 | 0 | 0 |   0   |      1      | 1  |   1
                                  1 | 0 | 1 |   1   |      1      | 1  |   1
                                  1 | 1 | 0 |   1   |      1      | 0  |   1
                                  1 | 1 | 1 |   1   |      1      | 0  |   1
                    The results in rows 2, 5, and 6 establish the validity of the given argument. (The results in the
                    other five rows of the table may be disregarded.)
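
The row-by-row reasoning above can be automated: an argument is valid when the conclusion is true in every row where all premises are true. The Python sketch below assumes, based on the table columns, that the premises are p ∨ (q ∨ r) and ¬q and that the conclusion is p ∨ r. It lists the rows where both premises hold and tests validity; the function names are illustrative.

```python
from itertools import product

def critical_rows(premises, nvars):
    """Return 1-based row numbers (rows ordered 000, 001, ..., 111) where all premises are true."""
    rows = []
    for number, values in enumerate(product([0, 1], repeat=nvars), start=1):
        if all(premise(*values) for premise in premises):
            rows.append(number)
    return rows

def is_valid(premises, conclusion, nvars):
    """True if the conclusion holds in every row where all premises hold."""
    for values in product([0, 1], repeat=nvars):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False
    return True

premises = [lambda p, q, r: p or (q or r), lambda p, q, r: not q]
conclusion = lambda p, q, r: p or r
```

For these premises `critical_rows` returns rows 2, 5, and 6, and `is_valid` confirms that the conclusion is true in each of them.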
                  3. a) If p has the truth value 0, then so does p ∧ q.
                    b) When p ∨ q has the truth value 0, then the truth value of p (and that of q) is 0.
                    c) If q has truth value 0, then the truth value of [(p ∨ q) ∧ ¬p] is 0, regardless of the truth
                    value of p.
                    d) The statement q ∨ s has truth value 0 only when each of q, s has truth value 0. Then
                    (p → q) has truth value 1 when p has truth value 0; (r → s) has truth value 1 when r has truth
                    value 0. But then (p ∨ r) must have truth value 0, not 1.
                  5. a) Rule of Conjunctive Simplification
                    b) Invalid: attempt to argue by the converse
                    c) Modus Tollens
                    d) Rule of Disjunctive Syllogism
                    e) Invalid: attempt to argue by the inverse
                  7. 1) and 2)   Premise
                    3)           Steps (1) and (2) and the Rule of Detachment
                    4)           Premise
                    5)           Step (4) and (r → ¬q) ⟺ (¬¬q → ¬r) ⟺ (q → ¬r)
                    6)           Steps (3) and (5) and the Rule of Detachment
                    7)           Premise
                    8)           Steps (6) and (7) and the Rule of Disjunctive Syllogism
                    9)           Step (8) and the Rule of Disjunctive Amplification
                  9. a)
                     1)          Premise (The Negation of the Conclusion)
                     2)          Step (1) and ¬(¬q → s) ⟺ ¬(¬¬q ∨ s) ⟺ ¬(q ∨ s) ⟺ ¬q ∧ ¬s
                     3)          Step (2) and the Rule of Conjunctive Simplification
                     4)          Premise
                     5)          Steps (3) and (4) and the Rule of Disjunctive Syllogism
                     6)          Premise
                     7)          Step (2) and the Rule of Conjunctive Simplification
                     8)          Steps (6) and (7) and Modus Tollens
                     9)          Premise
                    10)          Steps (8) and (9) and the Rule of Disjunctive Syllogism
                    11)          Steps (5) and (10) and the Rule of Conjunction
                    12)          Step (11) and the Method of Proof by Contradiction
                    b) 1)    p → q          Premise
                       2)    ¬q → ¬p        Step (1) and (p → q) ⟺ (¬q → ¬p)
                       3)    p ∨ r          Premise
                       4)    ¬p → r         Step (3) and (p ∨ r) ⟺ (¬p → r)
                       5)    ¬q → r         Steps (2) and (4) and the Law of the Syllogism
                       6)    ¬r ∨ s         Premise
                       7)    r → s          Step (6) and (¬r ∨ s) ⟺ (r → s)
                       8)    ∴ ¬q → s       Steps (5) and (7) and the Law of the Syllogism

11. a) p:1              ¢:0       rl       c) p,g.r           5:0
        b) p:0              ¢:0       r:Oorl   d) p.g.r:1         s:0
           p:0              glo       ord

13. a)

         p | q | r | p ∨ q | ¬p ∨ r | (p ∨ q) ∧ (¬p ∨ r) | q ∨ r | [(p ∨ q) ∧ (¬p ∨ r)] → (q ∨ r)
         0 | 0 | 0 |   0   |   1    |         0          |   0   |               1
         0 | 0 | 1 |   0   |   1    |         0          |   1   |               1
         0 | 1 | 0 |   1   |   1    |         1          |   1   |               1
         0 | 1 | 1 |   1   |   1    |         1          |   1   |               1
         1 | 0 | 0 |   1   |   0    |         0          |   0   |               1
         1 | 0 | 1 |   1   |   1    |         1          |   1   |               1
         1 | 1 | 0 |   1   |   0    |         0          |   1   |               1
         1 | 1 | 1 |   1   |   1    |         1          |   1   |               1

         From the last column of the truth table it follows that [(p ∨ q) ∧ (¬p ∨ r)] → (q ∨ r) is a
         tautology.
         b)     (i) Steps                            Reasons
                      1)   p ∨ (q ∧ r)               Premise
                      2)   (p ∨ q) ∧ (p ∨ r)         Step (1) and the Distributive Law of ∨ over ∧
                      3)   p ∨ r                     Step (2) and the Rule of Conjunctive Simplification
                      4)   p → s                     Premise
                      5)   ¬p ∨ s                    Step (4), p → s ⟺ ¬p ∨ s
                      6)   ∴ r ∨ s                   Steps (3), (5), the Rule of Conjunction, and Resolution

              (iii) Steps                            Reasons
                      1)   p ∨ q                     Premise
                      2)   p → r                     Premise
                      3)   ¬p ∨ r                    Step (2), p → r ⟺ ¬p ∨ r
                      4)   [(p ∨ q) ∧ (¬p ∨ r)]      Steps (1), (3), and the Rule of Conjunction
                      5)   q ∨ r                     Step (4) and Resolution
                      6)   r → s                     Premise
                      7)   ¬r ∨ s                    Step (6), r → s ⟺ ¬r ∨ s
                      8)   [(r ∨ q) ∧ (¬r ∨ s)]      Steps (5), (7), the Commutative Law of ∨, and the Rule of
                                                     Conjunction
                      9)   ∴ q ∨ s                   Step (8) and Resolution

               (iv) Steps                                        Reasons
                      1)   ¬p ∨ q ∨ r                            Premise
                      2)   q ∨ (¬p ∨ r)                          Step (1) and the Commutative and
                                                                 Associative Laws of ∨
                      3)   ¬q                                    Premise
                      4)   ¬q ∨ (¬p ∨ r)                         Step (3) and the Rule of Disjunctive
                                                                 Amplification
                      5)   [[q ∨ (¬p ∨ r)] ∧ [¬q ∨ (¬p ∨ r)]]    Steps (2), (4), and the Rule of Conjunction
                      6)   (¬p ∨ r)                              Step (5), Resolution, and the Idempotent
                                                                 Law of ∨
                      7)   ¬r                                    Premise
                      8)   ¬r ∨ ¬p                               Step (7) and the Rule of Disjunctive
                                                                 Amplification
                      9)   [(r ∨ ¬p) ∧ (¬r ∨ ¬p)]                Steps (6), (8), the Commutative Law of ∨,
                                                                 and the Rule of Conjunction
                     10)   ∴ ¬p                                  Step (9), Resolution, and the Idempotent
                                                                 Law of ∨
S-8          Solutions

c)   Consider the following assignments.

        p:    Jonathan has his driver's license.
        q:    Jonathan's new car is out of gas.
        r:    Jonathan likes to drive his new car.

     Then the given argument can be written in symbolic form as

        ¬p ∨ q
        p ∨ ¬r
        ¬q ∨ ¬r
        ∴ ¬r

     Steps                               Reasons
      1) ¬p ∨ q                          Premise
      2) p ∨ ¬r                          Premise
      3) (p ∨ ¬r) ∧ (¬p ∨ q)             Steps (2), (1), and the Rule of Conjunction
      4) ¬r ∨ q                          Step (3) and Resolution
      5) q ∨ ¬r                          Step (4) and the Commutative Law of ∨
      6) ¬q ∨ ¬r                         Premise
      7) (q ∨ ¬r) ∧ (¬q ∨ ¬r)            Steps (5), (6), and the Rule of Conjunction
      8) ¬r ∨ ¬r                         Step (7) and Resolution
      9) ∴ ¬r                            Step (8) and the Idempotent Law of ∨
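
As an informal cross-check of the resolution proof above, the following sketch enumerates all eight truth assignments for p, q, r and confirms that the conclusion ¬r holds whenever the three premises hold. The function name `argument_is_valid` is an illustrative choice and does not come from the text.

```python
from itertools import product

def argument_is_valid():
    """Return True if every assignment satisfying all premises also satisfies ¬r."""
    for p, q, r in product([False, True], repeat=3):
        premises = [(not p) or q,          # ¬p ∨ q
                    p or (not r),          # p ∨ ¬r
                    (not q) or (not r)]    # ¬q ∨ ¬r
        conclusion = not r                 # ¬r
        if all(premises) and not conclusion:
            return False                   # a counterexample would make the argument invalid
    return True

print(argument_is_valid())
```

A brute-force table only confirms validity; it does not replace the step-by-step derivation, which shows which rules of inference are used.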

Section 2.4—p. 100     1. a) False     b) False     c) False     d) True     e) False     f) False
                       3. Statements (a), (c), and (e) are true, and statements (b), (d), and (f) are false.
                       5. a) ∃x [m(x) ∧ c(x) ∧ j(x)]                          True
                          b) ∃x [s(x) ∧ c(x) ∧ ¬m(x)]                         True
                          c) ∀x [c(x) → (m(x) ⊻ p(x))]                        False
                          d) ∀x [(g(x) ∧ c(x)) → ¬p(x)], or                   True
                             ∀x [(p(x) ∧ c(x)) → ¬g(x)], or
                             ∀x [(g(x) ∧ p(x)) → ¬c(x)]
                          e) ∀x [(c(x) ∧ s(x)) → (p(x) ⊻ e(x))]               True
                       7. a)   (i) ∃x q(x)
                              (ii) ∃x [p(x) ∧ q(x)]
                             (iii) ∀x [q(x) → ¬t(x)]
                              (iv) ∀x [q(x) → ¬t(x)]
                               (v) ∃x [q(x) ∧ t(x)]
                              (vi) ∀x [(q(x) ∧ r(x)) → s(x)]
                          b) Statements (i), (ii), (v), and (vi) are true. Statements (iii) and (iv) are false; x = 10 provides
                          a counterexample for either statement.
                          c)   (i) If x is a perfect square, then x > 0.
                              (ii) If x is divisible by 4, then x is even.
                             (iii) If x is divisible by 4, then x is not divisible by 5.
                              (iv) There exists an integer that is divisible by 4, but it is not a perfect square.
                          d) (i) Let x = 0.         (iii) Let x = 20.
                       9. a)   (i) True          (ii) False        Consider x = 3.
                             (iii) True          (iv) True
                          c)   (i) True          (ii) True
                             (iii) True          (iv) False
                                                 For x = 2 or 5, the truth value of p(x) is 1
                                                 while that of r(x) is 0.
                         11. a) In this case the variable x is free, while the variables y, z are bound.
                             b) Here the variables x, y are bound; the variable z is free.
                         13. a) p(2, 3) ∧ p(3, 3) ∧ p(5, 3)
                             b) [p(2, 2) ∨ p(2, 3) ∨ p(2, 5)] ∨ [p(3, 2) ∨ p(3, 3) ∨ p(3, 5)] ∨ [p(5, 2) ∨ p(5, 3) ∨ p(5, 5)]

15. a) The proposed negation is correct and is a true statement.
                         b) The proposed negation is wrong. A correct version of the negation is: For all rational
                         numbers x, y, the sum x + y is rational. This correct version of the negation is a true statement.
                         d) The proposed negation is wrong. A correct version of the negation is: For all integers x, y, if
                         x, y are both odd, then xy is even. The (original) statement is true.
                     17. a) There exists an integer n such that n is not divisible by 2 but n is even (that is, not odd).
                         b) There exist integers k, m, n such that k − m and m − n are odd, and k − n is odd.
                         d) There exists a real number x such that |x — 3| < 7 and either x < —4 or x > 10.
                     19. a) Statement: For all positive integers m, n, if m > n, then m² > n². (TRUE)
                            Converse: For all positive integers m, n, if m² > n², then m > n. (TRUE)
                            Inverse: For all positive integers m, n, if m ≤ n, then m² ≤ n². (TRUE)
                            Contrapositive: For all positive integers m, n, if m² ≤ n², then m ≤ n. (TRUE)
                         b) Statement: For all integers a, b, if a > b, then a² > b². (FALSE: let a = 1 and b = −2.)
                            Converse: For all integers a, b, if a² > b², then a > b. (FALSE: let a = −5 and b = 3.)
                            Inverse: For all integers a, b, if a ≤ b, then a² ≤ b². (FALSE: let a = −5 and b = 3.)
                            Contrapositive: For all integers a, b, if a² ≤ b², then a ≤ b. (FALSE: let a = 1 and
                            b = −2.)
                         c) Statement: For all integers m, n, and p, if m divides n and n divides p, then m divides p.
                            (TRUE)
                            Converse: For all integers m and p, if m divides p, then for each integer n it follows that m
                            divides n and n divides p. (FALSE: let m = 1, n = 2, and p = 3.)
                            Inverse: For all integers m, n, and p, if m does not divide n or n does not divide p, then m
                            does not divide p. (FALSE: let m = 1, n = 2, and p = 3.)
                            Contrapositive: For all integers m and p, if m does not divide p, then for each integer n it
                            follows that m does not divide n or n does not divide p. (TRUE)
                         e) Statement: ∀x [(x² + 4x − 21 > 0) → [(x > 3) ∨ (x < −7)]] (TRUE)
                            Converse: ∀x [[(x > 3) ∨ (x < −7)] → (x² + 4x − 21 > 0)] (TRUE)
                            Inverse: ∀x [(x² + 4x − 21 ≤ 0) → [(x ≤ 3) ∧ (x ≥ −7)]], or ∀x [(x² + 4x − 21 ≤ 0) →
                            (−7 ≤ x ≤ 3)] (TRUE)
                            Contrapositive: ∀x [[(x ≤ 3) ∧ (x ≥ −7)] → (x² + 4x − 21 ≤ 0)], or ∀x [(−7 ≤ x ≤ 3) →
                            (x² + 4x − 21 ≤ 0)] (TRUE)
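
The truth values claimed in Exercise 19 can be spot-checked numerically. The sketch below is only a finite sanity check, not a proof: it tests part (a) on the positive integers 1 to 50, evaluates the counterexamples quoted in part (b), and samples part (e) at half-integer points between −20 and 20. The helper `implies` and the variable names are illustrative assumptions.

```python
def implies(a, b):
    """Material implication a → b."""
    return (not a) or b

# 19(a): m > n → m² > n², checked only for 1 ≤ m, n ≤ 50
holds_19a = all(implies(m > n, m * m > n * n)
                for m in range(1, 51) for n in range(1, 51))

# 19(b): the quoted counterexamples make the implications false
statement_fails = not implies(1 > -2, 1 ** 2 > (-2) ** 2)     # a = 1, b = −2
converse_fails = not implies((-5) ** 2 > 3 ** 2, -5 > 3)      # a = −5, b = 3

# 19(e): x² + 4x − 21 > 0 exactly when x > 3 or x < −7 (sampled points only)
samples = [k / 2 for k in range(-40, 41)]
holds_19e = all((x * x + 4 * x - 21 > 0) == (x > 3 or x < -7) for x in samples)

print(holds_19a, statement_fails, converse_fails, holds_19e)
```

Part (e) works because x² + 4x − 21 factors as (x + 7)(x − 3), so the statement and its converse are both true.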
                     21. a) True     b) False     c) False     d) True     e) False
                     23. a) ∀a ∃b [a + b = b + a = 0]     b) ∃u ∀a [au = ua = a]     c) ∀a ≠ 0 ∃b [ab = ba = 1]
                         d) The statement in part (b) remains true, but the statement in part (c) is no longer true for this
                         new universe.
                     25. a) ∃x ∃y [(x > y) ∧ (x − y ≤ 0)]     b) ∃x ∃y [(x < y) ∧ ∀z [x ≥ z ∨ z ≥ y]]

Section 2.5—p. 116     1. Although we may write 28 = 25 + 1 + 1 + 1 = 16 + 4 + 4 + 4, there is no way to express 28
                          as the sum of at most three perfect squares.
                       3. 30 = 25 + 4 + 1        40 = 36 + 4               50 = 25 + 25
                          32 = 16 + 16           42 = 25 + 16 + 1          52 = 36 + 16
                          34 = 25 + 9            44 = 36 + 4 + 4           54 = 25 + 25 + 4
                          36 = 36                46 = 36 + 9 + 1           56 = 36 + 16 + 4
                          38 = 36 + 1 + 1        48 = 16 + 16 + 16         58 = 49 + 9
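
Both claims above (28 has no representation, while every even integer from 30 to 58 has one) can be confirmed by exhaustive search. This is a minimal sketch; the function name `three_square_sums` is a hypothetical helper, and "perfect square" is taken to mean the square of a positive integer.

```python
from itertools import combinations_with_replacement
from math import isqrt

def three_square_sums(n):
    """List every way to write n as a sum of one, two, or three positive perfect squares."""
    squares = [k * k for k in range(1, isqrt(n) + 1)]
    found = []
    for count in (1, 2, 3):
        for combo in combinations_with_replacement(squares, count):
            if sum(combo) == n:
                found.append(combo)
    return found

print(three_square_sums(28))   # no representation with at most three squares
print(three_square_sums(30))   # includes the decomposition 1 + 4 + 25
```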
                       5. a) The real number π is not an integer.
                          c) All administrative directors know how to delegate authority.
                          d) Quadrilateral MNPQ is not equiangular.
                       7. a) When the statement ∃x [p(x) ∨ q(x)] is true, there is at least one element c in the
                          prescribed universe where p(c) ∨ q(c) is true. Hence at least one of the statements p(c), q(c)
                          has the truth value 1, so at least one of the statements ∃x p(x) and ∃x q(x) is true. Therefore, it
                          follows that ∃x p(x) ∨ ∃x q(x) is true, and ∃x [p(x) ∨ q(x)] ⇒ ∃x p(x) ∨ ∃x q(x).
                          Conversely, if ∃x p(x) ∨ ∃x q(x) is true, then at least one of p(a), q(b) has the truth value 1,
                          for some a, b in the prescribed universe. Assume without loss of generality that it is p(a). Then
                          p(a) ∨ q(a) has truth value 1, so ∃x [p(x) ∨ q(x)] is a true statement, and
                          ∃x p(x) ∨ ∃x q(x) ⇒ ∃x [p(x) ∨ q(x)].
                          b) First consider when the statement ∀x [p(x) ∧ q(x)] is true. This occurs when p(a) ∧ q(a) is
                          true for each a in the prescribed universe. Then p(a) is true [as is q(a)] for all a in the universe,
                          so the statements ∀x p(x) and ∀x q(x) are true. Therefore, the statement ∀x p(x) ∧ ∀x q(x) is
                          true and ∀x [p(x) ∧ q(x)] ⇒ ∀x p(x) ∧ ∀x q(x). Conversely, suppose that ∀x p(x) ∧ ∀x q(x)
                          is a true statement. Then ∀x p(x), ∀x q(x) are both true. So now let c be any element in the
                          prescribed universe. Then p(c), q(c), and p(c) ∧ q(c) are all true. And since c was chosen
                          arbitrarily, it follows that the statement ∀x [p(x) ∧ q(x)] is true, and
                          ∀x p(x) ∧ ∀x q(x) ⇒ ∀x [p(x) ∧ q(x)].
                    9. 1) Premise
                         2) Premise
                         3) Step (1) and the Rule of Universal Specification
                         4) Step (2) and the Rule of Universal Specification
                         5) Step (4) and the Rule of Conjunctive Simplification
                         6) Steps (5) and (3) and Modus Ponens
                         7) Step (6) and the Rule of Conjunctive Simplification
                         8) Step (4) and the Rule of Conjunctive Simplification
                         9) Steps (7) and (8) and the Rule of Conjunction
                       10) Step (9) and the Rule of Universal Generalization
                   11. Consider the open statements
                               w(x):    x works for the credit union
                               ℓ(x):    x writes loan applications
                               c(x):    x knows COBOL
                               q(x):    x knows Excel
                       and let r represent Roxe and i represent Imogene.
                          In symbolic form the given argument is as follows:

                               ∀x [w(x) → c(x)]
                               ∀x [(w(x) ∧ ℓ(x)) → q(x)]
                               w(r) ∧ ¬q(r)
                               q(i) ∧ ¬c(i)
                               ∴ ¬ℓ(r) ∧ ¬w(i)

The steps (and reasons) needed to verify this argument can now be presented.

       Steps                                   Reasons
        1) ∀x [w(x) → c(x)]                    Premise
        2) q(i) ∧ ¬c(i)                        Premise
        3) ¬c(i)                               Step (2) and the Rule of Conjunctive Simplification
        4) w(i) → c(i)                         Step (1) and the Rule of Universal Specification
        5) ¬w(i)                               Steps (3) and (4) and Modus Tollens
        6) ∀x [(w(x) ∧ ℓ(x)) → q(x)]           Premise
        7) w(r) ∧ ¬q(r)                        Premise
        8) ¬q(r)                               Step (7) and the Rule of Conjunctive Simplification
        9) (w(r) ∧ ℓ(r)) → q(r)                Step (6) and the Rule of Universal Specification
       10) ¬(w(r) ∧ ℓ(r))                      Steps (8) and (9) and Modus Tollens
       11) w(r)                                Step (7) and the Rule of Conjunctive Simplification
       12) ¬w(r) ∨ ¬ℓ(r)                       Step (10) and DeMorgan's Law
       13) ¬ℓ(r)                               Steps (11) and (12) and the Rule of Disjunctive Syllogism
       14) ∴ ¬ℓ(r) ∧ ¬w(i)                     Steps (13) and (5) and the Rule of Conjunction

                   13. a) Contrapositive: For all integers k and ℓ, if k, ℓ are not both odd, then kℓ is not odd. OR,
                       For all integers k and ℓ, if at least one of k, ℓ is even, then kℓ is even.
                       Proof: Let us assume (without loss of generality) that k is even. Then k = 2c for some
                       integer c, because of Definition 2.8. Then kℓ = (2c)ℓ = 2(cℓ), by the associative law of
                       multiplication for integers, and cℓ is an integer. Consequently, kℓ is even, once again by
                       Definition 2.8. (Note that this result does not require anything about the integer ℓ.)
                   15. Proof: Assume that for some integer n, n² is odd while n is not odd. Then n is even and we may
                       write n = 2a, for some integer a, by Definition 2.8. Consequently, n² = (2a)² = (2a)(2a) =
                       (2 · 2)(a · a), by the commutative and associative laws of multiplication for integers. Hence, we
                       may write n² = 2(2a²), with 2a² an integer, and this means that n² is even. Thus we have
                       arrived at a contradiction, since we now have n² both odd (at the start) and even. This
                       contradiction came about from the false assumption that n is not odd. Therefore, for every
                       integer n, it follows that n² odd ⇒ n odd.
                   17. Proof:
                           (1) Since n is odd, we have n = 2a + 1 for some integer a. Then n + 11 = (2a + 1) + 11 =
                               2a + 12 = 2(a + 6), where a + 6 is an integer. So by Definition 2.8 it follows that
                               n + 11 is even.
                           (2) If n + 11 is not even, then it is odd and we have n + 11 = 2b + 1, for some integer b. So
                               n = (2b + 1) − 11 = 2b − 10 = 2(b − 5), where b − 5 is an integer, and it follows from
                               Definition 2.8 that n is even, that is, not odd.
                           (3) In this case we stay with the hypothesis that n is odd, and also assume that n + 11 is
                               not even, hence odd. So we may write n + 11 = 2b + 1, for some integer b. This then
                               implies that n = 2(b − 5), for the integer b − 5. So by Definition 2.8 it follows that n is
                               even. But with n both even (as shown) and odd (as in the hypothesis), we have arrived at
                               a contradiction. So our assumption was wrong, and it now follows that n + 11 is even for
                               every odd integer n.
                   19. This result is not true, in general. For example, m = 4 = 2² and n = 1 = 1² are two positive
                       integers that are perfect squares, but m + n = 2² + 1² = 5 is not a perfect square.
                   21. Proof:
                       We shall prove the given result by establishing the truth of its (logically equivalent)
                       contrapositive.
                           Let us consider the negation of the conclusion     — that is, x < 50 and y < 50. Then with
                       x < 50 and y < 50 it follows that x + y < 50 + 50 = 100, and we have the negation of the
                       hypothesis. The given result now follows by this indirect method of proof (by the
                       contrapositive).
                   23. Proof: If n is odd, then n = 2k + 1 for some (particular) integer k. Then 7n + 8 = 7(2k + 1) +
                       8 = 14k + 7 + 8 = 14k + 15 = 14k + 14 + 1 = 2(7k + 7) + 1. It then follows from Definition
                       2.8 that 7n + 8 is odd.
                           To establish the converse, suppose that n is not odd. Then n is even, so we can write n = 2t,
                       for some (particular) integer t. But then 7n + 8 = 7(2t) + 8 = 14t + 8 = 2(7t + 4), so it
                       follows from Definition 2.8 that 7n + 8 is even, that is, 7n + 8 is not odd. Consequently, the
                       converse follows by contraposition.
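
The biconditional proved in Exercise 23 (n is odd if and only if 7n + 8 is odd) can be sanity-checked on a finite range of integers. This is only a numerical illustration of the result, not a substitute for the proof; the function name `parity_claim_holds` and the default range are assumptions made for the example.

```python
def parity_claim_holds(limit=100):
    """Check that n is odd exactly when 7n + 8 is odd, for −limit ≤ n ≤ limit."""
    # In Python, x % 2 is 1 for every odd integer x, including negative ones.
    return all(((7 * n + 8) % 2 == 1) == (n % 2 == 1)
               for n in range(-limit, limit + 1))

print(parity_claim_holds())
```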

Supplementary
Exercises—p. 120

                     1.  In the table, t denotes the statement (q ∧ r) → ¬(s ∨ r).

                         p | q | r | s | q ∧ r | ¬(s ∨ r) | t | p ↔ t
                         0 | 0 | 0 | 0 |   0   |    1     | 1 |   0
                         0 | 0 | 0 | 1 |   0   |    0     | 1 |   0
                         0 | 0 | 1 | 0 |   0   |    0     | 1 |   0
                         0 | 0 | 1 | 1 |   0   |    0     | 1 |   0
                         0 | 1 | 0 | 0 |   0   |    1     | 1 |   0
                         0 | 1 | 0 | 1 |   0   |    0     | 1 |   0
                         0 | 1 | 1 | 0 |   1   |    0     | 0 |   1
                         0 | 1 | 1 | 1 |   1   |    0     | 0 |   1
                         1 | 0 | 0 | 0 |   0   |    1     | 1 |   1
                         1 | 0 | 0 | 1 |   0   |    0     | 1 |   1
                         1 | 0 | 1 | 0 |   0   |    0     | 1 |   1
                         1 | 0 | 1 | 1 |   0   |    0     | 1 |   1
                         1 | 1 | 0 | 0 |   0   |    1     | 1 |   1
                         1 | 1 | 0 | 1 |   0   |    0     | 1 |   1
                         1 | 1 | 1 | 0 |   1   |    0     | 0 |   0
                         1 | 1 | 1 | 1 |   1   |    0     | 0 |   0

                     3. a)
                         p | q | r | q ↔ r | p ↔ (q ↔ r) | p ↔ q | (p ↔ q) ↔ r
                         0 | 0 | 0 |   1   |      0      |   1   |      0
                         0 | 0 | 1 |   0   |      1      |   1   |      1
                         0 | 1 | 0 |   0   |      1      |   0   |      1
                         0 | 1 | 1 |   1   |      0      |   0   |      0
                         1 | 0 | 0 |   1   |      1      |   0   |      1
                         1 | 0 | 1 |   0   |      0      |   0   |      0
                         1 | 1 | 0 |   0   |      0      |   1   |      0
                         1 | 1 | 1 |   1   |      1      |   1   |      1

                        It follows from the results in columns 5 and 7 that [p ↔ (q ↔ r)] ⟺ [(p ↔ q) ↔ r].
                        b) The truth value assignments p: 0; q: 0; r: 0 result in the truth value 1 for [p → (q → r)] and
                        the truth value 0 for [(p → q) → r]. Consequently, these statements are not logically equivalent.
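
Both parts of Exercise 3 can be reproduced by enumerating the eight truth assignments. The following sketch uses Python booleans for truth values; the helper names `iff` and `implies` are illustrative and not part of the text.

```python
from itertools import product

def iff(a, b):
    """Biconditional a ↔ b."""
    return a == b

def implies(a, b):
    """Material implication a → b."""
    return (not a) or b

rows = list(product([False, True], repeat=3))

# Part (a): the biconditional is associative.
iff_is_associative = all(iff(p, iff(q, r)) == iff(iff(p, q), r) for p, q, r in rows)

# Part (b): assignments where p → (q → r) and (p → q) → r disagree.
implication_counterexamples = [
    (p, q, r) for p, q, r in rows
    if implies(p, implies(q, r)) != implies(implies(p, q), r)
]

print(iff_is_associative)
print(implication_counterexamples)
```

The assignment p: 0, q: 0, r: 0 quoted in part (b) appears in the list of counterexamples as `(False, False, False)`.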
                    5. (1) If Kaylyn does not practice her piano lessons, then she cannot go to the movies.
                       (2) If Kaylyn is to go to the movies, then she will have to practice her piano lessons.
                     7. a) (¬p ∨ ¬q) ∧ (F₀ ∨ p) ∧ p
                        b) (¬p ∨ ¬q) ∧ (F₀ ∨ p) ∧ p
                           ⟺ (¬p ∨ ¬q) ∧ (p ∧ p)              F₀ ∨ p ⟺ p
                           ⟺ (¬p ∨ ¬q) ∧ p                    Idempotent Law of ∧
                           ⟺ p ∧ (¬p ∨ ¬q)                    Commutative Law of ∧
                           ⟺ (p ∧ ¬p) ∨ (p ∧ ¬q)              Distributive Law of ∧ over ∨
                           ⟺ F₀ ∨ (p ∧ ¬q)                    p ∧ ¬p ⟺ F₀
                           ⟺ p ∧ ¬q                           F₀ is the identity for ∨.
                     9. a) contrapositive     b) inverse     c) contrapositive     d) inverse     e) converse

                    11. a)
                         p | q | r | p ⊻ q | (p ⊻ q) ⊻ r | q ⊻ r | p ⊻ (q ⊻ r)
                         0 | 0 | 0 |   0   |      0      |   0   |      0
                         0 | 0 | 1 |   0   |      1      |   1   |      1
                         0 | 1 | 0 |   1   |      1      |   1   |      1
                         0 | 1 | 1 |   1   |      0      |   0   |      0
                         1 | 0 | 0 |   1   |      1      |   0   |      1
                         1 | 0 | 1 |   1   |      0      |   1   |      0
                         1 | 1 | 0 |   0   |      0      |   1   |      0
                         1 | 1 | 1 |   0   |      1      |   0   |      1

                        It follows from the results in columns 5 and 7 that [(p ⊻ q) ⊻ r] ⟺ [p ⊻ (q ⊻ r)].
                        b) The given statements are not logically equivalent. The truth value assignments p: 1; q: 0;
                        r: 0 provide a counterexample.
                    13. a) True     b) False     c) True     d) True     e) False     f) False     g) False     h) True
                    15. Suppose that the 62 squares in this 8 × 8 chessboard (with two opposite missing corners) can be
                        covered with 31 dominos. The chessboard contains 30 blue squares and 32 white ones. Each
                        domino covers one blue and one white square, for a total of 31 blue squares and 31 white ones.
                        This contradiction tells us that we cannot cover this 62-square chessboard with the 31 dominos.
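
The colour count used in the argument above (30 squares of one colour and 32 of the other) can be verified directly. In this sketch the colour of a square is taken to be the parity of row + column, and the two removed corners are assumed to be (0, 0) and (7, 7); which parity is called "blue" is an arbitrary labelling, and `color_counts` is a hypothetical helper name.

```python
def color_counts(n=8):
    """Count squares of each colour on an n × n board with two opposite corners removed."""
    removed = {(0, 0), (n - 1, n - 1)}     # opposite corners share the same colour
    counts = {0: 0, 1: 0}                  # key = (row + col) % 2
    for row in range(n):
        for col in range(n):
            if (row, col) not in removed:
                counts[(row + col) % 2] += 1
    return counts

print(color_counts())
```

Since every domino covers one square of each parity, 31 dominos would need 31 squares of each colour, which the counts rule out.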

Chapter 3
                     Set Theory

Section 3.1—p. 134         1. They are all the same set.
                           3. Parts (b) and (d) are false; the remaining parts are true.
                           5. a) {0,2}              b) {2,23.33,55,75}           ¢) {0, 2, 12, 36, 80}
                            7. a) ∀x [x ∈ A → x ∈ B] ∧ ∃x [x ∈ B ∧ x ∉ A]
                               b) ∃x [x ∈ A ∧ x ∉ B] ∨ ∀x [x ∉ B ∨ x ∈ A]
                               OR, ∃x [x ∈ A ∧ x ∉ B] ∨ ∀x [x ∈ B → x ∈ A]
                            9. a) |A| = 6     b) |B| = 7     c) If B has 2ⁿ subsets of odd cardinality, then |B| = n + 1.
                          11. a) 31        b) 30      ¢) 28     13. a) (2)    b) @)       9 @+A4+Q4+@)
                          15. Let W = {1}, X = {{1}, 2}, and Y = {X, 3}.
                          17. c) If x ∈ A, then A ⊆ B ⇒ x ∈ B, and B ⊂ C ⇒ x ∈ C. Hence A ⊆ C. Since B ⊂ C, there
                                exists y ∈ C with y ∉ B. Also, A ⊆ B and y ∉ B ⇒ y ∉ A. Consequently, A ⊆ C and y ∈ C
                                with y ∉ A ⇒ A ⊂ C.
                                d) Since A ⊂ B, it follows that A ⊆ B. The result then follows from part (c).
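                          19. a) For n, k ∈ Z⁺ with n ≥ k + 1, consider the hexagon centered at $\binom{n}{k}$. This has the form

                                        $\binom{n-1}{k-1}$          $\binom{n-1}{k}$

                                 $\binom{n}{k-1}$          $\binom{n}{k}$          $\binom{n}{k+1}$

                                        $\binom{n+1}{k}$          $\binom{n+1}{k+1}$

                                 where the two alternating triples, namely $\binom{n-1}{k-1}$, $\binom{n}{k+1}$, $\binom{n+1}{k}$ and
                                 $\binom{n-1}{k}$, $\binom{n+1}{k+1}$, $\binom{n}{k-1}$, satisfy
                                 $\binom{n-1}{k-1}\binom{n}{k+1}\binom{n+1}{k} = \binom{n-1}{k}\binom{n+1}{k+1}\binom{n}{k-1}$.
                              b) For n, k ∈ Z⁺ with n ≥ k + 1,

                                 $$\binom{n-1}{k-1}\binom{n}{k+1}\binom{n+1}{k}
                                   = \left[\frac{(n-1)!}{(k-1)!\,(n-k)!}\right]\left[\frac{n!}{(k+1)!\,(n-k-1)!}\right]\left[\frac{(n+1)!}{k!\,(n+1-k)!}\right]$$
                                 $$= \left[\frac{(n-1)!}{k!\,(n-1-k)!}\right]\left[\frac{(n+1)!}{(k+1)!\,(n-k)!}\right]\left[\frac{n!}{(k-1)!\,(n-k+1)!}\right]
                                   = \binom{n-1}{k}\binom{n+1}{k+1}\binom{n}{k-1}.$$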
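
The hexagon identity in Exercise 19 can be checked numerically for small n and k. This is a minimal sketch assuming the identity is the equality of the products of the two alternating triples around $\binom{n}{k}$; the function name `hexagon_products` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def hexagon_products(n, k):
    """Return the products of the two alternating triples of the hexagon centered at C(n, k)."""
    first = comb(n - 1, k - 1) * comb(n, k + 1) * comb(n + 1, k)
    second = comb(n - 1, k) * comb(n, k - 1) * comb(n + 1, k + 1)
    return first, second

# Check all positive integers n, k with k + 1 ≤ n ≤ 30.
identity_holds = all(
    hexagon_products(n, k)[0] == hexagon_products(n, k)[1]
    for n in range(2, 31) for k in range(1, n)
)
print(identity_holds)
print(hexagon_products(5, 2))
```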
                          21.       n= 20
                          23.       The fifth, sixth, and seventh entries in the row for n = 14 provide the unique solution.
                          25.       As an ordered set, A = {x, v, w, z, y}.
                          27.       a) If S ∈ S, then since S = {A | A ∉ A} we have S ∉ S.
                                    b) If S ∉ S, then by the definition of S it follows that S ∈ S.

Section 3.2—p. 146
                           1. a) {1, 2, 3, 5}     b) A     c) and d) U − {2}     e) {4, 8}
                              f) {1, 2, 3, 4, 5, 8}     g) ∅     h) {2, 4, 8}     i) {1, 3, 4, 5, 8}
                          3. a) A= (1,3.4,7,9, 1}                    = (2,4, 6,8, 9}
                              b) C = {1, 2, 4.5, 9}              6185)
                           5. a) True     b) True     c) True     d) False     e) True
                              f) True     g) True     h) False     i) False
                           7. a) Let U = {1, 2, 3}, A = {1}, B = {2}, and C = {3}. Then A ∩ C = B ∩ C = ∅ but A ≠ B.
                              b) For U = {1, 2}, A = {1}, B = {2}, and C = U, we have A ∪ C = B ∪ C but A ≠ B. [From
                              parts (a) and (b) we see that we do not have cancellation laws for ∩ or ∪. This differs from what
                              we know about R, where for a, b, c ∈ R (i) ab = ac and a ≠ 0 ⇒ b = c; and (ii) a + b =
                              a + c ⇒ b = c.]
                              c) x ∈ A ⇒ x ∈ A ∪ C ⇒ x ∈ B ∪ C. So x ∈ B or x ∈ C. If x ∈ B, then we are finished. If
                              x ∈ C, then x ∈ A ∩ C = B ∩ C and x ∈ B. In either case, x ∈ B so A ⊆ B. Likewise,
                              y ∈ B ⇒ y ∈ B ∪ C = A ∪ C, so y ∈ A or y ∈ C. If y ∈ C, then y ∈ B ∩ C = A ∩ C. In either
                              case, y ∈ A and B ⊆ A. Hence A = B.
                              d) Let x ∈ A. Consider two cases: (1) x ∈ C ⇒ x ∉ A △ C ⇒ x ∉ B △ C ⇒ x ∈ B.
                              (2) x ∉ C ⇒ x ∈ A △ C ⇒ x ∈ B △ C ⇒ x ∈ B (because x ∉ C). In either case, x ∈ B, so
                              A ⊆ B. In a similar way it follows that B ⊆ A and A = B.
                                  7.1
                              ~a) B=(AUB)N(AUB)N(AUB)N(AUB)                                b) A=AU(ANB)
                                ce) AN B=(AUB)N(AUB)N(AUB)  @                             A=(ANB)U(ANY)
                         13. a) Let U = {1, 2, 3}, A = {1}, and B = {2}. Then {1, 2} ∈ 𝒫(A ∪ B) but
                             {1, 2} ∉ 𝒫(A) ∪ 𝒫(B).
                             b) X ∈ 𝒫(A ∩ B) ⟺ X ⊆ A ∩ B ⟺ X ⊆ A and X ⊆ B ⟺ X ∈ 𝒫(A) and
                             X ∈ 𝒫(B) ⟺ X ∈ 𝒫(A) ∩ 𝒫(B), so 𝒫(A ∩ B) = 𝒫(A) ∩ 𝒫(B).
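
Both parts of Exercise 13 can be illustrated with small sets. The sketch below builds power sets as sets of `frozenset`s; `power_set` is a hypothetical helper, and the sets C and D used for part (b) are illustrative choices, not taken from the text.

```python
from itertools import combinations

def power_set(s):
    """Return the power set of s as a set of frozensets."""
    items = list(s)
    return {frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)}

# Part (a): the counterexample A = {1}, B = {2}.
A, B = {1}, {2}
in_power_of_union = frozenset({1, 2}) in power_set(A | B)
in_union_of_powers = frozenset({1, 2}) in (power_set(A) | power_set(B))

# Part (b): one instance of P(C ∩ D) = P(C) ∩ P(D).
C, D = {1, 2, 3}, {2, 3, 4}
intersection_law_holds = power_set(C & D) == (power_set(C) & power_set(D))

print(in_power_of_union, in_union_of_powers, intersection_law_holds)
```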
                         15. a) 2°     b) 2"
                             c) In the membership table, A ⊆ B if the columns for A, B are such that whenever a 1 occurs
                             in the column for A, there is a corresponding 1 in the column for B.

a)   A\|BIC                AUB          (ANB)U(BNC)
                                       0!10|0                   1                   1
                                       o0})/o]1                 1                    l
                                       0    })1 | 0             0                   1
                                       0)1           4/1        0                   0
                                       110)0                    1                   1
                                       1/o)/1                    1                  1
                                       1/1          /0           1                  1
                                        1]1          {1          1                  1
                          17. a)     AN(B—A)=AN(BNA)=BN(ANA)=BNG=B
                                  b) [AN B)U(AN BNCND)U (ANB) = (ANB) U(ANB) by the Absorption
                                                                                            Law
                                  =(AUVUA)NB=UNB=B
                              d) AUBU(AN BNC) = (ANB) U[(ANB)NC]=
                              (ANB) U(AN B)IN[(AN B) UC] =[(ANB) UC] =AUBUC
                          19, a) [-6,9] ec) 4 e) A,  gR

Section 3.3—p. 150
                           1. 55      3. 2⁹ + 2⁸ − 2⁵ = 736      5. 9! + 9! − 8! = 685,440
                           7. a) 24! + 24! − 22!      b) 26! − [24! + 24! − 22!]
                           9. [13!/(2!)³] − 3[12!/(2!)²] + 3(11!/2!) − 10!
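
The numerical values quoted in Exercises 3 and 5 follow the inclusion-exclusion pattern |A ∪ B| = |A| + |B| − |A ∩ B|, and the arithmetic can be confirmed directly. The variable names below are illustrative; the sketch checks only the arithmetic, not the underlying counting arguments.

```python
from math import factorial

# Exercise 3: 2⁹ + 2⁸ − 2⁵
exercise_3 = 2 ** 9 + 2 ** 8 - 2 ** 5

# Exercise 5: 9! + 9! − 8!
exercise_5 = factorial(9) + factorial(9) - factorial(8)

print(exercise_3, exercise_5)
```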

Section 3.4—p. 156
                           1. a) 3/8     b) 1/2     c) 1/4     d) 5/8     e) 5/8     f) 7/8     g) 1/8
                                . 6      5. a) (8)/(7) =5/22            b) 7/22        7. 49/99
                           9. a) 1/64     b) 3/32     c) 15/64     d) 1/2     e) 11/32          11. a) 55/216     b) 5/54
                                  ay     =      1p) 2/15) 3/35
                          15. $Pr(A) = 1/3$, $Pr(B) = 7/15$, $Pr(A \cap B) = 2/15$, $Pr(A \cup B) = 2/3$; $Pr(A \cup B) = 2/3 = 1/3 + 7/15 - 2/15 = Pr(A) + Pr(B) - Pr(A \cap B)$

Section 3.5—p. 164
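                           1. $Pr(\overline{A}) = 0.6$; $Pr(\overline{B}) = 0.7$; $Pr(A \cup B) = 0.5$; $Pr(\overline{A \cup B}) = 0.5$; $Pr(A \cap \overline{B}) = 0.2$;
                              $Pr(\overline{A} \cap B) = 0.1$; $Pr(A \cup \overline{B}) = 0.9$; $Pr(\overline{A} \cup B) = 0.8$
                           3. a) $\mathcal{S} = \{(x, y) \mid x, y \in \{1, 2, 3, \ldots, 10\},\ x \neq y\}$      b) 1/2      c) 5/9
                           5. 0.4    7. a) 11/21   b) 12/21   c) 9/21      9. 3/16
                          11. a) (i) 27/38      (ii) 27/38      b) (i) 81/361      (ii) 18/361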
                          13.     11/14        1. (7) /(*) = 330/3, 176,716,400

17. Since $A \cup B \subseteq \mathcal{S}$, it follows from the result of the preceding exercise that
                         $Pr(A \cup B) \le Pr(\mathcal{S}) = 1$. So $1 \ge Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)$, and
                         $Pr(A \cap B) \ge Pr(A) + Pr(B) - 1 = 0.7 + 0.5 - 1 = 0.2$.

Section 3.6—p. 173    1. 1/4        3. (0.80)(0.75) = 0.60
                      5. In general, $Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)$. Since A, B are independent,
                         $Pr(A \cap B) = Pr(A)Pr(B)$. So

                         $Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A)Pr(B) = Pr(A) + [1 - Pr(A)]Pr(B)$
                                      $= Pr(A) + Pr(\overline{A})Pr(B)$.

                         The proof for $Pr(B) + Pr(\overline{B})Pr(A)$ is similar.
                      7. a) 52/85      b) 11/26           9. 3/7
                     11. $Pr(A \cap B) = 1/4 = (1/2)(1/2) = Pr(A)Pr(B)$, so the events A, B are independent.
                     13. 1/5      15. (0.05)(0.02) = 0.001          17. 5/21
                     19. Any two of the events are independent. However, $Pr(A \cap B \cap C) = 1/4 \neq 1/8 =
                         (1/2)(1/2)(1/2) = Pr(A)Pr(B)Pr(C)$, so the events A, B, C are not independent.
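
The distinction in Exercise 19 between pairwise and mutual independence can be checked on a small example. The sketch below does not use the exercise's own events, which are not restated here; it assumes a hypothetical sample space of two fair coin tosses, with the events `A`, `B`, `C` chosen only so that the same values 1/4 and 1/8 appear.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space (an assumption, not the exercise's): two fair coin tosses.
space = list(product("HT", repeat=2))

def pr(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event), len(space))

A = {s for s in space if s[0] == "H"}   # first toss is a head
B = {s for s in space if s[1] == "H"}   # second toss is a head
C = {s for s in space if s[0] == s[1]}  # both tosses agree

# Every pair satisfies Pr(X n Y) = Pr(X)Pr(Y) ...
pairwise = all(pr(X & Y) == pr(X) * pr(Y) for X, Y in [(A, B), (A, C), (B, C)])
# ... but the triple product rule fails.
mutual = pr(A & B & C) == pr(A) * pr(B) * pr(C)

print(pairwise, mutual, pr(A & B & C), pr(A) * pr(B) * pr(C))
```

Exact fractions are used so that the comparison is not affected by rounding.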
                     21. a) 5/16      b) 11/32      c) 11/32      23. 0.6
                     25. a) $2^5 - \binom{5}{0} - \binom{5}{1} = 26$      b) $2^n - \binom{n}{0} - \binom{n}{1} = 2^n - (n + 1)$      27. 30/77      29. 0.15

Section 3.7—p. 185
                       la)     1/4     bye)            7/8          a) 3/4          e) 2/7         f) 1/2
                                                        _110
                       3. a)   Pr(X =x) = CHS")
                                          —                      ,x =0,1,2,3,4,5.
                                                       ‘ll

b) Pr(X =4)=               ‘) = 275/2,268,786
                         ce) 139/1,134,393     5h              2675/8796
                       5. a) 2/3      b) 2/3     c) 1/4      d) 7/2     e) 35/12
                       7. a) c = 1/15      b) 3/5      c) 7/3      d) 14/9       9. n = 200, p = 0.35

11. a) $(0.75)^8 = 0.100113$      b) $\binom{8}{3}(0.25)^3(0.75)^5 = 0.207642$
                         c) $\sum_{x=6}^{8} \binom{8}{x}(0.25)^x(0.75)^{8-x} = 0.004227$
                         d) 0.037139 (approximately)   e) 2    f) 1.5
                     13. c = 10       15. a) Pr(X = 1) = 1/5; Pr(X = 2) = 16/95; Pr(X = 3) = 12/19
                         b) 7/19    c) 19/35    d) 231/95 = 2.431579    e) 5824/9025 = 0.645319

17. a) $E(X(X - 1)) = \sum_{x=0}^{n} x(x - 1)Pr(X = x) = \sum_{x=2}^{n} x(x - 1)Pr(X = x)$

       $= \sum_{x=2}^{n} x(x - 1)\binom{n}{x} p^x q^{n-x} = \sum_{x=2}^{n} x(x - 1)\frac{n!}{x!(n - x)!} p^x q^{n-x}$

       $= \sum_{x=2}^{n} \frac{n!}{(x - 2)!(n - x)!} p^x q^{n-x} = p^2 n(n - 1) \sum_{x=2}^{n} \frac{(n - 2)!}{(x - 2)!(n - x)!} p^{x-2} q^{n-x}$

       $= p^2 n(n - 1) \sum_{y=0}^{n-2} \frac{(n - 2)!}{y!\,(n - (y + 2))!} p^y q^{n-(y+2)}$, substituting $x - 2 = y$,

       $= p^2 n(n - 1) \sum_{y=0}^{n-2} \frac{(n - 2)!}{y!\,[(n - 2) - y]!} p^y q^{(n-2)-y}$

       $= p^2 n(n - 1)(p + q)^{n-2}$, by the Binomial Theorem

       $= p^2 n(n - 1)(1)^{n-2} = p^2 n(n - 1) = n^2 p^2 - np^2$.

    b) $Var(X) = E(X^2) - [E(X)]^2 = [E(X(X - 1)) + E(X)] - [E(X)]^2 =
       [(n^2 p^2 - np^2) + np] - (np)^2 = n^2 p^2 - np^2 + np - n^2 p^2 = np - np^2 = np(1 - p) = npq$.
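
The two results of Exercise 17, $E(X(X - 1)) = n(n - 1)p^2$ and $Var(X) = npq$, can be confirmed numerically for one binomial distribution. This is only a sketch: the function name `binomial_moments` and the sample values n = 8, p = 1/4 (those of Exercise 11) are illustrative choices.

```python
from fractions import Fraction
from math import comb

def binomial_moments(n, p):
    """Return E(X), E(X(X-1)) and Var(X) for a binomial X, summed directly from the pmf."""
    q = 1 - p
    pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
    mean = sum(x * w for x, w in enumerate(pmf))
    second_factorial = sum(x * (x - 1) * w for x, w in enumerate(pmf))
    # Var(X) = [E(X(X-1)) + E(X)] - [E(X)]^2, as in part (b).
    variance = second_factorial + mean - mean**2
    return mean, second_factorial, variance

n, p = 8, Fraction(1, 4)
mean, fact2, var = binomial_moments(n, p)
print(mean, fact2, var)  # compare with np, n(n-1)p^2 and np(1-p)
```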

19. a) Pr(X = 2) = 1/4; Pr(X = 3) = 1/8; Pr(X = 4) = 1/4; Pr(X = 5) = 1/4;
                             Pr(X = 6) = 1/8       b) 31/8   c) 119/64
                         21. $E(X) = 4$; $\sigma_X = 1$

Supplementary
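Exercises—p. 189          1. Suppose that $(A - B) \subseteq C$ and $x \in A - C$. Then $x \in A$ but $x \notin C$. If $x \notin B$, then
                             $[x \in A \wedge x \notin B] \Rightarrow x \in (A - B) \subseteq C$. So now we have $x \notin C$ and $x \in C$. This contradiction
                             gives us $x \in B$, so $(A - C) \subseteq B$.
                                Conversely, if $(A - C) \subseteq B$, let $y \in A - B$. Then $y \in A$ but $y \notin B$. If $y \notin C$, then
                             $[y \in A \wedge y \notin C] \Rightarrow y \in (A - C) \subseteq B$. This contradiction (that is, $y \notin B$ and $y \in B$) yields
                             $y \in C$, so $(A - B) \subseteq C$.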
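                          3. a) The sets $\mathcal{U} = \{1, 2, 3\}$, $A = \{1, 2\}$, $B = \{1\}$, and $C = \{2\}$ provide a counterexample.
                             b) $A = A \cap \mathcal{U} = A \cap (C \cup \overline{C}) = (A \cap C) \cup (A \cap \overline{C})$
                                $= (A \cap C) \cup (A - C)$
                                $= (B \cap C) \cup (B - C) = (B \cap C) \cup (B \cap \overline{C}) = B \cap (C \cup \overline{C}) = B \cap \mathcal{U} = B$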
                          5. a) 126 (if teams wear different uniforms); 63 (if teams are not distinguishable)
                                112 (if teams wear different uniforms); 56 (if teams are not distinguishable)
                             b) $2^n - 2$; $(1/2)(2^n - 2)$. $2^n - 2 - 2n$; $(1/2)(2^n - 2 - 2n)$.
                          7. a) 128      b) $|A| = 8$
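                          9. Suppose that $(A \cap B) \cup C = A \cap (B \cup C)$ and that $x \in C$. Then
                             $x \in C \Rightarrow x \in (A \cap B) \cup C \Rightarrow x \in A \cap (B \cup C) \subseteq A$, so $x \in A$, and $C \subseteq A$.
                             Conversely, suppose that $C \subseteq A$.
                             (1) If $y \in (A \cap B) \cup C$, then $y \in A \cap B$ or $y \in C$.
                                 (i) $y \in A \cap B \Rightarrow y \in (A \cap B) \cup (A \cap C) \Rightarrow y \in A \cap (B \cup C)$.
                                 (ii) $y \in C \Rightarrow y \in A$, because $C \subseteq A$. Also, $y \in C \Rightarrow y \in B \cup C$. So $y \in A \cap (B \cup C)$.
                             In either case (i) or case (ii), we have $y \in A \cap (B \cup C)$, so $(A \cap B) \cup C \subseteq A \cap (B \cup C)$.
                             (2) Now let $z \in A \cap (B \cup C)$. Then $z \in A \cap (B \cup C) = (A \cap B) \cup (A \cap C) \subseteq (A \cap B) \cup C$,
                             since $A \cap C \subseteq C$.
                             From parts (1) and (2) it follows that $(A \cap B) \cup C = A \cap (B \cup C)$.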
                         11. a) $[0, 14/3]$     b) $\{0\} \cup (6, 12]$       c) $[0, +\infty)$     d) $\emptyset$

13. a)   A  B  |  $A \cap B$
         0  0  |  0
         0  1  |  0
         1  0  |  0
         1  1  |  1
    Since $A \subseteq B$, consider only rows 1, 2, and 4. For these rows, $A \cap B = A$.

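    c)   A  B  C  |  $(A \cap \overline{B}) \cup (B \cap \overline{C})$  |  $A \cap \overline{C}$
         0  0  0  |  0  |  0
         0  0  1  |  0  |  0
         0  1  0  |  1  |  0
         0  1  1  |  0  |  0
         1  0  0  |  1  |  1
         1  0  1  |  1  |  0
         1  1  0  |  1  |  1
         1  1  1  |  0  |  0
    For $C \subseteq B \subseteq A$, consider only rows 1, 5, 7, and 8. Here $(A \cap \overline{B}) \cup (B \cap \overline{C}) = A \cap \overline{C}$.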

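    d)   A  B  C  |  $A \,\triangle\, B$  |  $A \,\triangle\, C$  |  $B \,\triangle\, C$
         0  0  0  |  0  |  0  |  0
         0  0  1  |  0  |  1  |  1
         0  1  0  |  1  |  0  |  1
         0  1  1  |  1  |  1  |  0
         1  0  0  |  1  |  1  |  0
         1  0  1  |  1  |  0  |  1
         1  1  0  |  0  |  1  |  1
         1  1  1  |  0  |  0  |  0
    When $A \,\triangle\, B = C$, we consider rows 1, 4, 6, and 7. In these cases, $A \,\triangle\, C = B$ and $B \,\triangle\, C = A$.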
                              15,    a) (*')          (m<rtl)                    b) (ET!)              kent
                              17. a) 23.          b) 8            19, 7'5 — 3(3'5) +3                        24. (7) (2)/(G) = 0.3483
                              23.    a) Dito (7) (6°)=                        Lede ie
                                     bd (GV Lhe E)GT)] @ (QM) Leo O67]
                                     cit) [(9) + (AV) + OC) + OO) + OO] / Loko GG)]
                              25.    $A \cup B = [-2, 4]$, $A \cap B = \{3\}$   27. 135/512 = 0.263672
                              29.    $Pr(A \cap (B \cup C)) = Pr((A \cap B) \cup (A \cap C)) =
                                     Pr(A \cap B) + Pr(A \cap C) - Pr((A \cap B) \cap (A \cap C))$. Since A, B, C are independent and
                                     $(A \cap B) \cap (A \cap C) = (A \cap A) \cap (B \cap C) = A \cap B \cap C$, $Pr(A \cap (B \cup C)) = Pr(A)Pr(B) +
                                     Pr(A)Pr(C) - Pr(A)Pr(B)Pr(C) = Pr(A)[Pr(B) + Pr(C) - Pr(B)Pr(C)] =
                                     Pr(A)[Pr(B) + Pr(C) - Pr(B \cap C)] = Pr(A)Pr(B \cup C)$, so A and $B \cup C$ are independent.
                              31.    a) 0.99          b) $(0.99)^3 = 0.970299$                           33. 3/5
                              35.    $\binom{5}{3}(0.8)^3(0.2)^2 + \binom{5}{4}(0.8)^4(0.2) + \binom{5}{5}(0.8)^5 = 0.94208$
                              37.    675/2048             39. a) c = 1/50      b) 0.82      c) 13/41      d) 2.8      e) 1.64
                              41.    a) 3/(%)          ») [(")-3]/@)                        9 BAM -3/@)
                              43.    2/[m(m + 1)]
                              45.    a) Pr(X = 1) =7/16; Pr(X = 2) = 3/8; Pr(X = 3) = 3/16
                                     b) 7/4   c) $\sigma_X = 3/4$

Chapter 4
     Properties of the Integers: Mathematical Induction
Section 4.1—p. 208
                                1.    b) Since $1 \cdot 3 = (1)(2)(9)/6$, the result is true for n = 1. Assume the result is true for $n = k\ (\ge 1)$:
                                      $1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5 + \cdots + k(k + 2) = k(k + 1)(2k + 7)/6$. Then consider the case
                                      for $n = k + 1$: $[1 \cdot 3 + 2 \cdot 4 + \cdots + k(k + 2)] + (k + 1)(k + 3) = [k(k + 1)(2k + 7)/6] +
                                      (k + 1)(k + 3) = [(k + 1)/6][k(2k + 7) + 6(k + 3)] = (k + 1)(2k^2 + 13k + 18)/6 =
                                      (k + 1)(k + 2)(2k + 9)/6$. Hence the result follows for all $n \in \mathbf{Z}^+$ by the Principle of
                                      Mathematical Induction.
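                                      c) $S(n)$: $\sum_{i=1}^{n} \frac{1}{i(i + 1)} = \frac{n}{n + 1}$.

                                         $S(1)$: $\sum_{i=1}^{1} \frac{1}{i(i + 1)} = \frac{1}{1(2)} = \frac{1}{1 + 1}$, so $S(1)$ is true.

                                         Assume $S(k)$: $\sum_{i=1}^{k} \frac{1}{i(i + 1)} = \frac{k}{k + 1}$. Consider $S(k + 1)$.

                                         $\sum_{i=1}^{k+1} \frac{1}{i(i + 1)} = \sum_{i=1}^{k} \frac{1}{i(i + 1)} + \frac{1}{(k + 1)(k + 2)} = \frac{k}{k + 1} + \frac{1}{(k + 1)(k + 2)}$
                                         $= [k(k + 2) + 1]/[(k + 1)(k + 2)] = (k + 1)/(k + 2)$,

                                         so $S(k) \Rightarrow S(k + 1)$ and the result follows for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical
                                         Induction.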
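                     3. a) From $\sum_{i=1}^{n} i^3 + (n + 1)^3 = \sum_{i=0}^{n} (i + 1)^3 = \sum_{i=0}^{n} (i^3 + 3i^2 + 3i + 1) = \sum_{i=1}^{n} i^3 +
                        3\sum_{i=1}^{n} i^2 + 3\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$, we have
                        $(n + 1)^3 = 3\sum_{i=1}^{n} i^2 + 3\sum_{i=1}^{n} i + (n + 1)$. Consequently,

                        $3\sum_{i=1}^{n} i^2 = (n^3 + 3n^2 + 3n + 1) - 3[n(n + 1)/2] - n - 1$
                                  $= n^3 + (3/2)n^2 + (1/2)n$
                                  $= (1/2)[2n^3 + 3n^2 + n] = (1/2)n(2n^2 + 3n + 1)$
                                  $= (1/2)n(n + 1)(2n + 1)$, so

                        $\sum_{i=1}^{n} i^2 = (1/6)n(n + 1)(2n + 1)$ (as shown in Example 4.4).
                        b) From $\sum_{i=1}^{n} i^4 + (n + 1)^4 = \sum_{i=0}^{n} (i + 1)^4 = \sum_{i=0}^{n} (i^4 + 4i^3 + 6i^2 + 4i + 1) =
                        \sum_{i=1}^{n} i^4 + 4\sum_{i=1}^{n} i^3 + 6\sum_{i=1}^{n} i^2 + 4\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$, it follows that $(n + 1)^4 =
                        4\sum_{i=1}^{n} i^3 + 6\sum_{i=1}^{n} i^2 + 4\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$. Consequently,

                        $4\sum_{i=1}^{n} i^3 = (n + 1)^4 - 6[n(n + 1)(2n + 1)/6] - 4[n(n + 1)/2] - (n + 1)$
                                  $= n^4 + 4n^3 + 6n^2 + 4n + 1 - (2n^3 + 3n^2 + n) - (2n^2 + 2n) - (n + 1)$
                                  $= n^4 + 2n^3 + n^2 = n^2(n^2 + 2n + 1) = n^2(n + 1)^2$.

                        So $\sum_{i=1}^{n} i^3 = (1/4)n^2(n + 1)^2$ [as shown in part (d) of Exercise 1 for this section].
                           From $\sum_{i=1}^{n} i^5 + (n + 1)^5 = \sum_{i=0}^{n} (i + 1)^5 = \sum_{i=0}^{n} (i^5 + 5i^4 + 10i^3 + 10i^2 + 5i + 1) =
                        \sum_{i=1}^{n} i^5 + 5\sum_{i=1}^{n} i^4 + 10\sum_{i=1}^{n} i^3 + 10\sum_{i=1}^{n} i^2 + 5\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$, we have
                        $5\sum_{i=1}^{n} i^4 = (n + 1)^5 - (10/4)n^2(n + 1)^2 - (10/6)n(n + 1)(2n + 1) - (5/2)n(n + 1) -
                        (n + 1)$. So

                        $5\sum_{i=1}^{n} i^4 = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 - (5/2)n^4
                                  - 5n^3 - (5/2)n^2 - (10/3)n^3 - 5n^2 - (5/3)n - (5/2)n^2 - (5/2)n - n - 1$
                                  $= n^5 + (5/2)n^4 + (5/3)n^3 - (1/6)n$.

                        Consequently, $\sum_{i=1}^{n} i^4 = (1/30)n(n + 1)(6n^3 + 9n^2 + n - 1)$.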
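
The three closed forms obtained in this exercise can be compared with direct summation. The sketch below is only a numerical spot check over a finite range, not a replacement for the derivation; the helper names `power_sum` and `closed_forms` are illustrative.

```python
def power_sum(n, m):
    """Sum of i**m for i = 1, ..., n, computed term by term."""
    return sum(i**m for i in range(1, n + 1))

def closed_forms(n):
    """Closed forms for the sums of squares, cubes and fourth powers."""
    return {
        2: n * (n + 1) * (2 * n + 1) // 6,
        3: n**2 * (n + 1) ** 2 // 4,
        4: n * (n + 1) * (6 * n**3 + 9 * n**2 + n - 1) // 30,
    }

# Compare both sides for n = 0, 1, ..., 50 and each exponent.
all_match = all(
    power_sum(n, m) == closed_forms(n)[m] for n in range(51) for m in (2, 3, 4)
)
print(all_match)
```

Integer division is safe here because each closed form is always an integer.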
                     5. a) 7626      b) 627,874                   7. n = 10      9. a) 506     b) 12,144
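                    11. a) $\sum_{i=1}^{n} t_{2i} = \sum_{i=1}^{n} \frac{(2i)(2i + 1)}{2} = \sum_{i=1}^{n} (2i^2 + i) = 2\sum_{i=1}^{n} i^2 + \sum_{i=1}^{n} i =
                        2[(n)(n + 1)(2n + 1)/6] + [n(n + 1)/2] = [n(n + 1)(2n + 1)/3] + [n(n + 1)/2] =
                        n(n + 1)\left[\frac{2n + 1}{3} + \frac{1}{2}\right] = n(n + 1)\left[\frac{4n + 5}{6}\right] = n(n + 1)(4n + 5)/6$.
                        b) $\sum_{i=1}^{100} t_{2i} = 100(101)(405)/6 = 681{,}750$.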
                        c) begin
                              sum := 0
                              for i := 1 to 100 do
                                 sum := sum + (2 * i) * (2 * i + 1) / 2
                              print sum
                           end
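
For readers who want to run the program segment of part (c), a direct Python translation follows. It is a sketch rather than the text's own code: the variable is named `total` instead of `sum` (to avoid shadowing a Python built-in), and integer division is used because $(2i)(2i + 1)$ is always even.

```python
total = 0
for i in range(1, 101):
    # t_{2i} = (2i)(2i + 1)/2, the (2i)-th triangular number
    total += (2 * i) * (2 * i + 1) // 2
print(total)  # 681750, agreeing with part (b)
```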

13. a) There are 49 $(= 7^2)$ $2 \times 2$ squares and 36 $(= 6^2)$ $3 \times 3$ squares. In total there are
                        $1^2 + 2^2 + 3^2 + \cdots + 8^2 = (8)(8 + 1)(2 \cdot 8 + 1)/6 = (8)(9)(17)/6 = 204$ squares.
                        b) For each $1 \le k \le n$ the $n \times n$ chessboard contains $(n - k + 1)^2$ $k \times k$ squares. In total there
                        are $1^2 + 2^2 + 3^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6$ squares.
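
The count in Exercise 13 can be reproduced by brute force: each $k \times k$ square is identified by the position of its top-left corner. The function name `count_squares` below is illustrative.

```python
def count_squares(n):
    """Count every k x k square on an n x n board by enumerating top-left corners."""
    count = 0
    for k in range(1, n + 1):
        # A k x k square fits only if its corner lies in the first n - k + 1 rows and columns.
        for row in range(n - k + 1):
            for col in range(n - k + 1):
                count += 1
    return count

print(count_squares(8))  # 204
```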
                    15. For n = 5, $2^5 = 32 > 25 = 5^2$. Assume the result for $n = k\ (\ge 5)$: $2^k > k^2$. For $k > 3$,
                        $k(k - 2) > 1$, or $k^2 > 2k + 1$. Then $2^k > k^2 \Rightarrow 2^k + 2^k > k^2 + k^2 \Rightarrow 2^{k+1} > k^2 + k^2 > k^2 + (2k + 1)
                        = (k + 1)^2$. Hence the result is true for $n \ge 5$ by the Principle of Mathematical Induction.

17. b) Starting with n = 1 we find that

       $\sum_{j=1}^{1} jH_j = H_1 = 1 = [(2)(1)/2](3/2) - [(2)(1)/4] = [(2)(1)/2]H_2 - [(2)(1)/4]$.

    Assuming the truth of the given (open) statement for n = k, we have

       $\sum_{j=1}^{k} jH_j = [(k + 1)(k)/2]H_{k+1} - [(k + 1)(k)/4]$.

    For n = k + 1 we now find that

       $\sum_{j=1}^{k+1} jH_j = \sum_{j=1}^{k} jH_j + (k + 1)H_{k+1}$
          $= [(k + 1)(k)/2]H_{k+1} - [(k + 1)(k)/4] + (k + 1)H_{k+1}$
          $= (k + 1)[1 + (k/2)]H_{k+1} - [(k + 1)(k)/4]$
          $= (k + 1)[1 + (k/2)][H_{k+2} - (1/(k + 2))] - [(k + 1)(k)/4]$
          $= [(k + 2)(k + 1)/2]H_{k+2} - [(k + 1)(k + 2)]/[2(k + 2)] - [(k + 1)(k)/4]$
          $= [(k + 2)(k + 1)/2]H_{k+2} - (1/4)[2(k + 1) + k(k + 1)]$
          $= [(k + 2)(k + 1)/2]H_{k+2} - [(k + 2)(k + 1)/4]$.

    Consequently, by the Principle of Mathematical Induction, it follows that the given (open)
    statement is true for all $n \in \mathbf{Z}^+$.
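
The identity for harmonic numbers in Exercise 17(b) can also be spot-checked with exact rational arithmetic. The names `H`, `lhs`, and `rhs` in this sketch are illustrative; only the formula itself comes from the solution.

```python
from fractions import Fraction

def H(n):
    """n-th harmonic number 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

def lhs(n):
    """Left-hand side: the sum of j * H_j for j = 1, ..., n."""
    return sum(j * H(j) for j in range(1, n + 1))

def rhs(n):
    """Right-hand side: [(n+1)n/2] H_{n+1} - [(n+1)n/4]."""
    return Fraction((n + 1) * n, 2) * H(n + 1) - Fraction((n + 1) * n, 4)

print(all(lhs(n) == rhs(n) for n in range(1, 31)))
```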
19. Assume $S(k)$. For $S(k + 1)$, we find that $\sum_{i=1}^{k+1} i = ([k + (1/2)]^2/2) + (k + 1) =
    (k^2 + k + (1/4) + 2k + 2)/2 = [(k + 1)^2 + (k + 1) + (1/4)]/2 = [(k + 1) + (1/2)]^2/2$. So
    $S(k) \Rightarrow S(k + 1)$. However, we have no first value of k where $S(k)$ is true: for all
    $k \ge 1$, $\sum_{i=1}^{k} i = (k)(k + 1)/2$ and $(k)(k + 1)/2 = [k + (1/2)]^2/2 \Rightarrow 0 = 1/4$.
21. Let $S(n)$ denote the following (open) statement: For $x, n \in \mathbf{Z}^+$, if the program reaches the top
    of the while loop, after the two loop instructions are executed $n\ (> 0)$ times, then the value of
    the integer variable answer is $x(n!)$.
        First consider $S(1)$, the statement for the case where n = 1. Here the program (if it reaches
    the top of the while loop) will result in one execution of the while loop: x will be assigned the
    value $x \cdot 1 = x(1!)$, and the value of n will be decreased to 0. With the value of n equal to 0, the
    loop is not processed again and the value of the variable answer is $x(1!)$. Hence $S(1)$ is true.
        Now assume the truth for $n = k\ (\ge 1)$: For $x, k \in \mathbf{Z}^+$, if the program reaches the top of the
    while loop, then upon exiting the loop, the value of the variable answer is $x(k!)$. To establish
    the truth of $S(k + 1)$, if the program reaches the top of the while loop, then the following occur
    during the first execution:

       The value assigned to the variable x is $x(k + 1)$.
       The value of n is decreased to $(k + 1) - 1 = k$.

    But then we can apply the induction hypothesis to the integers $x(k + 1)$ and k, and upon exiting
    the while loop for these values, the value of the variable answer is $(x(k + 1))(k!) = x(k + 1)!$
        Consequently, $S(n)$ is true for all $n \ge 1$, and we have verified the correctness of this program
    segment by using the Principle of Mathematical Induction.
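
The program segment of Exercise 21 is not reproduced in this solution, so the following Python version is a hypothetical reconstruction. It assumes only what the proof describes: each pass through the loop multiplies x by n and then decreases n by 1, and the result is stored in `answer` once n reaches 0.

```python
from math import factorial

def run_segment(x, n):
    """Hypothetical reconstruction of the loop: returns the final value of answer."""
    while n != 0:
        x = x * n   # first loop instruction: x is assigned x * n
        n = n - 1   # second loop instruction: n is decreased by 1
    answer = x
    return answer

# S(n) claims that answer = x(n!) for all positive integers x and n.
print(all(run_segment(x, n) == x * factorial(n)
          for x in range(1, 6) for n in range(1, 8)))
```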
23. b)    24 = 5 + 5 + 7 + 7             25 = 5 + 5 + 5 + 5 + 5         26 = 5 + 7 + 7 + 7
          27 = 5 + 5 + 5 + 5 + 7         28 = 7 + 7 + 7 + 7
    Hence the result is true for all $24 \le n \le 28$. Assume the result true for 24, 25, 26, 27, 28, ..., k,
    and consider n = k + 1. Since $k + 1 \ge 29$, we may write $k + 1 = [(k + 1) - 5] + 5 =
    (k - 4) + 5$, where $k - 4$ can be expressed as a sum of 5's and 7's. Hence k + 1 can be
    expressed as such a sum and the result follows for all $n \ge 24$ by the alternative form of the
    Principle of Mathematical Induction.
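
A small search makes the claim of Exercise 23(b) concrete. The function `fives_and_sevens` below is an illustrative helper, not part of the text; it tries every possible number of 7's and reports the first decomposition it finds.

```python
def fives_and_sevens(n):
    """Return (a, b) with 5*a + 7*b == n and a, b >= 0, or None if no such pair exists."""
    for b in range(n // 7 + 1):
        if (n - 7 * b) % 5 == 0:
            return ((n - 7 * b) // 5, b)
    return None

print(fives_and_sevens(24))  # (2, 2), that is, 24 = 5 + 5 + 7 + 7
print(fives_and_sevens(23))  # None: 23 cannot be written this way
```

The value 23 shows why the induction has to start at 24.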

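                         25. $E(X) = \sum_{x} x\,Pr(X = x) = \sum_{x=1}^{n} x\left(\frac{1}{n}\right) = \left(\frac{1}{n}\right)\sum_{x=1}^{n} x = \left(\frac{1}{n}\right)\left[\frac{n(n + 1)}{2}\right] = \frac{n + 1}{2}$

                             $E(X^2) = \sum_{x} x^2\,Pr(X = x) = \sum_{x=1}^{n} x^2\left(\frac{1}{n}\right) = \left(\frac{1}{n}\right)\left[\frac{n(n + 1)(2n + 1)}{6}\right] = \frac{(n + 1)(2n + 1)}{6}$

                             $Var(X) = E(X^2) - [E(X)]^2 = \frac{(n + 1)(2n + 1)}{6} - \frac{(n + 1)^2}{4} = (n + 1)\left[\frac{2n + 1}{6} - \frac{n + 1}{4}\right]$
                                     $= (n + 1)\left[\frac{4n + 2 - (3n + 3)}{12}\right] = \frac{(n + 1)(n - 1)}{12} = \frac{n^2 - 1}{12}$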
                         27. Let $T = \{n \in \mathbf{Z}^+ \mid n \ge n_0$ and $S(n)$ is false$\}$. Since $S(n_0), S(n_0 + 1), S(n_0 + 2), \ldots, S(n_1)$ are
                             true, we know that $n_0, n_0 + 1, n_0 + 2, \ldots, n_1 \notin T$. If $T \neq \emptyset$, then T has a least element r,
                             because $T \subseteq \mathbf{Z}^+$. However, since $S(n_0), S(n_0 + 1), \ldots, S(r - 1)$ are true, it follows that $S(r)$
                             is true. Hence $T = \emptyset$ and the result follows.

Section 4.2—p. 219
                          1. a) $c_1 = 7$; $c_{n+1} = c_n + 7$, for $n \ge 1$.      b) $c_1 = 7$; $c_{n+1} = 7c_n$, for $n \ge 1$.
                             c) $c_1 = 10$; $c_{n+1} = c_n + 3$, for $n \ge 1$.      d) $c_1 = 7$; $c_{n+1} = c_n$, for $n \ge 1$.
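                          3. Let $T(n)$ denote the following statement: For $n \in \mathbf{Z}^+$, $n \ge 2$, and the statements
                               $p, q_1, q_2, \ldots, q_n$,

                               $p \vee (q_1 \wedge q_2 \wedge \cdots \wedge q_n) \Leftrightarrow (p \vee q_1) \wedge (p \vee q_2) \wedge \cdots \wedge (p \vee q_n)$.

                               The statement $T(2)$ is true by virtue of the Distributive Law of $\vee$ over $\wedge$. Assuming $T(k)$, for
                               some $k \ge 2$, we now examine the situation for the statements $p, q_1, q_2, \ldots, q_k, q_{k+1}$. We find
                               that
                                  $p \vee (q_1 \wedge q_2 \wedge \cdots \wedge q_k \wedge q_{k+1})$
                                     $\Leftrightarrow p \vee [(q_1 \wedge q_2 \wedge \cdots \wedge q_k) \wedge q_{k+1}]$
                                     $\Leftrightarrow [p \vee (q_1 \wedge q_2 \wedge \cdots \wedge q_k)] \wedge (p \vee q_{k+1})$
                                     $\Leftrightarrow [(p \vee q_1) \wedge (p \vee q_2) \wedge \cdots \wedge (p \vee q_k)] \wedge (p \vee q_{k+1})$
                                     $\Leftrightarrow (p \vee q_1) \wedge (p \vee q_2) \wedge \cdots \wedge (p \vee q_k) \wedge (p \vee q_{k+1})$.
                               It then follows by the Principle of Mathematical Induction that the statement $T(n)$ is true for all
                               $n \ge 2$.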
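                          5.   a)  (i) The intersection of $A_1$, $A_2$ is $A_1 \cap A_2$.
                                   (ii) The intersection of $A_1, A_2, \ldots, A_n, A_{n+1}$ is given by $A_1 \cap A_2 \cap \cdots \cap A_n \cap A_{n+1} =
                                   (A_1 \cap A_2 \cap \cdots \cap A_n) \cap A_{n+1}$, the intersection of the two sets $A_1 \cap A_2 \cap \cdots \cap A_n$ and $A_{n+1}$.
                               b) Let $S(n)$ denote the given (open) statement. Then the truth of $S(3)$ follows from the
                               Associative Law of $\cap$. Assuming $S(k)$ true for some $k \ge 3$, consider the case for k + 1 sets.
                                   (1) If r = k, then

                                       $(A_1 \cap A_2 \cap \cdots \cap A_k) \cap A_{k+1} = A_1 \cap A_2 \cap \cdots \cap A_k \cap A_{k+1}$,

                                   from the recursive definition given in part (a).
                                   (2) For $1 \le r < k$, we have

                                       $(A_1 \cap A_2 \cap \cdots \cap A_r) \cap (A_{r+1} \cap \cdots \cap A_k \cap A_{k+1})$
                                          $= (A_1 \cap A_2 \cap \cdots \cap A_r) \cap [(A_{r+1} \cap \cdots \cap A_k) \cap A_{k+1}]$
                                          $= [(A_1 \cap A_2 \cap \cdots \cap A_r) \cap (A_{r+1} \cap \cdots \cap A_k)] \cap A_{k+1}$
                                          $= (A_1 \cap A_2 \cap \cdots \cap A_r \cap A_{r+1} \cap \cdots \cap A_k) \cap A_{k+1}$
                                          $= A_1 \cap A_2 \cap \cdots \cap A_r \cap A_{r+1} \cap \cdots \cap A_k \cap A_{k+1}$,

                                   and by the Principle of Mathematical Induction, $S(n)$ is true for all $n \ge 3$ and all $1 \le r < n$.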

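 7. For n = 2, the truth of the result $A \cap (B_1 \cup B_2) = (A \cap B_1) \cup (A \cap B_2)$ follows by virtue of the
    Distributive Law of $\cap$ over $\cup$. Assuming the result for n = k, let us examine the case for the sets
    $A, B_1, B_2, \ldots, B_k, B_{k+1}$. We have $A \cap (B_1 \cup B_2 \cup \cdots \cup B_k \cup B_{k+1}) = A \cap [(B_1 \cup B_2 \cup \cdots
    \cup B_k) \cup B_{k+1}] = [A \cap (B_1 \cup B_2 \cup \cdots \cup B_k)] \cup (A \cap B_{k+1}) = [(A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup
    (A \cap B_k)] \cup (A \cap B_{k+1}) = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_k) \cup (A \cap B_{k+1})$. So the result
    is true for all $n \ge 2$, by the Principle of Mathematical Induction.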
 9. a) (i) For n = 2, the expression $x_1 x_2$ denotes the ordinary product of the real numbers $x_1$ and $x_2$.
       (ii) Let $n \in \mathbf{Z}^+$ with $n \ge 2$. For the real numbers $x_1, x_2, \ldots, x_n, x_{n+1}$, we define

          $x_1 x_2 \cdots x_n x_{n+1} = (x_1 x_2 \cdots x_n) x_{n+1}$,

       the product of the two real numbers $x_1 x_2 \cdots x_n$ and $x_{n+1}$.
    b) The result holds for n = 3 by the Associative Law of Multiplication (for real numbers). So
    $x_1(x_2 x_3) = (x_1 x_2)x_3$, and there is no ambiguity in writing $x_1 x_2 x_3$. Assuming the result true for
    some $k \ge 3$ and all $1 \le r < k$, let us examine the case for $k + 1\ (\ge 4)$ real numbers. We find that
    (1) if r = k, then $(x_1 x_2 \cdots x_r) x_{k+1} = x_1 x_2 \cdots x_k x_{k+1}$ by the recursive definition given in part
    (a); and (2) if $1 \le r < k$, then $(x_1 x_2 \cdots x_r)(x_{r+1} \cdots x_k x_{k+1}) = (x_1 x_2 \cdots x_r)((x_{r+1} \cdots x_k) x_{k+1})
    = ((x_1 x_2 \cdots x_r)(x_{r+1} \cdots x_k)) x_{k+1} = (x_1 x_2 \cdots x_k) x_{k+1} = x_1 x_2 \cdots x_k x_{k+1}$,
    so the result is true for all $n \ge 3$ and all $1 \le r < n$, by the Principle of Mathematical Induction.
11. Proof (By the Alternative Form of the Principle of Mathematical Induction): For n = 0, 1, 2 we
    have

       (n = 0)   $a_{0+2} = a_2 = 1 \ge (\sqrt{2})^0$;
       (n = 1)   $a_{1+2} = a_3 = a_2 + a_0 = 2 \ge \sqrt{2} = (\sqrt{2})^1$;   and
       (n = 2)   $a_{2+2} = a_4 = a_3 + a_1 = 2 + 1 = 3 \ge 2 = (\sqrt{2})^2$.

    Therefore, the result is true for these first three cases, and this gives us the basis step for the
    proof.
        Next, for some $k \ge 2$, we assume the result true for all n = 0, 1, 2, ..., k. When n = k + 1
    we find that

       $a_{(k+1)+2} = a_{k+3} = a_{k+2} + a_k \ge (\sqrt{2})^k + (\sqrt{2})^{k-2} = [(\sqrt{2})^2 + 1](\sqrt{2})^{k-2}$
          $= 3(\sqrt{2})^{k-2} = (3/2)(2)(\sqrt{2})^{k-2} = (3/2)(\sqrt{2})^k \ge (\sqrt{2})^{k+1}$,

    because $(3/2) = 1.5 > \sqrt{2}\ (\doteq 1.414)$. This provides the inductive step for the proof.
        From the basis and inductive steps it now follows by the alternative form of the Principle of
    Mathematical Induction that $a_{n+2} \ge (\sqrt{2})^n$ for all $n \in \mathbf{N}$.
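
The inequality of Exercise 11 can be checked numerically. The sketch below assumes the sequence is defined by $a_0 = a_1 = a_2 = 1$ and $a_n = a_{n-1} + a_{n-3}$ for $n \ge 3$; this definition is inferred from the basis cases of the proof and is not stated in the solution itself. To avoid floating-point square roots, the test compares $a_{n+2}^2$ with $2^n$, which is equivalent because both sides are positive.

```python
def a_sequence(count):
    """First `count` terms, assuming a_0 = a_1 = a_2 = 1 and a_n = a_(n-1) + a_(n-3)."""
    a = [1, 1, 1]
    while len(a) < count:
        a.append(a[-1] + a[-3])
    return a[:count]

a = a_sequence(60)
# a_(n+2) >= (sqrt 2)^n  is equivalent to  a_(n+2)^2 >= 2^n.
holds = all(a[n + 2] ** 2 >= 2**n for n in range(58))
print(a[:9], holds)
```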
13. Proof (By Mathematical Induction):
    Basis Step: When n = 1 we find that

       $\sum_{i=1}^{1} \frac{F_{i-1}}{2^i} = F_0/2 = 0 = 1 - (2/2) = 1 - \frac{F_3}{2} = 1 - \frac{F_{1+2}}{2^1}$,

    so the result holds in the first case.
    Inductive Step: Assuming the given (open) statement true for n = k, we have
    $\sum_{i=1}^{k} \frac{F_{i-1}}{2^i} = 1 - \frac{F_{k+2}}{2^k}$. When n = k + 1, we find that

       $\sum_{i=1}^{k+1} \frac{F_{i-1}}{2^i} = \sum_{i=1}^{k} \frac{F_{i-1}}{2^i} + \frac{F_k}{2^{k+1}} = 1 - \frac{F_{k+2}}{2^k} + \frac{F_k}{2^{k+1}}$
          $= 1 + (1/2^{k+1})[F_k - 2F_{k+2}] = 1 + (1/2^{k+1})[(F_k - F_{k+2}) - F_{k+2}]$
          $= 1 + (1/2^{k+1})[-F_{k+1} - F_{k+2}] = 1 - (1/2^{k+1})(F_{k+1} + F_{k+2}) = 1 - (F_{k+3}/2^{k+1})$.

    From the basis and inductive steps it follows from the Principle of Mathematical Induction that

       $\forall n \in \mathbf{Z}^+ \quad \sum_{i=1}^{n} (F_{i-1}/2^i) = 1 - (F_{n+2}/2^n)$.
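
The Fibonacci identity of Exercise 13 lends itself to an exact check with rational arithmetic. This is a sketch over a finite range; the helper names `fib`, `lhs`, and `rhs` are illustrative, and the usual convention $F_0 = 0$, $F_1 = 1$ is used.

```python
from fractions import Fraction

def fib(count):
    """First `count` Fibonacci numbers, starting from F_0 = 0 and F_1 = 1."""
    f = [0, 1]
    while len(f) < count:
        f.append(f[-1] + f[-2])
    return f

F = fib(45)

def lhs(n):
    """Sum of F_(i-1) / 2^i for i = 1, ..., n."""
    return sum(Fraction(F[i - 1], 2**i) for i in range(1, n + 1))

def rhs(n):
    """Closed form 1 - F_(n+2) / 2^n."""
    return 1 - Fraction(F[n + 2], 2**n)

print(all(lhs(n) == rhs(n) for n in range(1, 41)))
```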

15. Proof (By the Alternative Form of the Principle of Mathematical Induction): The result holds
    for $n = 0$ and $n = 1$ because

    $(n = 0)$: $5F_{0+2} = 5F_2 = 5(1) = 5 = 7 - 2 = L_4 - L_0 = L_{0+4} - L_0$; and
    $(n = 1)$: $5F_{1+2} = 5F_3 = 5(2) = 10 = 11 - 1 = L_5 - L_1 = L_{1+4} - L_1$.

    This establishes the basis step for the proof.
    Next we assume the induction hypothesis, that is, for some $k$ $(\ge 1)$, $5F_{n+2} = L_{n+4} - L_n$
    for all $n = 0, 1, 2, \ldots, k - 1, k$. It then follows that for $n = k + 1$,

    $$\begin{aligned}
    5F_{(k+1)+2} &= 5F_{k+3} = 5(F_{k+2} + F_{k+1}) = 5(F_{k+2} + F_{(k-1)+2}) = 5F_{k+2} + 5F_{(k-1)+2} \\
    &= (L_{k+4} - L_k) + (L_{(k-1)+4} - L_{k-1}) = (L_{k+4} - L_k) + (L_{k+3} - L_{k-1}) \\
    &= (L_{k+4} + L_{k+3}) - (L_k + L_{k-1}) = L_{k+5} - L_{k+1} = L_{(k+1)+4} - L_{k+1},
    \end{aligned}$$

    where we have used the recursive definitions of the Fibonacci numbers and the Lucas numbers
    to establish the second and eighth equalities.
    It then follows by the alternative form of the Principle of Mathematical Induction that

    $$\forall n \in \mathbf{N} \quad 5F_{n+2} = L_{n+4} - L_n.$$
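A short numerical check of the Fibonacci and Lucas identity in solution 15 follows. It is a sketch under the usual conventions $F_0 = 0$, $F_1 = 1$, $L_0 = 2$, $L_1 = 1$ (the values used in the basis step above); the function names `fib_lucas` and `check` are hypothetical names chosen for this example.

```python
def fib_lucas(n):
    """Return the pair (F_n, L_n), using F_0 = 0, F_1 = 1, L_0 = 2, L_1 = 1."""
    f0, f1 = 0, 1
    l0, l1 = 2, 1
    for _ in range(n):
        # Both sequences obey the same recurrence x_n = x_{n-1} + x_{n-2}.
        f0, f1 = f1, f0 + f1
        l0, l1 = l1, l0 + l1
    return f0, l0


def check(n):
    """True when 5 * F_{n+2} equals L_{n+4} - L_n."""
    return 5 * fib_lucas(n + 2)[0] == fib_lucas(n + 4)[1] - fib_lucas(n)[1]


print(all(check(n) for n in range(50)))
```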

17. a) Steps                                          Reasons
       1) $p, q, r, T_0$                              Part (1) of the definition
       2) $(p \lor q)$                                Step (1) and part (2-ii) of the definition
       3) $(\lnot r)$                                 Step (1) and part (2-i) of the definition
       4) $(T_0 \land (\lnot r))$                     Steps (1) and (3) and part (2-iii) of the definition
       5) $((p \lor q) \to (T_0 \land (\lnot r)))$    Steps (2) and (4) and part (2-iv) of the definition
19. a) $\binom{k}{2} + \binom{k+1}{2} = [k(k-1)/2] + [(k+1)k/2] = (k^2 - k + k^2 + k)/2 = k^2$.
    c) $\binom{k}{3} + 4\binom{k+1}{3} + \binom{k+2}{3} = [k(k-1)(k-2)/6] + 4[(k+1)(k)(k-1)/6] + [(k+2)(k+1)(k)/6]$
       $= (k/6)[(k-1)(k-2) + 4(k+1)(k-1) + (k+2)(k+1)] = (k/6)[6k^2] = k^3$.
    e) $k^4 = \binom{k}{4} + 11\binom{k+1}{4} + 11\binom{k+2}{4} + \binom{k+3}{4}$
    In general, $k^t = \sum_{r=0}^{t-1} a_{t,r}\binom{k+r}{t}$, where the $a_{t,r}$'s are the Eulerian numbers of Example 4.21.
    (The given summation formula is known as Worpitzky's identity.)

Section 4.3—p. 230
1.  e) If $a \mid x$ and $a \mid y$, then $x = ac$ and $y = ad$ for some $c, d \in \mathbf{Z}$. So $z = x - y = a(c - d)$, and
    $a \mid z$. The proofs for the other cases are similar.
    g) Follows from part (f) by the Principle of Mathematical Induction.
3.  Since $q$ is prime, its only positive divisors are 1 and $q$. With $p$ a prime, it follows that $p > 1$.
    Hence $p \mid q \Rightarrow p = q$.
5.  Proof (By the Contrapositive): Suppose that $a \mid b$ or $a \mid c$. If $a \mid b$, then $ak = b$ for some $k \in \mathbf{Z}$. But
    $ak = b \Rightarrow (ak)c = a(kc) = bc \Rightarrow a \mid bc$. A similar result is obtained if $a \mid c$.
7.  a) Let $a = 1$, $b = 5$, $c = 2$. Another example is $a = b = 5$, $c = 3$.
    b) Proof: $31 \mid (5a + 7b + 11c) \Rightarrow 31 \mid (10a + 14b + 22c)$. Also, $31 \mid (31a + 31b + 31c)$, so
    $31 \mid [(31a + 31b + 31c) - (10a + 14b + 22c)]$. Hence $31 \mid (21a + 17b + 9c)$.
9.  $[b \mid a$ and $b \mid (a + 2)] \Rightarrow b \mid [ax + (a + 2)y]$ for all $x, y \in \mathbf{Z}$. Let $x = -1$, $y = 1$. Then $b > 0$ and
    $b \mid 2$, so $b = 1$ or 2.
11. Let $a = 2m + 1$ and $b = 2n + 1$, for some $m, n \in \mathbf{N}$. Then $a^2 + b^2 = 4(m^2 + m + n^2 + n) + 2$,
    so $2 \mid (a^2 + b^2)$ but $4 \nmid (a^2 + b^2)$.
13. For $n = 0$ we have $7^n - 4^n = 7^0 - 4^0 = 1 - 1 = 0$, and $3 \mid 0$. So the result is true for this first
    case. Assuming the truth for $n = k$ $(\ge 0)$, we have $3 \mid (7^k - 4^k)$. Turning to the case for
    $n = k + 1$, we find that $7^{k+1} - 4^{k+1} = 7(7^k) - 4(4^k) = (3 + 4)(7^k) - 4(4^k) = 3(7^k) + 4(7^k - 4^k)$.
    Since $3 \mid 3$ and $3 \mid (7^k - 4^k)$ (by the induction hypothesis), it follows from part (f) of
    Theorem 4.3 that $3 \mid [3(7^k) + 4(7^k - 4^k)]$, that is, $3 \mid (7^{k+1} - 4^{k+1})$. It now follows by the
    Principle of Mathematical Induction that $3 \mid (7^n - 4^n)$ for all $n \in \mathbf{N}$.

15.     Base 10     Base 2             Base 16
    a)  22          10110              16
    b)  527         1000001111         20F
    c)  1234        10011010010        4D2
    d)  6923        1101100001011      1B0B

17.     Base 2      Base 10     Base 16
    a)  11001110    206         CE
    b)  00110001    49          31
    c)  11110000    240         F0
    d)  01010111    87          57

19. $n = 1, 2, 3, 6, 9, 18$

21.     Largest Integer        Smallest Integer
    a)  $7 = 2^3 - 1$          $-8 = -(2^3)$
    b)  $127 = 2^7 - 1$        $-128 = -(2^7)$
    c)  $2^{15} - 1$           $-(2^{15})$
    d)  $2^{31} - 1$           $-(2^{31})$
    e)  $2^{n-1} - 1$          $-(2^{n-1})$
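The entries in the tables for solutions 15, 17, and 21 can be regenerated mechanically. The sketch below uses Python's built-in `format` and `int` conversions; the names `table_15`, `table_17`, and `twos_complement_range` are assumptions of this example, and the last function assumes that the table in solution 21 refers to two's complement representation on a given number of bits.

```python
# Solution 15: base 10 to base 2 and base 16.
rows_15 = [22, 527, 1234, 6923]
table_15 = [(n, format(n, "b"), format(n, "X")) for n in rows_15]

# Solution 17: base 2 to base 10 and base 16 (two hex digits per byte).
rows_17 = ["11001110", "00110001", "11110000", "01010111"]
table_17 = [(bits, int(bits, 2), format(int(bits, 2), "02X")) for bits in rows_17]


def twos_complement_range(bits):
    """Largest and smallest integers in `bits`-bit two's complement (assumed setting)."""
    return 2 ** (bits - 1) - 1, -(2 ** (bits - 1))


for row in table_15 + table_17:
    print(row)
print(twos_complement_range(4), twos_complement_range(8))
```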

23. $ax = ay \Rightarrow ax - ay = 0 \Rightarrow a(x - y) = 0$. In the system of integers, if $b, c \in \mathbf{Z}$ and $bc = 0$,
    then $b = 0$ or $c = 0$. Since $a(x - y) = 0$ and $a \ne 0$, it follows that $(x - y) = 0$ and $x = y$.
29. a) Since $2 \mid 10^t$ for all $t \in \mathbf{Z}^+$, $2 \mid n$ if and only if $2 \mid r_0$.
    b) Follows from the fact that $4 \mid 10^t$ for $t \ge 2$.
    c) Follows from the fact that $8 \mid 10^t$ for $t \ge 3$. In general,

    $$2^{t+1} \mid n \text{ if and only if } 2^{t+1} \mid (r_t \cdot 10^t + \cdots + r_1 \cdot 10 + r_0).$$

Section 4.4-p. 236
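1.  a) $\gcd(1820, 231) = 7 = 1820(8) + 231(-63)$
    b) $\gcd(2597, 1369) = 1 = 2597(534) + 1369(-1013)$
    c) $\gcd(4001, 2689) = 1 = 4001(-1117) + 2689(1662)$
3.  $\gcd(a, b) = d \Rightarrow d = ax + by$, for some $x, y \in \mathbf{Z}$
    $\gcd(a, b) = d \Rightarrow a/d,\ b/d \in \mathbf{Z}$
    $1 = (a/d)x + (b/d)y \Rightarrow \gcd(a/d, b/d) = 1$.
5.  Proof: Since $c = \gcd(a, b)$ we have $a = cx$, $b = cy$ for some $x, y \in \mathbf{Z}^+$. So $ab = (cx)(cy) = c^2(xy)$,
    and $c^2$ divides $ab$.
7.  Let $\gcd(a, b) = h$ and $\gcd(b, d) = g$.
    $\gcd(a, b) = h \Rightarrow [h \mid a$ and $h \mid b] \Rightarrow h \mid (a \cdot 1 + bc) \Rightarrow h \mid d$.
    $[h \mid b$ and $h \mid d] \Rightarrow h \mid g$.
    $\gcd(b, d) = g \Rightarrow [g \mid b$ and $g \mid d] \Rightarrow g \mid (d \cdot 1 + b(-c)) \Rightarrow g \mid a$.
    $[g \mid b$, $g \mid a$, and $h = \gcd(a, b)] \Rightarrow g \mid h$. $h \mid g$, $g \mid h$, with $g, h \in \mathbf{Z}^+ \Rightarrow g = h$.
9.  a) If $c \in \mathbf{Z}^+$, then $c = \gcd(a, b)$ if (and only if)
       (1) $c \mid a$ and $c \mid b$; and
       (2) $\forall d \in \mathbf{Z}\ [(d \mid a) \land (d \mid b)] \Rightarrow d \mid c$.
    b) If $c \in \mathbf{Z}^+$, then $c \ne \gcd(a, b)$ if (and only if)
       (1) $c \nmid a$ or $c \nmid b$; or
       (2) $\exists d \in \mathbf{Z}\ [(d \mid a) \land (d \mid b) \land (d \nmid c)]$.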
11. $\gcd(a, b) = 1 \Rightarrow ax + by = 1$, for some $x, y \in \mathbf{Z}$. Then $acx + bcy = c$. $a \mid acx$, $a \mid bcy$ (because
    $a \mid bc$) $\Rightarrow a \mid c$.
13. We find that for any $n \in \mathbf{Z}^+$, $(5n + 3)(7) + (7n + 4)(-5) = (35n + 21) - (35n + 20) = 1$.
    Consequently, it follows that $\gcd(5n + 3, 7n + 4) = 1$, or $5n + 3$ and $7n + 4$ are relatively
    prime.
15. One $20 and 20 $50 chips; six $20 and 18 $50 chips; eleven $20 and 16 $50 chips.
17. There is no solution for $c \ne 12, 18$. For $c = 12$, the solutions are $x = 118 - 165k$, $y = -10 + 14k$,
    $k \in \mathbf{Z}$. For $c = 18$, the solutions are $x = 177 - 165k$, $y = -15 + 14k$, $k \in \mathbf{Z}$.
19. $b = 40{,}425$        21. $\gcd(n, n + 1) = 1$; $\operatorname{lcm}(n, n + 1) = n(n + 1)$
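The linear combinations listed in solution 1 of this section can be produced by the extended Euclidean algorithm. The following is an illustrative sketch; the function name `extended_gcd` is an assumption of this example, and the coefficients it returns need not coincide with the ones printed in the answer, because such coefficients are not unique.

```python
from math import gcd


def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder stays an integer combination of a and b.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


for a, b in [(1820, 231), (2597, 1369), (4001, 2689)]:
    g, x, y = extended_gcd(a, b)
    print(a, b, g, x, y, a * x + b * y == g)

# Solution 13: 5n + 3 and 7n + 4 are relatively prime for every positive n tried.
print(all(gcd(5 * n + 3, 7 * n + 4) == 1 for n in range(1, 1000)))
```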

Section 4.5—p. 240
1.  a) $2^2 \cdot 3^3 \cdot 5^3 \cdot 11$      b) $2^4 \cdot 3 \cdot 5^2 \cdot 7^2 \cdot 11^2$      c) $3^2 \cdot 5^3 \cdot 7^2 \cdot 11 \cdot 13$
3.  a) $m^2 = p_1^{2e_1} p_2^{2e_2} p_3^{2e_3} \cdots p_t^{2e_t}$      b) $m^3 = p_1^{3e_1} p_2^{3e_2} p_3^{3e_3} \cdots p_t^{3e_t}$

5.  (The proof is similar to that given in Example 4.41.) If not, we have $\sqrt{p} = a/b$, where
    $a, b \in \mathbf{Z}^+$ and $\gcd(a, b) = 1$. Then $\sqrt{p} = a/b \Rightarrow p = a^2/b^2 \Rightarrow pb^2 = a^2 \Rightarrow p \mid a^2 \Rightarrow p \mid a$ (by
    Lemma 4.2). Since $p \mid a$ we know that $a = pk$ for some $k \in \mathbf{Z}^+$, and $pb^2 = a^2 = (pk)^2 = p^2k^2$,
    or $b^2 = pk^2$. Hence $p \mid b^2$ and so $p \mid b$. But if $p \mid a$ and $p \mid b$, then $\gcd(a, b) \ge p > 1$,
    contradicting our earlier claim that $\gcd(a, b) = 1$.
7.  a) 96      b) 270     c) 144        9. 660      11. There are 252 possible values for $n$.
13. a) Proof: (i) Since $10 \mid a^2$ we have $5 \mid a^2$ and $2 \mid a^2$. Then by Lemma 4.2 it follows that $5 \mid a$ and
    $2 \mid a$. So $a = 5b$ for some $b \in \mathbf{Z}^+$. Further, since $2 \mid 5b$ we have $2 \mid 5$ or $2 \mid b$ (by Lemma 4.2).
    Since $2 \nmid 5$, we have $2 \mid b$, so $b = 2c$ for some $c \in \mathbf{Z}^+$.
    Consequently, $a = 5b = 5(2c) = 10c$, and 10 divides $a$.
       (ii) This result is false: let $a = 2$.
    b) We can generalize section (i) of part (a) by replacing 10 by an integer $n$ of the form
    $p_1 p_2 \cdots p_t$, a product of $t$ distinct primes. (So $n$ is a square-free integer, that is, no square
    greater than 1 divides $n$.)
15. 176,400         17. $n = 2 \cdot 3 \cdot 5^2 \cdot 7^2 = 7350$
19. a) 5        b) 7      c) 32      d) $7 + 7 + 5 + 25 + 20 + 20 = 84$      e) 84
21. 1061 (= 512 + 256 + 293)
23. a) From the Fundamental Theorem of Arithmetic $88{,}200 = 2^3 \cdot 3^2 \cdot 5^2 \cdot 7^2$. Consider the set
    $F = \{2^3, 3^2, 5^2, 7^2\}$. Each subset of $F$ determines a factorization $ab$ where $\gcd(a, b) = 1$.
    There are $2^4$ subsets and hence $2^4$ factorizations. Since order is not relevant, this number (of
    factorizations) reduces to $(1/2)2^4 = 2^3$. And since $1 < a < n$, $1 < b < n$, we remove the case
    for the empty subset of $F$ (or the subset $F$ itself). This yields $2^3 - 1$ such factorizations.
    b) Here $n = 2^3 \cdot 3^2 \cdot 5^2 \cdot 7^2 \cdot 11$ and there are $2^4 - 1$ such factorizations.
    c) Suppose that $n = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$, where $p_1, p_2, \ldots, p_k$ are $k$ distinct primes and
    $n_1, n_2, \ldots, n_k \ge 1$. The number of unordered factorizations of $n$ as $ab$, where
    $1 < a < n$, $1 < b < n$, and $\gcd(a, b) = 1$, is $2^{k-1} - 1$.
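25. Proof (By Mathematical Induction): For $n = 2$ we find that
    $\prod_{i=2}^{2} \left(1 - \frac{1}{i^2}\right) = \left(1 - \frac{1}{2^2}\right) = \left(1 - \frac{1}{4}\right) = 3/4 = (2 + 1)/(2 \cdot 2)$, so the result is true in this first
    case, and this establishes the basis step for our inductive proof. Next we assume the result true
    for some $k \in \mathbf{Z}^+$ where $k \ge 2$. This gives us $\prod_{i=2}^{k} \left(1 - \frac{1}{i^2}\right) = (k + 1)/(2k)$. When we consider
    the case for $n = k + 1$, we obtain the inductive step, for we find that

    $$\begin{aligned}
    \prod_{i=2}^{k+1} \left(1 - \frac{1}{i^2}\right) &= \left[\prod_{i=2}^{k} \left(1 - \frac{1}{i^2}\right)\right]\left(1 - \frac{1}{(k+1)^2}\right) = [(k + 1)/(2k)]\left[1 - \frac{1}{(k+1)^2}\right] \\
    &= \left[\frac{k+1}{2k}\right]\left[\frac{(k+1)^2 - 1}{(k+1)^2}\right] = \frac{k^2 + 2k}{2k(k+1)} = (k + 2)/(2(k + 1)) = ((k + 1) + 1)/(2(k + 1)).
    \end{aligned}$$

    The result now follows for all positive integers $n \ge 2$ by the Principle of Mathematical
    Induction.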
27. a) The positive divisors of 28 are 1, 2, 4, 7, 14, and 28, and $1 + 2 + 4 + 7 + 14 + 28 = 56 = 2(28)$,
    so 28 is a perfect integer. The positive divisors of 496 are 1, 2, 4, 8, 16, 31, 62, 124, 248,
    and 496, and $1 + 2 + 4 + 8 + 16 + 31 + 62 + 124 + 248 + 496 = 992 = 2(496)$, so 496 is a
    perfect integer.
    b) It follows from the Fundamental Theorem of Arithmetic that the divisors of $2^{m-1}(2^m - 1)$,
    for $2^m - 1$ prime, are $1, 2, 2^2, 2^3, \ldots, 2^{m-1}$, and $(2^m - 1), 2(2^m - 1), 2^2(2^m - 1),
    2^3(2^m - 1), \ldots,$ and $2^{m-1}(2^m - 1)$. These divisors sum to $[1 + 2 + 2^2 + 2^3 + \cdots + 2^{m-1}] +
    (2^m - 1)[1 + 2 + 2^2 + 2^3 + \cdots + 2^{m-1}] = (2^m - 1) + (2^m - 1)(2^m - 1) =
    (2^m - 1)[1 + (2^m - 1)] = 2^m(2^m - 1) = 2[2^{m-1}(2^m - 1)]$, so $2^{m-1}(2^m - 1)$ is a perfect integer.
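Two of the counts in this section lend themselves to brute-force confirmation: the number of coprime factorizations in solution 23 and the perfect integers in solution 27. The sketch below is a direct enumeration, not the counting argument used in the solutions; the function names `coprime_factorizations` and `is_perfect` are assumptions of this example.

```python
from math import gcd


def coprime_factorizations(n):
    """Count unordered pairs {a, b} with a*b == n, 1 < a, b < n, and gcd(a, b) == 1."""
    count = 0
    a = 2
    # Restricting to a*a <= n counts each unordered pair once.
    while a * a <= n:
        if n % a == 0 and gcd(a, n // a) == 1:
            count += 1
        a += 1
    return count


def is_perfect(n):
    """True when the positive divisors of n sum to 2n."""
    return sum(d for d in range(1, n + 1) if n % d == 0) == 2 * n


print(coprime_factorizations(88200), coprime_factorizations(970200))
print(is_perfect(28), is_perfect(496), is_perfect(2**6 * (2**7 - 1)))
```

The last call tests $2^{m-1}(2^m - 1)$ for $m = 7$, where $2^7 - 1 = 127$ is prime.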

Supplementary Exercises—p. 245
1.  $a + (a + d) + (a + 2d) + \cdots + (a + (n - 1)d) = na + [(n - 1)nd]/2$. For $n = 1$, $a = a + 0$,
    and the result is true in this case. Assuming that

    $$\sum_{i=1}^{k} [a + (i - 1)d] = ka + [(k - 1)kd]/2,$$

    we have

    $$\sum_{i=1}^{k+1} [a + (i - 1)d] = (ka + [(k - 1)kd]/2) + (a + kd) = (k + 1)a + [k(k + 1)d]/2,$$

    so the result follows for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical Induction.
3.  Conjecture: $\sum_{i=1}^{n} (-1)^{i+1} i^2 = (-1)^{n+1} \sum_{i=1}^{n} i$, for all $n \in \mathbf{Z}^+$.
    Proof (By the Principle of Mathematical Induction): If $n = 1$ the conjecture provides
    $\sum_{i=1}^{1} (-1)^{i+1} i^2 = (-1)^{1+1}(1)^2 = 1 = (-1)^{1+1}(1) = (-1)^{1+1} \sum_{i=1}^{1} i$, which is a true
    statement. And this establishes the basis step of the proof. To confirm the inductive step, we
    shall assume the truth of the result

    $$\sum_{i=1}^{k} (-1)^{i+1} i^2 = (-1)^{k+1} \sum_{i=1}^{k} i$$

    for some $k \ge 1$. When $n = k + 1$ we find that

    $$\begin{aligned}
    \sum_{i=1}^{k+1} (-1)^{i+1} i^2 &= \left(\sum_{i=1}^{k} (-1)^{i+1} i^2\right) + (-1)^{(k+1)+1}(k + 1)^2 = (-1)^{k+1} \sum_{i=1}^{k} i + (-1)^{k+2}(k + 1)^2 \\
    &= (-1)^{k+1}(k)(k + 1)/2 + (-1)^{k+2}(k + 1)^2 = (-1)^{k+2}[(k + 1)^2 - k(k + 1)/2] \\
    &= (-1)^{k+2}(1/2)[2(k + 1)^2 - k(k + 1)] = (-1)^{k+2}(1/2)[2k^2 + 4k + 2 - k^2 - k] \\
    &= (-1)^{k+2}(1/2)[k^2 + 3k + 2] = (-1)^{k+2}(1/2)(k + 1)(k + 2) \\
    &= (-1)^{k+2} \sum_{i=1}^{k+1} i,
    \end{aligned}$$

    so the truth of the result at $n = k$ implies the truth at $n = k + 1$, and we have the inductive
    step. It then follows by the Principle of Mathematical Induction that

    $$\sum_{i=1}^{n} (-1)^{i+1} i^2 = (-1)^{n+1} \sum_{i=1}^{n} i,$$

    for all $n \in \mathbf{Z}^+$.
5.  a)  $n$   $n^2 + n + 41$        $n$   $n^2 + n + 41$        $n$   $n^2 + n + 41$
        1         43                4         61                7         97
        2         47                5         71                8         113
        3         53                6         83                9         131
    b) For $n = 39$, $n^2 + n + 41 = 1601$, a prime. But for $n = 40$, $n^2 + n + 41 = (41)^2$, so
    $S(39) \not\Rightarrow S(40)$.
7.  a) For $n = 0$, $2^{2n+1} + 1 = 2 + 1 = 3$, so the result is true in this first case. Assuming that 3
    divides $2^{2n+1} + 1$ for $n = k$ $(\ge 0) \in \mathbf{N}$, consider the case of $n = k + 1$. Since $2^{2(k+1)+1} + 1 =
    2^{2k+3} + 1 = 4(2^{2k+1}) + 1 = 4(2^{2k+1} + 1) - 3$, and 3 divides both $2^{2k+1} + 1$ and 3, it follows
    that 3 divides $2^{2(k+1)+1} + 1$. Consequently, the result is true for $n = k + 1$ whenever it is true for
    $n = k$. So by the Principle of Mathematical Induction, the result follows for all $n \in \mathbf{N}$.
9.  $x = y = z = 0$ and $x = 2$, $y = 5$, $z = 5$

11. For $n = 2$ we find that $2^n = 4 < 6 = \binom{4}{2} < 16 = 4^2$, so the (open) statement is true in this first
    case. Assuming the result true for $n = k \ge 2$, that is, $2^k < \binom{2k}{k} < 4^k$, we now consider what
    happens for $n = k + 1$. Here we find that

    $$\binom{2(k+1)}{k+1} = \binom{2k+2}{k+1} = \left[\frac{(2k+2)(2k+1)}{(k+1)(k+1)}\right]\binom{2k}{k} = 2[(2k + 1)/(k + 1)]\binom{2k}{k} > 2[(2k + 1)/(k + 1)]2^k > 2^{k+1},$$

    since $(2k + 1)/(k + 1) = [(k + 1) + k]/(k + 1) > 1$. In addition, $[(k + 1) + k]/(k + 1) < 2$, so
    $\binom{2k+2}{k+1} = 2[(2k + 1)/(k + 1)]\binom{2k}{k} < (2)(2)\binom{2k}{k} < 4^{k+1}$. Consequently, the result is true for all
    $n \ge 2$, by the Principle of Mathematical Induction.
13. First we observe that the result is true for all $n \in \mathbf{Z}^+$ where $64 \le n \le 68$. This follows from the
    calculations

        64 = 2(17) + 6(5)      65 = 13(5)           66 = 3(17) + 3(5)
        67 = 1(17) + 10(5)     68 = 4(17)

    Now assume the result is true for all $n$ where $64 \le n \le k$, for some $k \ge 68$, and consider the integer $k + 1$. Then
    $k + 1 = (k - 4) + 5$, and since $64 \le k - 4 \le k$, we can write $k - 4 = a(17) + b(5)$ for some
    $a, b \in \mathbf{N}$. Consequently, $k + 1 = a(17) + (b + 1)(5)$, and the result follows for all $n \ge 64$, by
    the alternative form of the Principle of Mathematical Induction.
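15. a) $r = r_0 + r_1 \cdot 10 + r_2 \cdot 10^2 + \cdots + r_n \cdot 10^n$
       $= r_0 + r_1(9) + r_1 + r_2(99) + r_2 + \cdots + r_n(\underbrace{99\ldots9}_{n\ 9\text{'s}}) + r_n$
       $= [9r_1 + 99r_2 + \cdots + (\underbrace{99\ldots9}_{n\ 9\text{'s}})r_n] + (r_0 + r_1 + r_2 + \cdots + r_n)$.
    Hence $9 \mid r$ if and only if $9 \mid (r_0 + r_1 + r_2 + \cdots + r_n)$.
    c) $3 \mid t$ for $x = 1$ or 4 or 7; $9 \mid t$ for $x = 7$.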
17. a) (5)         b) (4)
                   19. a) 1,4,9        b) 1,4,9, 16,..., k, where k is the largest square less than or equal to n.
21. a) For all $n \in \mathbf{Z}^+$, $n \ge 3$, $1 + 2 + 3 + \cdots + n = n(n + 1)/2$. If $\{1, 2, 3, \ldots, n\} = A \cup B$ with
    $s_A = s_B$, then $2s_A = n(n + 1)/2$, or $4s_A = n(n + 1)$. Since $4 \mid n(n + 1)$ and $\gcd(n, n + 1) = 1$,
    then either $4 \mid n$ or $4 \mid (n + 1)$.
    b) Here we are verifying the converse of our result in part (a).
       (i) If $4 \mid n$, we write $n = 4k$. Here we have
           $\{1, 2, 3, \ldots, k, k + 1, \ldots, 3k, 3k + 1, \ldots, 4k\} = A \cup B$ where $A = \{1, 2, 3, \ldots, k,
           3k + 1, 3k + 2, \ldots, 4k - 1, 4k\}$ and $B = \{k + 1, k + 2, \ldots, 2k, 2k + 1, \ldots, 3k - 1, 3k\}$,
           with $s_A = (1 + 2 + 3 + \cdots + k) + [(3k + 1) + (3k + 2) + \cdots + (3k + k)] =
           [k(k + 1)/2] + k(3k) + [k(k + 1)/2] = k(k + 1) + 3k^2 = 4k^2 + k$, and
           $s_B = [(k + 1) + (k + 2) + \cdots + (k + k)] + [(2k + 1) + (2k + 2) + \cdots + (2k + k)]
           = k(k) + [k(k + 1)/2] + k(2k) + [k(k + 1)/2] = 3k^2 + k(k + 1) = 4k^2 + k$.
       (ii) Now we consider the case where $n + 1 = 4k$. Then $n = 4k - 1$ and we have
           $\{1, 2, 3, \ldots, k - 1, k, \ldots, 3k - 1, 3k, \ldots, 4k - 2, 4k - 1\} = A \cup B$, with
           $A = \{1, 2, 3, \ldots, k - 1, 3k, 3k + 1, \ldots, 4k - 1\}$ and
           $B = \{k, k + 1, \ldots, 2k - 1, 2k, 2k + 1, \ldots, 3k - 1\}$. Here we find
           $s_A = [1 + 2 + 3 + \cdots + (k - 1)] + [3k + (3k + 1) + \cdots + (3k + (k - 1))] =
           [(k - 1)(k)/2] + k(3k) + [(k - 1)(k)/2] = 3k^2 + k^2 - k = 4k^2 - k$, and
           $s_B = [k + (k + 1) + \cdots + (k + (k - 1))] + [2k + (2k + 1) + \cdots + (2k + (k - 1))]
           = k^2 + [(k - 1)(k)/2] + k(2k) + [(k - 1)(k)/2] = 3k^2 + (k - 1)k = 4k^2 - k$.
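The construction in part (b) of solution 21 can be carried out explicitly. The sketch below builds the sets $A$ and $B$ described in cases (i) and (ii) and compares their sums; the function name `split` and the choice to raise `ValueError` when neither case applies are assumptions of this example.

```python
def split(n):
    """Return (A, B) partitioning {1, ..., n} as in solution 21(b).

    Requires 4 | n (case i) or 4 | (n + 1) (case ii).
    """
    if n % 4 == 0:
        k = n // 4
        # Case (i): A takes the first k and the last k integers.
        a = list(range(1, k + 1)) + list(range(3 * k + 1, 4 * k + 1))
        b = list(range(k + 1, 3 * k + 1))
    elif (n + 1) % 4 == 0:
        k = (n + 1) // 4
        # Case (ii): A takes the first k - 1 and the last k integers.
        a = list(range(1, k)) + list(range(3 * k, 4 * k))
        b = list(range(k, 3 * k))
    else:
        raise ValueError("no equal-sum split exists for this n")
    return a, b


for n in (3, 4, 7, 8, 11, 12):
    a, b = split(n)
    print(n, a, b, sum(a) == sum(b))
```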
23. a) The result is true for $a = 1$, so consider $a > 1$. From the Fundamental Theorem of Arithmetic
    we can write $a = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$, where $p_1, p_2, \ldots, p_t$ are $t$ distinct primes and $e_i > 0$, for all
    $1 \le i \le t$. Since $a^2 \mid b^2$ it follows that $p_i^{2e_i} \mid b^2$ for all $1 \le i \le t$. So $b^2 = p_1^{2f_1} p_2^{2f_2} \cdots p_t^{2f_t} c^2$,
    where $f_i \ge e_i$ for all $1 \le i \le t$, and $b = p_1^{f_1} p_2^{f_2} \cdots p_t^{f_t} c = a(p_1^{f_1 - e_1} p_2^{f_2 - e_2} \cdots p_t^{f_t - e_t})c$, where
    $f_i - e_i \ge 0$ for all $1 \le i \le t$. Consequently, $a \mid b$.
    b) This result is not necessarily true! Let $a = 8$ and $b = 4$. Then $a^2$ (= 64) divides $b^3$ (= 64),
    but $a$ does not divide $b$.

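25. a) Recall that

    $$\begin{aligned}
    a^3 + b^3 &= (a + b)(a^2 - ab + b^2) \\
    a^5 + b^5 &= (a + b)(a^4 - a^3b + a^2b^2 - ab^3 + b^4) \\
    &\ \ \vdots \\
    a^p + b^p &= (a + b)(a^{p-1} - a^{p-2}b + \cdots + b^{p-1}) = (a + b) \sum_{i=1}^{p} a^{p-i}(-b)^{i-1},
    \end{aligned}$$

    for $p$ an odd prime.
    Since $k$ is not a power of 2 we write $k = r \cdot p$, where $p$ is an odd prime and $r \ge 1$. Then
    $a^k + b^k = (a^r)^p + (b^r)^p = (a^r + b^r) \sum_{i=1}^{p} a^{r(p-i)}(-b^r)^{i-1}$, so $a^k + b^k$ is composite.
    b) Here $n$ is not a power of 2. If, in addition, $n$ is not prime, then $n = r \cdot p$ where $p$ is an odd
    prime. Then $2^n + 1 = 2^n + 1^n = 2^{r \cdot p} + 1^{r \cdot p} = (2^r + 1^r) \sum_{i=1}^{p} 2^{r(p-i)}(-1^r)^{i-1} =
    (2^r + 1) \sum_{i=1}^{p} (-1)^{i-1} 2^{r(p-i)}$, so $2^n + 1$ is composite, not prime.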
27. Proof: For $n = 0$ we find that $F_0 = 0 < 1 = (5/3)^0$, and for $n = 1$ we have $F_1 = 1 < (5/3) =
    (5/3)^1$. Consequently, the given property is true in these first two cases (and this provides the
    basis step of the proof).
        Assuming that this property is true for $n = 0, 1, 2, \ldots, k - 1, k$, where $k \ge 1$, we now
    examine what happens at $n = k + 1$. Here we find that

    $$\begin{aligned}
    F_{k+1} = F_k + F_{k-1} &< (5/3)^k + (5/3)^{k-1} = (5/3)^{k-1}[(5/3) + 1] = (5/3)^{k-1}(8/3) \\
    &= (5/3)^{k-1}(24/9) < (5/3)^{k-1}(25/9) = (5/3)^{k-1}(5/3)^2 = (5/3)^{k+1}.
    \end{aligned}$$

    It then follows from the alternative form of the Principle of Mathematical Induction that
    $F_n < (5/3)^n$ for all $n \in \mathbf{N}$.
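29. a) There are $9 \cdot 10 \cdot 10 = 900$ such palindromes and their sum is

    $$\begin{aligned}
    \sum_{a=1}^{9} \sum_{b=0}^{9} \sum_{c=0}^{9} abcba &= \sum_{a=1}^{9} \sum_{b=0}^{9} \sum_{c=0}^{9} (10001a + 1010b + 100c) \\
    &= \sum_{a=1}^{9} \sum_{b=0}^{9} [10(10001a + 1010b) + 100(9 \cdot 10/2)] \\
    &= \sum_{a=1}^{9} \sum_{b=0}^{9} (100010a + 10100b + 4500) \\
    &= \sum_{a=1}^{9} [10(100010a) + 10100(9 \cdot 10/2) + 10(4500)] \\
    &= 1000100 \sum_{a=1}^{9} a + 9(454500) + 9(45000) \\
    &= 1000100(9 \cdot 10/2) + 4090500 + 405000 = 49{,}500{,}000.
    \end{aligned}$$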
    b) begin
         sum := 0
         for a := 1 to 9 do
           for b := 0 to 9 do
             for c := 0 to 9 do
               sum := sum + 10001 * a + 1010 * b + 100 * c
         print sum
       end
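The program segment in part (b) translates directly into a runnable form. The following Python sketch mirrors the three nested loops and adds an independent check that enumerates the five-digit integers and tests each digit string; the variable names `total`, `count`, and `brute` are assumptions of this example.

```python
# Direct translation of the nested loops: abcba = 10001a + 1010b + 100c.
total = 0
count = 0
for a in range(1, 10):
    for b in range(10):
        for c in range(10):
            total += 10001 * a + 1010 * b + 100 * c
            count += 1

# Independent check: sum every five-digit integer that reads the same reversed.
brute = sum(n for n in range(10000, 100000) if str(n) == str(n)[::-1])

print(count, total, brute)
```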

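31. Proof: Suppose that $7 \mid n$. We see that $7 \mid n \Rightarrow 7 \mid (n - 21u) \Rightarrow 7 \mid [(n - u) - 20u] \Rightarrow
    7 \mid [10(\frac{n-u}{10}) - 20u] \Rightarrow 7 \mid [10(\frac{n-u}{10} - 2u)] \Rightarrow 7 \mid (\frac{n-u}{10} - 2u)$, by Lemma 4.2 since $\gcd(7, 10) = 1$.
    [Note: $\frac{n-u}{10} \in \mathbf{Z}^+$ since the units digit of $n - u$ is 0.] Conversely, if $7 \mid (\frac{n-u}{10} - 2u)$, then since
    $\frac{n-u}{10} - 2u = \frac{n - 21u}{10}$ we find that $7 \mid (\frac{n - 21u}{10}) \Rightarrow 7 \cdot 10 \cdot x = n - 21u$, for some $x \in \mathbf{Z}^+$. Since $7 \mid 7$
    and $7 \mid 21$, it then follows that $7 \mid n$ by part (e) of Theorem 4.3.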
33. If Catrina’s selection includes any of 0, 2, 4, 6, 8, then at least two of the resulting three-digit
    integers will have an even unit’s digit, and be even    — hence, not prime. Should her selection
    include 5, then two of the resulting three-digit integers will have 5 as their unit’s digit; these
    three-digit integers are then divisible by 5 and so, they are not prime. Consequently, to
    complete the proof we need to consider the four selections of size 3 that Catrina can make from
    {1, 3, 7, 9}. The following provides the selections — each with a three-digit integer that is not
    prime.

    1) $\{1, 3, 7\}$: $713 = 23 \cdot 31$          2) $\{1, 3, 9\}$: $913 = 11 \cdot 83$
    3) $\{1, 7, 9\}$: $917 = 7 \cdot 131$          4) $\{3, 7, 9\}$: $793 = 13 \cdot 61$

35. Let $x$ denote the integer Barbara erased. The sum of the integers $1, 2, 3, \ldots, x - 1, x + 1,
    x + 2, \ldots, n$ is $[n(n + 1)/2] - x$, so $[[n(n + 1)/2] - x]/(n - 1) = 35\frac{7}{17}$. Consequently,
    $[n(n + 1)/2] - x = (35\frac{7}{17})(n - 1) = (602/17)(n - 1)$. Since $[n(n + 1)/2] - x \in \mathbf{Z}^+$, it
    follows that $(602/17)(n - 1) \in \mathbf{Z}^+$. Therefore, from Lemma 4.2, we find that $17 \mid (n - 1)$
    because $17 \nmid 602$. For $n = 1, 18, 35, 52$ we have:

         $n$      $x = [n(n + 1)/2] - (602/17)(n - 1)$
          1                       1
         18                    -431
         35                    -574
         52                    -428

    When $n = 69$, we find that $x = 7$ [and $(\sum_{i=1}^{69} i - 7)/68 = 602/17 = 35\frac{7}{17}$].
    For $n = 69 + 17k$, $k \ge 1$, we have

    $$\begin{aligned}
    x &= [(69 + 17k)(70 + 17k)/2] - (602/17)[68 + 17k] \\
      &= 7 + (k/2)[1159 + 289k] \\
      &= [7 + (1159k/2)] + (289k^2)/2 > n.
    \end{aligned}$$

    Hence the answer is unique: namely, $n = 69$ and $x = 7$.
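The uniqueness claim in solution 35 can be checked by a bounded search. The sketch below solves for $x$ exactly with rational arithmetic for each candidate $n$ and keeps the cases where $x$ is an integer between 1 and $n$. The names `target` and `solutions` and the search bound passed in are assumptions of this example; a finite search supports the algebraic argument but does not replace it.

```python
from fractions import Fraction

# The stated average 35 7/17 of the remaining integers, written as 602/17.
target = Fraction(602, 17)


def solutions(max_n):
    """All (n, x) with 1 <= x <= n <= max_n whose remaining average equals target."""
    found = []
    for n in range(2, max_n + 1):
        total = n * (n + 1) // 2
        # Solve (total - x) / (n - 1) == target for x.
        x = total - target * (n - 1)
        if x.denominator == 1 and 1 <= x <= n:
            found.append((n, int(x)))
    return found


print(solutions(2000))
```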
37. $(1 + m_1)(1 + m_2)(1 + m_3)$, where $m_i = \min\{e_i, f_i\}$ for $1 \le i \le 3$.

Chapter 5
                     Relations and Functions

Section 5.1—p. 252
1.  $A \times B = \{(1, 2), (2, 2), (3, 2), (4, 2), (1, 5), (2, 5), (3, 5), (4, 5)\}$
    $B \times A = \{(2, 1), (2, 2), (2, 3), (2, 4), (5, 1), (5, 2), (5, 3), (5, 4)\}$
    $A \cup (B \times C) = \{1, 2, 3, 4, (2, 3), (2, 4), (2, 7), (5, 3), (5, 4), (5, 7)\}$
    $(A \cup B) \times C = \{(1, 3), (2, 3), (3, 3), (4, 3), (5, 3), (1, 4), (2, 4), (3, 4), (4, 4), (5, 4),
                  (1, 7), (2, 7), (3, 7), (4, 7), (5, 7)\} = (A \times C) \cup (B \times C)$
3.  a) 9       b) $2^9$       c) $2^9$       d) $2^7$       e) $\binom{9}{5}$       f) $\binom{9}{7} + \binom{9}{8} + \binom{9}{9}$
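5.  a) Assume that $A \times B \subseteq C \times D$ and let $a \in A$ and $b \in B$. Then $(a, b) \in A \times B$, and since
    $A \times B \subseteq C \times D$, we have $(a, b) \in C \times D$. But $(a, b) \in C \times D \Rightarrow a \in C$ and $b \in D$. Hence
    $a \in A \Rightarrow a \in C$, so $A \subseteq C$, and $b \in B \Rightarrow b \in D$, so $B \subseteq D$.
       Conversely, suppose that $A \subseteq C$ and $B \subseteq D$, and that $(x, y) \in A \times B$. Then
    $(x, y) \in A \times B \Rightarrow x \in A$ and $y \in B \Rightarrow x \in C$ (since $A \subseteq C$) and $y \in D$ (since $B \subseteq D$)
    $\Rightarrow (x, y) \in C \times D$. Consequently, $A \times B \subseteq C \times D$.
    b) Even if any of the sets $A$, $B$, $C$, $D$ is empty, we still find that

    $$[(A \subseteq C) \land (B \subseteq D)] \Rightarrow [A \times B \subseteq C \times D].$$

    However, the converse need not hold. For example, let $A = \emptyset$, $B = \{1, 2\}$, $C = \{1, 2\}$, and
    $D = \{1\}$. Then $A \times B = \emptyset$; if not, there exists an ordered pair $(x, y)$ in $A \times B$, and this
    means that the empty set $A$ contains an element $x$. And so $A \times B = \emptyset \subseteq C \times D$, but
    $B = \{1, 2\} \not\subseteq \{1\} = D$.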
7.  a) 2”      b) If $|A| = m$, $|B| = n$, for $m, n \in \mathbf{N}$, then there are $2^{mn}$ elements in $\mathcal{P}(A \times B)$.

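9.  c) $(x, y) \in (A \cap B) \times C \iff x \in A \cap B$ and $y \in C \iff (x \in A$ and $x \in B)$ and $y \in C \iff
    (x \in A$ and $y \in C)$ and $(x \in B$ and $y \in C) \iff (x, y) \in A \times C$ and $(x, y) \in B \times C \iff
    (x, y) \in (A \times C) \cap (B \times C)$
11. $(x, y) \in A \times (B - C) \iff x \in A$ and $y \in B - C \iff x \in A$ and $(y \in B$ and $y \notin C) \iff
    (x \in A$ and $y \in B)$ and $(x \in A$ and $y \notin C) \iff (x, y) \in A \times B$ and $(x, y) \notin A \times C \iff
    (x, y) \in (A \times B) - (A \times C)$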
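13. a) (1) $(0, 2) \in \mathcal{R}$; and
       (2) If $(a, b) \in \mathcal{R}$, then $(a + 1, b + 5) \in \mathcal{R}$.
    b) From part (1) of the definition we have $(0, 2) \in \mathcal{R}$. By part (2) of the definition we then find
    that
         (i) $(0, 2) \in \mathcal{R} \Rightarrow (0 + 1, 2 + 5) = (1, 7) \in \mathcal{R}$;
        (ii) $(1, 7) \in \mathcal{R} \Rightarrow (1 + 1, 7 + 5) = (2, 12) \in \mathcal{R}$;
       (iii) $(2, 12) \in \mathcal{R} \Rightarrow (2 + 1, 12 + 5) = (3, 17) \in \mathcal{R}$; and
        (iv) $(3, 17) \in \mathcal{R} \Rightarrow (3 + 1, 17 + 5) = (4, 22) \in \mathcal{R}$.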
Section 5.2—p. 258
1.  a) Function; range = $\{7, 8, 11, 16, 23, \ldots\}$      b) Relation, not a function
    c) Function; range = $\mathbf{R}$      d) and e) Relation, not a function
3.  a) (1) $\{(1, x), (2, x), (3, x), (4, x)\}$          (2) $\{(1, y), (2, y), (3, y), (4, y)\}$
       (3) $\{(1, z), (2, z), (3, z), (4, z)\}$          (4) $\{(1, x), (2, y), (3, x), (4, y)\}$
       (5) $\{(1, x), (2, y), (3, z), (4, x)\}$
    b) $3^4$      c) 0      d) $4^3$      e) 24      f) $3^3$      g) $3^2$      h) $3^2$
5. a) {(1, 3)}   b) {(−7/2, −21/2)}
   c) {(−8, −15)}   d) $\mathbf{R}^2 - \{(-7/2, -21/2)\} = \{(x, y) \mid x \ne -7/2 \text{ or } y \ne -21/2\}$
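7. a) $\lfloor 2.3 - 1.6 \rfloor = \lfloor 0.7 \rfloor = 0$   b) $\lfloor 2.3 \rfloor - \lfloor 1.6 \rfloor = 2 - 1 = 1$
   c) $\lceil 3.4 \rceil \lfloor 6.2 \rfloor = 4 \cdot 6 = 24$   d) $\lfloor 3.4 \rfloor \lceil 6.2 \rceil = 3 \cdot 7 = 21$
   e) $\lfloor 2\pi \rfloor = 6$   f) $2\lceil \pi \rceil = 8$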
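These floor and ceiling evaluations are easy to confirm numerically. The sketch below uses only `math.floor` and `math.ceil` from the Python standard library; the dictionary name `results` is just an illustrative choice.

```python
import math

# Each key names a part of the exercise; each value is the computed integer.
results = {
    "a": math.floor(2.3 - 1.6),
    "b": math.floor(2.3) - math.floor(1.6),
    "c": math.ceil(3.4) * math.floor(6.2),
    "d": math.floor(3.4) * math.ceil(6.2),
    "e": math.floor(2 * math.pi),
    "f": 2 * math.ceil(math.pi),
}
print(results)
# {'a': 0, 'b': 1, 'c': 24, 'd': 21, 'e': 6, 'f': 8}
```

Parts (a) and (b), and likewise (e) and (f), show that the floor and ceiling functions do not distribute over subtraction or scalar multiplication.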
9. a) $\cdots \cup [-1, -6/7) \cup [0, 1/7) \cup [1, 8/7) \cup [2, 15/7) \cup \cdots$
   b) $[1, 8/7)$   c) $\mathbf{Z}$   d) $\mathbf{R}$
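11. a) $\cdots \cup (-7/3, -2] \cup (-4/3, -1] \cup (-1/3, 0] \cup (2/3, 1] \cup (5/3, 2] \cup \cdots = \bigcup_{m \in \mathbf{Z}} (m - 1/3, m]$
    b) $\cdots \cup ((-2n - 1)/n, -2] \cup ((-n - 1)/n, -1] \cup (-1/n, 0] \cup ((n - 1)/n, 1] \cup ((2n - 1)/n, 2] \cup \cdots = \bigcup_{m \in \mathbf{Z}} (m - 1/n, m]$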
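13. a) Proof (i): If $a \in \mathbf{Z}^+$, then $\lceil a \rceil = a$ and $\lfloor \lceil a \rceil / a \rfloor = \lfloor 1 \rfloor = 1$. If $a \notin \mathbf{Z}^+$, write $a = n + c$, where $n \in \mathbf{Z}^+$ and $0 < c < 1$. Then $\lceil a \rceil / a = (n + 1)/(n + c) = 1 + (1 - c)/(n + c)$, where $0 < (1 - c)/(n + c) < 1$. Hence $\lfloor \lceil a \rceil / a \rfloor = \lfloor 1 + (1 - c)/(n + c) \rfloor = 1$.
    b) Consider $a = 0.1$. Then
        (i) $\lfloor \lceil a \rceil / a \rfloor = \lfloor 1/0.1 \rfloor = \lfloor 10 \rfloor = 10 \ne 1$; and
       (ii) $\lceil \lfloor a \rfloor / a \rceil = \lceil 0/0.1 \rceil = 0 \ne 1$.
    In fact (ii) is false for all $0 < a < 1$, since $\lceil \lfloor a \rfloor / a \rceil = 0$ for all such values of $a$. In the case of (i), when $0 < a \le 0.5$, it follows that $\lceil a \rceil / a \ge 2$ and $\lfloor \lceil a \rceil / a \rfloor \ge 2 \ne 1$. However, for $0.5 < a < 1$, $\lceil a \rceil / a = 1/a$ where $1 < 1/a < 2$, and so $\lfloor \lceil a \rceil / a \rfloor = 1$ for $0.5 < a < 1$.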
                     15. a) One-to-one; the range is the set of all odd integers.
                         b) One-to-one; the range is Q.
    c) Not one-to-one; the range is $\{0, \pm 6, \pm 24, \pm 60, \ldots\} = \{n^3 - n \mid n \in \mathbf{Z}\}$.
    d) One-to-one; the range is $(0, +\infty)$.
                         e) One-to-one; the range is [—1, 1].
                         f) Not one-to-one; the range is [0, 1].
17. $4^2$
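19. a) $f(A_1 \cup A_2) = \{y \in B \mid y = f(x), x \in A_1 \cup A_2\} = \{y \in B \mid y = f(x), x \in A_1 \text{ or } x \in A_2\} = \{y \in B \mid y = f(x), x \in A_1\} \cup \{y \in B \mid y = f(x), x \in A_2\} = f(A_1) \cup f(A_2)$
    c) From part (b), $f(A_1 \cap A_2) \subseteq f(A_1) \cap f(A_2)$. Conversely, $y \in f(A_1) \cap f(A_2) \Rightarrow y = f(x_1) = f(x_2)$, for $x_1 \in A_1$, $x_2 \in A_2 \Rightarrow y = f(x_1)$ and $x_1 = x_2$ (because $f$ is injective) $\Rightarrow y \in f(A_1 \cap A_2)$. So $f$ injective $\Rightarrow f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$.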
21. No. Let $A = \{1, 2\}$, $X = \{1\}$, $Y = \{2\}$, $B = \{3\}$. For $f = \{(1, 3), (2, 3)\}$ we have $f|_X$, $f|_Y$ one-to-one, but $f$ is not one-to-one.
23. a) $f(a_{ij}) = 12(i - 1) + j$   b) $f(a_{ij}) = 10(i - 1) + j$   c) $f(a_{ij}) = 7(i - 1) + j$
25. a) (i) $f(a_{ij}) = n(i - 1) + (k - 1) + j$   (ii) $g(a_{ij}) = m(j - 1) + (k - 1) + i$
    b) $k + (mn - 1) \le r$
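The access functions in Exercises 23 and 25 map a two-dimensional subscript to a position in a one-dimensional block of storage. Below is a minimal sketch of the row-major and column-major versions, assuming 1-based subscripts and a starting position `k`, as in the formulas above; the function names are hypothetical.

```python
def row_major(i, j, n_cols, k=1):
    """Position of entry a_ij when rows are stored one after another."""
    return n_cols * (i - 1) + (k - 1) + j


def column_major(i, j, n_rows, k=1):
    """Position of entry a_ij when columns are stored one after another."""
    return n_rows * (j - 1) + (k - 1) + i


# A 10 x 12 array stored by rows, starting at position 1 (Exercise 23(a) form).
positions = [row_major(i, j, 12) for i in range(1, 11) for j in range(1, 13)]
print(positions[0], positions[-1])   # 1 120
```

Each function is one-to-one on the array's subscripts, and the last entry lands at position k + (mn − 1), which matches the bound in part 25(b).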
27. a) $A(1, 3) = A(0, A(1, 2)) = A(1, 2) + 1 = A(0, A(1, 1)) + 1 = [A(1, 1) + 1] + 1 = A(1, 1) + 2 = A(0, A(1, 0)) + 2 = [A(1, 0) + 1] + 2 = A(1, 0) + 3 = A(0, 1) + 3 = (1 + 1) + 3 = 5$

$A(2, 3) = A(1, A(2, 2))$
$A(2, 2) = A(1, A(2, 1))$
$A(2, 1) = A(1, A(2, 0)) = A(1, A(1, 1))$
$A(1, 1) = A(0, A(1, 0)) = A(1, 0) + 1 = A(0, 1) + 1 = (1 + 1) + 1 = 3$
$A(2, 1) = A(1, 3) = A(0, A(1, 2)) = A(1, 2) + 1 = A(0, A(1, 1)) + 1 = [A(1, 1) + 1] + 1 = 5$

$A(2, 2) = A(1, 5) = A(0, A(1, 4)) = A(1, 4) + 1 = A(0, A(1, 3)) + 1 = A(1, 3) + 2 = A(0, A(1, 2)) + 2 = A(1, 2) + 3 = A(0, A(1, 1)) + 3 = A(1, 1) + 4 = 7$
$A(2, 3) = A(1, 7) = A(0, A(1, 6)) = A(1, 6) + 1 = A(0, A(1, 5)) + 1 = A(0, 7) + 1 = (7 + 1) + 1 = 9$

b) Since $A(1, 0) = A(0, 1) = 2 = 0 + 2$, the result holds for the case where $n = 0$. Assuming the truth of the (open) statement for some $k$ ($\ge 0$), we have $A(1, k) = k + 2$. Then we find that $A(1, k + 1) = A(0, A(1, k)) = A(1, k) + 1 = (k + 2) + 1 = (k + 1) + 2$, so the truth at $n = k$ implies the truth at $n = k + 1$. Consequently, $A(1, n) = n + 2$ for all $n \in \mathbf{N}$ by the Principle of Mathematical Induction.
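The hand evaluations in Exercise 27 can be cross-checked with a direct recursive implementation. This is a sketch only, assuming the three-case definition that the computations above rely on: A(0, n) = n + 1, A(m, 0) = A(m − 1, 1) for m > 0, and A(m, n) = A(m − 1, A(m, n − 1)) for m, n > 0. The memoization and the raised recursion limit are implementation conveniences, not part of the definition.

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(10000)   # the recursion nests deeply even for small inputs


@lru_cache(maxsize=None)
def ackermann(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


print(ackermann(1, 3), ackermann(2, 3))   # 5 9
```

Checking `ackermann(1, n) == n + 2` for a range of n gives numerical support for part (b), although only the induction argument proves it for all n.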

Section 5.3—p. 265
1. a) A = {1, 2, 3, 4}, B = {v, w, x, y, z}, f = {(1, v), (2, v), (3, w), (4, x)}
   b) A, B as in (a), f = {(1, v), (2, x), (3, z), (4, y)}
   c) A = {1, 2, 3, 4, 5}, B = {w, x, y, z}, f = {(1, w), (2, w), (3, x), (4, y), (5, z)}
   d) A = {1, 2, 3, 4}, B = {w, x, y, z}, f = {(1, w), (2, x), (3, y), (4, z)}
                          3. a), b), c), and f) are one-to-one and onto.
   d) Neither one-to-one nor onto; range = $[0, +\infty)$
   e) Neither one-to-one nor onto; range = $[-4, +\infty)$
5. (For the case n = 5, m = 3):

$$\sum_{k=0}^{5} (-1)^k \binom{5}{5-k} (5-k)^3 = (-1)^0 \binom{5}{5} 5^3 + (-1)^1 \binom{5}{4} 4^3 + (-1)^2 \binom{5}{3} 3^3 + (-1)^3 \binom{5}{2} 2^3 + (-1)^4 \binom{5}{1} 1^3 + (-1)^5 \binom{5}{0} 0^3$$

$$= 125 - 5(64) + 10(27) - 10(8) + 5 = 0$$
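The alternating sum in Exercise 5 is the inclusion-exclusion count of onto functions from a set of size m to a set of size n, so it must vanish when m < n. A short sketch follows; the function name `onto_count` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def onto_count(m, n):
    """Number of onto functions from an m-element set to an n-element set."""
    return sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))


print(onto_count(3, 5))   # 0, since no function from 3 elements onto 5 exists
print(onto_count(7, 4))   # 8400, which equals 4! * S(7, 4)
```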

7. a) (i) $2!\,S(7, 2)$   (ii) $\binom{5}{2}[2!\,S(7, 2)]$   (iii) $3!\,S(7, 3)$
      (iv) $\binom{5}{3}[3!\,S(7, 3)]$   (v) $4!\,S(7, 4)$   (vi) $\binom{5}{4}[4!\,S(7, 4)]$
   b) $\binom{n}{k}[k!\,S(m, k)]$
9. For each $r \in \mathbf{R}$ there is at least one $a \in \mathbf{R}$ such that $a^3 - 2a^2 + a - r = 0$, because the polynomial $x^3 - 2x^2 + x - r$ has odd degree and real coefficients. Consequently, $f$ is onto. However, $f(0) = 0 = f(1)$, so $f$ is not one-to-one.

11.  m \ n     1     2      3       4       5       6      7     8    9   10
       9       1   255   3025    7770    6951    2646    462    36    1
      10       1   511   9330   34105   42525   22827   5880   750   45    1
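The two rows of the table can be regenerated from the standard recurrence for Stirling numbers of the second kind, S(m + 1, n) = S(m, n − 1) + n·S(m, n), with S(m, 1) = S(m, m) = 1. The sketch below is one way to do that; the function name `stirling2_row` is hypothetical.

```python
def stirling2_row(m):
    """Return [S(m, 1), S(m, 2), ..., S(m, m)] using the recurrence."""
    row = [1]                                    # S(1, 1)
    for size in range(2, m + 1):
        new_row = [1]                            # S(size, 1) = 1
        for n in range(2, size):
            # S(size, n) = S(size - 1, n - 1) + n * S(size - 1, n)
            new_row.append(row[n - 2] + n * row[n - 1])
        new_row.append(1)                        # S(size, size) = 1
        row = new_row
    return row


print(stirling2_row(9))
print(stirling2_row(10))
```

Building each row from the previous one keeps the work quadratic in m and avoids recomputing shared subproblems.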

13. a) Since $156{,}009 = 3 \times 7 \times 17 \times 19 \times 23$, it follows that there are $S(5, 2) = 15$ two-factor unordered factorizations of 156,009, where each factor is greater than 1.
    b) $\sum_{i=2}^{5} S(5, i) = 15 + 25 + 10 + 1 = 51$   c) $\sum_{i=2}^{n} S(n, i)$

15. a) $n = 4$: $\sum_{i=1}^{4} i!\,S(4, i)$; $n = 5$: $\sum_{i=1}^{5} i!\,S(5, i)$
    In general, the answer is $\sum_{i=1}^{n} i!\,S(n, i)$.
                               b) (3)      S012, i812, 2).
17. Let $a_1, a_2, \ldots, a_m, x$ denote the $m + 1$ distinct objects. Then $S_r(m + 1, n)$ counts the number of ways these objects can be distributed among $n$ identical containers so that each container receives at least $r$ of the objects.
    Each of these distributions falls into exactly one of two categories:
    (1) The element $x$ is in a container with $r$ or more other objects: Here we start with $S_r(m, n)$ distributions of $a_1, a_2, \ldots, a_m$ into $n$ identical containers, each container receiving at least $r$ of the objects. Now we have $n$ distinct containers, distinguished by their contents. Consequently, there are $n$ choices for locating the object $x$. As a result, this category provides $n S_r(m, n)$ of the distributions.
    (2) The element $x$ is in a container with $r - 1$ of the other objects: These other $r - 1$ objects can be chosen in $\binom{m}{r-1}$ ways, and then these objects, along with $x$, can be placed in one of the $n$ containers. The remaining $m + 1 - r$ distinct objects can then be distributed among the $n - 1$ identical containers, where each container receives at least $r$ of the objects, in $S_r(m + 1 - r, n - 1)$ ways. Hence this category provides the remaining $\binom{m}{r-1} S_r(m + 1 - r, n - 1)$ distributions.
19. a) We know that $s(m, n)$ counts the number of ways we can place $m$ people, call them $p_1, p_2, \ldots, p_m$, around $n$ circular tables, with at least one occupant at each table. These arrangements fall into two disjoint sets: (1) The arrangements where $p_1$ is alone: There are $s(m - 1, n - 1)$ such arrangements; and (2) The arrangements where $p_1$ shares a table with at least one of the other $m - 1$ people: There are $s(m - 1, n)$ ways where $p_2, p_3, \ldots, p_m$ can be seated around the $n$ tables so that every table is occupied. Each such arrangement determines a total of $m - 1$ locations (at all the $n$ tables) where $p_1$ can now be seated, for a total of $(m - 1)s(m - 1, n)$ arrangements. Consequently, $s(m, n) = (m - 1)s(m - 1, n) + s(m - 1, n - 1)$, for $m > n > 1$.

Section 5.4-p. 272
1. Here we find, for example, that $f(f(a, b), c) = f(a, c) = c$, while $f(a, f(b, c)) = f(a, b) = a$, so $f$ is not associative.
3. a), b), and d) are commutative and associative; c) is neither commutative nor associative.

.a) 25.      by 5%        cc) 5%    dy 59°
                      “J

. a) Yes       b) Yes        c) No     9. a) 1216       b) p?!qg*””
11. By the Well-Ordering Principle, $A$ has a least element and this same element is the identity for $g$. If $A$ is finite, then $A$ will have a largest element, and this same element will be the identity for $f$. If $A$ is infinite, then $f$ cannot have an identity.
                     13.       a)   5   b)   A3   Ag    As      c)   A},   Az
                                             25   25      6
                                             25     2    4
                                             60   40    20
                                             25   40    10
Section 5.5—p. 277
1. The pigeons are the socks; the pigeonholes are the colors.         3. $26^2 + 1 = 677$
5. a) For each $x \in \{1, 2, 3, \ldots, 300\}$ write $x = 2^n \cdot m$, where $n \ge 0$ and $\gcd(2, m) = 1$. There are 150 possibilities for $m$: 1, 3, 5, ..., 299. When we select 151 numbers from $\{1, 2, 3, \ldots, 300\}$, there must be two numbers of the form $x = 2^s \cdot m$, $y = 2^t \cdot m$. If $x < y$, then $x \mid y$; otherwise $y < x$ and $y \mid x$.
   b) If $n + 1$ integers are selected from the set $\{1, 2, 3, \ldots, 2n\}$, then there must be two integers $x, y$ in the selection where $x \mid y$ or $y \mid x$.
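The pigeonhole argument in Exercise 5 is constructive: group the selected numbers by their odd part m, and any two numbers sharing an odd part form a divisible pair. A minimal sketch, with hypothetical function names `odd_part` and `find_divisible_pair`:

```python
def odd_part(x):
    """Return m where x = 2**n * m and m is odd."""
    while x % 2 == 0:
        x //= 2
    return x


def find_divisible_pair(numbers):
    """Return (x, y) with x < y and x dividing y, found via shared odd parts."""
    seen = {}                      # odd part -> smallest number seen with it
    for x in sorted(numbers):
        m = odd_part(x)
        if m in seen:
            return seen[m], x      # both are powers of 2 times m
        seen[m] = x
    return None


print(find_divisible_pair(range(150, 301)))   # (150, 300)
print(find_divisible_pair(range(151, 301)))   # None
```

The second call shows the bound is sharp: the 150 numbers 151, 152, ..., 300 contain no pair of this kind, since each odd part occurs exactly once among them.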
7. a) Here the pigeons are the integers 1, 2, 3, ..., 25 and the pigeonholes are the 13 sets {1, 25}, {2, 24}, ..., {11, 15}, {12, 14}, {13}. In selecting 14 integers, we get the elements in at least one two-element subset, and these sum to 26.

b) If $S = \{1, 2, 3, \ldots, 2n + 1\}$, for $n$ a positive integer, then any subset of size $n + 2$ from $S$ must contain two elements that sum to $2n + 2$.
9. a) For each $t \in \{1, 2, 3, \ldots, 100\}$, we find that $1 \le \sqrt{t} \le 10$. When we select 11 elements from $\{1, 2, 3, \ldots, 100\}$ there must be two, say $x$ and $y$, where $\lfloor \sqrt{x} \rfloor = \lfloor \sqrt{y} \rfloor$ so that $0 < |\sqrt{x} - \sqrt{y}| < 1$.
   b) Let $n \in \mathbf{Z}^+$. If $n + 1$ elements are selected from $\{1, 2, 3, \ldots, n^2\}$, then there exist two, say $x$ and $y$, where $0 < |\sqrt{x} - \sqrt{y}| < 1$.
11. Divide the interior of the square into four smaller congruent squares as shown in the figure. Each smaller square has diagonal length $1/\sqrt{2}$. Let region $R_1$ be the interior of square AEKH together with the points on segment EK, excluding point E. Region $R_2$ is the interior of square EBFK together with the points on segment FK, excluding points F and K. Regions $R_3$ and $R_4$ are defined in a similar way. Then if five points are chosen in the interior of square ABCD, at least two are in $R_i$ for some $1 \le i \le 4$, and these points are within $1/\sqrt{2}$ (units) of each other.

[Figure: square ABCD divided into four congruent smaller squares, among them AEKH and EBFK.]

13. Consider the subsets $A$ of $S$ where $1 \le |A| \le 3$. Since $|S| = 5$, there are $\binom{5}{1} + \binom{5}{2} + \binom{5}{3} = 25$ such subsets $A$. Let $s_A$ denote the sum of the elements in $A$. Then $1 \le s_A \le 7 + 8 + 9 = 24$. So by the pigeonhole principle, there are two subsets of $S$ whose elements yield the same sum.
15. For $(\emptyset \ne)\, T \subseteq S$, we have $1 \le s_T \le m + (m - 1) + \cdots + (m - 6) = 7m - 21$. The set $S$ has $2^7 - 1 = 128 - 1 = 127$ nonempty subsets. So by the pigeonhole principle we need to have $127 > 7m - 21$ or $148 > 7m$. Hence $7 \le m \le 21$.
                         17. a) 2,4,1,3       b) 3,6,9,2,5,8,1,4,7
    c) For $n \ge 2$, there exists a sequence of $n^2$ distinct real numbers with no decreasing or increasing subsequence of length $n + 1$. For example, consider $n, 2n, 3n, \ldots, (n - 1)n, n^2, (n - 1), (2n - 1), \ldots, (n^2 - 1), (n - 2), (2n - 2), \ldots, (n^2 - 2), \ldots, 1, (n + 1), (2n + 1), \ldots, (n - 1)n + 1$.
    d) The result in Example 5.49 (for $n \ge 2$) is best possible, in the sense that we cannot reduce the length of the sequence from $n^2 + 1$ to $n^2$ and still obtain the desired subsequence of length $n + 1$.
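The construction in part (c) can be tested for small n. The sketch below builds the sequence block by block and measures its longest increasing and decreasing subsequences with a simple quadratic dynamic program; both function names are hypothetical.

```python
def extremal_sequence(n):
    """Blocks n, 2n, ..., n*n; then n-1, 2n-1, ..., n*n-1; and so on down to 1, n+1, ..."""
    return [j * n - r for r in range(n) for j in range(1, n + 1)]


def longest_monotone(seq, increasing=True):
    """Length of the longest strictly increasing (or decreasing) subsequence."""
    best = [1] * len(seq)
    for i in range(len(seq)):
        for j in range(i):
            ok = seq[j] < seq[i] if increasing else seq[j] > seq[i]
            if ok:
                best[i] = max(best[i], best[j] + 1)
    return max(best)


seq = extremal_sequence(3)
print(seq)                                                   # [3, 6, 9, 2, 5, 8, 1, 4, 7]
print(longest_monotone(seq), longest_monotone(seq, False))   # 3 3
```

For n = 2 and n = 3 the construction reproduces the sequences given in parts (a) and (b).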
19. Proof: If not, each pigeonhole contains at most $k$ pigeons, for a total of at most $kn$ pigeons. But we have $kn + 1$ pigeons. So we have a contradiction and the result then follows.
21. a) 1001   b) 2001
    c) Let $n, k \in \mathbf{Z}^+$. The smallest value for $|S|$ (where $S \subseteq \mathbf{Z}^+$) so that there exist $n$ elements $x_1, x_2, \ldots, x_n \in S$ where all $n$ of these integers have the same remainder upon division by $k$ is $k(n - 1) + 1$.
23. Proof: If not, then the number of pigeons roosting in the first pigeonhole is $x_1 \le p_1 - 1$, the number of pigeons roosting in the second pigeonhole is $x_2 \le p_2 - 1$, ..., and the number roosting in the $n$th pigeonhole is $x_n \le p_n - 1$. Hence the total number of pigeons is $x_1 + x_2 + \cdots + x_n \le (p_1 - 1) + (p_2 - 1) + \cdots + (p_n - 1) = p_1 + p_2 + \cdots + p_n - n < p_1 + p_2 + \cdots + p_n - n + 1$, the number of pigeons we started with. The result now follows because of this contradiction.

Section 5.6—p. 288
1. a) $7! - 6! = 4320$   b) $n! - (n - 1)! = (n - 1)(n - 1)!$
3. $a = 3$, $b = -1$; $a = -3$, $b = 2$

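5. $g^2(A) = g(T \cap (S \cup A)) = T \cap (S \cup [T \cap (S \cup A)])$
   $= T \cap [(S \cup T) \cap (S \cup (S \cup A))] = T \cap [(S \cup T) \cap (S \cup A)]$
   $= [T \cap (S \cup T)] \cap (S \cup A) = T \cap (S \cup A) = g(A)$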
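7. a) $(f \circ g)(x) = 3x - 1$; $(g \circ f)(x) = 3(x - 1)$;
      $(g \circ h)(x) = \begin{cases} 0, & x \text{ even} \\ 3, & x \text{ odd} \end{cases}$   $(h \circ g)(x) = \begin{cases} 0, & x \text{ even} \\ 1, & x \text{ odd} \end{cases}$
      $(f \circ (g \circ h))(x) = f((g \circ h)(x)) = \begin{cases} -1, & x \text{ even} \\ 2, & x \text{ odd} \end{cases}$
      $((f \circ g) \circ h)(x) = \begin{cases} (f \circ g)(0), & x \text{ even} \\ (f \circ g)(1), & x \text{ odd} \end{cases} = \begin{cases} -1, & x \text{ even} \\ 2, & x \text{ odd} \end{cases}$
   b) $f^2(x) = f(f(x)) = x - 2$; $f^3(x) = x - 3$; $g^2(x) = 9x$; $g^3(x) = 27x$; $h^2 = h^3 = h^{500} = h$.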
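9. a) $f^{-1}(x) = (1/2)(\ln x - 5)$
   b) For $x \in \mathbf{R}^+$,
      $(f \circ f^{-1})(x) = f((1/2)(\ln x - 5)) = e^{2((1/2)(\ln x - 5)) + 5} = e^{\ln x - 5 + 5} = e^{\ln x} = x$.
      For $x \in \mathbf{R}$,
      $(f^{-1} \circ f)(x) = f^{-1}(e^{2x + 5}) = (1/2)[\ln(e^{2x + 5}) - 5] = (1/2)[2x + 5 - 5] = x$.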

[Figure: graph of $y = f(x)$ in the $xy$-plane, with the point $(0, e^5)$ marked.]

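11. $f$, $g$ invertible $\Rightarrow$ each of $f$, $g$ is both one-to-one and onto $\Rightarrow g \circ f$ is one-to-one and onto $\Rightarrow g \circ f$ invertible. Since $(g \circ f) \circ (f^{-1} \circ g^{-1}) = 1_C$ and $(f^{-1} \circ g^{-1}) \circ (g \circ f) = 1_A$, it follows that $f^{-1} \circ g^{-1}$ is an inverse of $g \circ f$. By uniqueness of inverses, we have $f^{-1} \circ g^{-1} = (g \circ f)^{-1}$.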
13. a) $f^{-1}(-10) = \{-17\}$   $f^{-1}(0) = \{-7, 5/2\}$
       $f^{-1}(4) = \{-3, 1/2, 5\}$   $f^{-1}(6) = \{-1, 7\}$
       $f^{-1}(7) = \{0, 8\}$   $f^{-1}(8) = \{9\}$
    b) (i) $[-12, -8]$   (ii) $[-12, -7] \cup [5/2, 3)$
       (iii) $[-9, -3] \cup [1/2, 5]$   (iv) $(-2, 0] \cup (6, 11)$
       (v) $[12, 18)$

15. $3^2 \cdot 4^3 = 576$ functions
17. a) The range of $f$ = {2, 3, 4, ...} = $\mathbf{Z}^+ - \{1\}$.
    b) Since 1 is not in the range of $f$, the function is not onto.
    c) For all $x, y \in \mathbf{Z}^+$, $f(x) = f(y) \Rightarrow x + 1 = y + 1 \Rightarrow x = y$, so $f$ is one-to-one.
    d) The range of $g$ is $\mathbf{Z}^+$.   e) Since $g(\mathbf{Z}^+) = \mathbf{Z}^+$, the codomain of $g$, this function is onto.
    f) Here $g(1) = 1 = g(2)$, and $1 \ne 2$, so $g$ is not one-to-one.
    g) For all $x \in \mathbf{Z}^+$, $(g \circ f)(x) = g(f(x)) = g(x + 1) = \max\{1, (x + 1) - 1\} = \max\{1, x\} = x$, since $x \in \mathbf{Z}^+$. Hence $g \circ f = 1_{\mathbf{Z}^+}$.
    h) $(f \circ g)(2) = f(\max\{1, 1\}) = f(1) = 1 + 1 = 2$
       $(f \circ g)(3) = f(\max\{1, 2\}) = f(2) = 2 + 1 = 3$
       $(f \circ g)(4) = f(\max\{1, 3\}) = f(3) = 3 + 1 = 4$
       $(f \circ g)(7) = f(\max\{1, 6\}) = f(6) = 6 + 1 = 7$
       $(f \circ g)(12) = f(\max\{1, 11\}) = f(11) = 11 + 1 = 12$
       $(f \circ g)(25) = f(\max\{1, 24\}) = f(24) = 24 + 1 = 25$
    i) No, because the functions $f$, $g$ are not inverses of each other. The calculations in part (h) may suggest that $f \circ g = 1_{\mathbf{Z}^+}$, since $(f \circ g)(x) = x$ for $x \ge 2$. But we also find that $(f \circ g)(1) = f(\max\{1, 0\}) = f(1) = 2$, so $(f \circ g)(1) \ne 1$, and, consequently, $f \circ g \ne 1_{\mathbf{Z}^+}$.
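Exercise 17 shows a function with a left inverse that is not a right inverse. The sketch below checks both compositions on an initial segment of the positive integers, taking f(x) = x + 1 and g(x) = max{1, x − 1} as they appear in parts (g) and (h); the bound 1000 is an arbitrary choice for illustration.

```python
def f(x):
    return x + 1


def g(x):
    return max(1, x - 1)


domain = range(1, 1001)   # an initial segment of the positive integers

left_identity = all(g(f(x)) == x for x in domain)
right_failures = [x for x in domain if f(g(x)) != x]

print(left_identity)      # True: g o f acts as the identity
print(right_failures)     # [1]: f o g fails only at x = 1
```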
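19. a) $a \in f^{-1}(B_1 \cap B_2) \iff f(a) \in B_1 \cap B_2 \iff f(a) \in B_1$ and $f(a) \in B_2 \iff a \in f^{-1}(B_1)$ and $a \in f^{-1}(B_2) \iff a \in f^{-1}(B_1) \cap f^{-1}(B_2)$
    c) $a \in f^{-1}(\overline{B_1}) \iff f(a) \in \overline{B_1} \iff f(a) \notin B_1 \iff a \notin f^{-1}(B_1) \iff a \in \overline{f^{-1}(B_1)}$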
21. a) Suppose that $x_1, x_2 \in \mathbf{Z}$ and $f(x_1) = f(x_2)$. Then either $f(x_1)$, $f(x_2)$ are both even or they are both odd. If they are both even, then $f(x_1) = f(x_2) \Rightarrow -2x_1 = -2x_2 \Rightarrow x_1 = x_2$. Otherwise, $f(x_1)$, $f(x_2)$ are both odd and $f(x_1) = f(x_2) \Rightarrow 2x_1 - 1 = 2x_2 - 1 \Rightarrow 2x_1 = 2x_2 \Rightarrow x_1 = x_2$. Consequently, the function $f$ is one-to-one.
       To prove that $f$ is an onto function, let $n \in \mathbf{N}$. If $n$ is even, then $(-n/2) \in \mathbf{Z}$ and $(-n/2) \le 0$, and $f(-n/2) = -2(-n/2) = n$. For the case where $n$ is odd we find that $(n + 1)/2 \in \mathbf{Z}$ and $(n + 1)/2 > 0$, and $f((n + 1)/2) = 2[(n + 1)/2] - 1 = (n + 1) - 1 = n$. Hence $f$ is onto.
    b) $f^{-1}: \mathbf{N} \to \mathbf{Z}$, where
       $f^{-1}(x) = \begin{cases} (1/2)(x + 1), & x = 1, 3, 5, \ldots \\ -x/2, & x = 0, 2, 4, \ldots \end{cases}$
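23. a) For all $n \in \mathbf{N}$, $(g \circ f)(n) = (h \circ f)(n) = (k \circ f)(n) = n$.
    b) The results in part (a) do not contradict Theorem 5.7. For although $g \circ f = h \circ f = k \circ f = 1_{\mathbf{N}}$, we note that
         (i) $(f \circ g)(1) = f(\lfloor 1/3 \rfloor) = f(0) = 3 \cdot 0 = 0 \ne 1$, so $f \circ g \ne 1_{\mathbf{N}}$;
        (ii) $(f \circ h)(1) = f(\lfloor 2/3 \rfloor) = f(0) = 3 \cdot 0 = 0 \ne 1$, so $f \circ h \ne 1_{\mathbf{N}}$; and
       (iii) $(f \circ k)(1) = f(\lfloor 3/3 \rfloor) = f(1) = 3 \cdot 1 = 3 \ne 1$, so $f \circ k \ne 1_{\mathbf{N}}$.
    Consequently, none of $g$, $h$, and $k$ is the inverse of $f$. (After all, since $f$ is not onto, it is not invertible.)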

Section 5.7—p. 293
                               -a) feO(m)          b) feO)           oo) fed)           dd feo’)
                                 e) feO(n’)         £f) fed’)           g) fe Om)
3. a) For all $n \in \mathbf{Z}^+$, $0 \le \log_2 n < n$. So let $k = 1$ and $m = 200$ in Definition 5.23. Then $|f(n)| = 100 \log_2 n = 200\left(\tfrac{1}{2} \log_2 n\right) < 200\left(\tfrac{1}{2} n\right) = 200|g(n)|$, so $f \in O(g)$.
   b) For $n = 6$, $2^n = 64 < 3096 = 4096 - 1000 = 2^{12} - 1000 = 2^{2n} - 1000$. Assuming that $2^k < 2^{2k} - 1000$ for $n = k \ge 6$, we find that $2 < 2^2 \Rightarrow 2(2^k) < 2^2(2^{2k} - 1000) < 2^2 2^{2k} - 1000$, or $2^{k+1} < 2^{2(k+1)} - 1000$, so $f(n) < g(n)$ for all $n \ge 6$. Therefore, with $k = 6$ and $m = 1$ in Definition 5.23, we find that for $n \ge k$, $|f(n)| \le m|g(n)|$ and $f \in O(g)$.
5. To show that $f \in O(g)$, let $k = 1$ and $m = 4$ in Definition 5.23. Then for all $n \ge k$, $|f(n)| = n^2 + n \le n^2 + n^2 = 2n^2 \le 2n^3 = 4((1/2)n^3) = 4|g(n)|$, and $f$ is dominated by $g$. To show that $g \notin O(f)$, we follow the idea given in Example 5.66, namely, that

$$\forall m \in \mathbf{R}^+ \ \forall k \in \mathbf{Z}^+ \ \exists n \in \mathbf{Z}^+ \ [(n \ge k) \wedge (|g(n)| > m|f(n)|)].$$

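So no matter what the values of $m$ and $k$ are, choose $n > \max\{4m, k\}$. Then $|g(n)| = (1/2)n^3 > (1/2)(4m)n^2 = m(2n^2) \ge m(n^2 + n) = m|f(n)|$, so $g \notin O(f)$. Alternatively, if $g \in O(f)$, then there exist $m \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ such that $|(1/2)n^3| \le m|n^2 + n|$, or $(1/2)n^2 \le m(n + 1)$, for all $n \ge k$. Then $n^2/[2(n + 1)] \le m$, and since $n + 1 \le 2n$ we get $n/4 \le n^2/[2(n + 1)] \le m$, a contradiction since $n$ is variable and $m$ is constant.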
7. For all $n \ge 1$, $\log_2 n \le n$, so with $k = 1$ and $m = 1$ in Definition 5.23, we have $|g(n)| = \log_2 n \le n = m \cdot n = m|f(n)|$. Hence $g \in O(f)$. To show that $f \notin O(g)$, we first observe that $\lim_{n \to \infty} \frac{n}{\log_2 n} = +\infty$. (This can be established by using L'Hospital's Rule from the calculus.) Since $\lim_{n \to \infty} \frac{n}{\log_2 n} = +\infty$, we find that for every $m \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$, there is an $n \in \mathbf{Z}^+$ with $n \ge k$ such that $\frac{n}{\log_2 n} > m$, or $|f(n)| = n > m \log_2 n = m|g(n)|$. Hence $f \notin O(g)$.
9. Since $f \in O(g)$, there exist $m \in \mathbf{R}^+$, $k \in \mathbf{Z}^+$ such that $|f(n)| \le m|g(n)|$ for all $n \ge k$. But then $|f(n)| \le [m/|c|]\,|cg(n)|$ for all $n \ge k$, so $f \in O(cg)$.
11. a) For all $n \ge 1$, $f(n) = 5n^2 + 3n > n^2 = g(n)$. So with $M = 1$ and $k = 1$, we have $|f(n)| \ge M|g(n)|$ for all $n \ge k$ and it follows that $f \in \Omega(g)$.
    c) For all $n \ge 1$, $f(n) = 5n^2 + 3n > n = h(n)$. With $M = 1$ and $k = 1$, we have $|f(n)| \ge M|h(n)|$ for all $n \ge k$ and so $f \in \Omega(h)$.
    d) Suppose that $h \in \Omega(f)$. If so, there exist $M \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ with $n = |h(n)| \ge M|f(n)| = M(5n^2 + 3n)$ for all $n \ge k$. Then $0 < M \le n/(5n^2 + 3n) = 1/(5n + 3)$. But how can $M$ be a positive constant while $1/(5n + 3)$ approaches 0 as $n$ (a variable) gets larger? From this contradiction it follows that $h \notin \Omega(f)$.
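13. a) For $n \ge 1$, $f(n) = \sum_{i=1}^{n} i = n(n + 1)/2 = (n^2/2) + (n/2) > (n^2/2)$. With $k = 1$ and $M = 1/2$, we have $|f(n)| \ge M|n^2|$ for all $n \ge k$. Hence $f \in \Omega(n^2)$.
    b) $\sum_{i=1}^{n} i^2 = 1^2 + 2^2 + \cdots + n^2 \ge \lceil n/2 \rceil^2 + \cdots + n^2 \ge \lceil n/2 \rceil^2 + \cdots + \lceil n/2 \rceil^2 = \lceil (n + 1)/2 \rceil \lceil n/2 \rceil^2 > n^3/8$. With $k = 1$ and $M = 1/8$, we have $|g(n)| \ge M|n^3|$ for all $n \ge k$. Hence $g \in \Omega(n^3)$.
       Alternatively, for $n \ge 1$, $g(n) = \sum_{i=1}^{n} i^2 = n(n + 1)(2n + 1)/6 = (2n^3 + 3n^2 + n)/6 > n^3/6$. With $k = 1$ and $M = 1/6$, we find that $|g(n)| \ge M|n^3|$ for all $n \ge k$, so $g \in \Omega(n^3)$.
    c) $\sum_{i=1}^{n} i^t = 1^t + 2^t + \cdots + n^t \ge \lceil n/2 \rceil^t + \cdots + n^t \ge \lceil n/2 \rceil^t + \cdots + \lceil n/2 \rceil^t = \lceil (n + 1)/2 \rceil \lceil n/2 \rceil^t > (n/2)^{t+1}$. With $k = 1$ and $M = (1/2)^{t+1}$, we have $|h(n)| \ge M|n^{t+1}|$ for all $n \ge k$. Hence $h \in \Omega(n^{t+1})$.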
15. Proof: $f \in \Theta(g) \Rightarrow f \in \Omega(g)$ and $f \in O(g)$ (from Exercise 14 of this section) $\Rightarrow g \in O(f)$ and $g \in \Omega(f)$ (from Exercise 12 of this section) $\Rightarrow g \in \Theta(f)$.

Section 5.8—p. 300
                       -a) feO(n’)            b) feOn’)       c) fe Om)          d) f € O(log, n)
                         e) f € O(n log, n)
                       . a) Here there are five additions and 10 multiplications.
                         b) For the general case there are n additions and 2n multiplications.
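. For $n = 1$, we find that $a_1 = 0 = \lfloor 0 \rfloor = \lfloor \log_2 1 \rfloor$, so the result is true in this first case. Now assume the result true for all $n = 1, 2, 3, \ldots, k$, where $k \ge 1$, and consider the cases for $n = k + 1$.
     (i) $n = k + 1 = 2^m$, where $m \in \mathbf{Z}^+$: Here $a_n = 1 + a_{\lfloor n/2 \rfloor} = 1 + a_{2^{m-1}} = 1 + \lfloor \log_2 2^{m-1} \rfloor = 1 + (m - 1) = m = \lfloor \log_2 2^m \rfloor = \lfloor \log_2 n \rfloor$; and
    (ii) $n = k + 1 = 2^m + r$, where $m \in \mathbf{Z}^+$ and $0 < r < 2^m$: Here $2^m < n < 2^{m+1}$, so we have
         (1) $2^{m-1} < (n/2) < 2^m$;
         (2) $2^{m-1} = \lfloor 2^{m-1} \rfloor \le \lfloor n/2 \rfloor < \lfloor 2^m \rfloor = 2^m$; and
         (3) $m - 1 = \log_2 2^{m-1} \le \log_2 \lfloor n/2 \rfloor < \log_2 2^m = m$.
    Consequently, $\lfloor \log_2 \lfloor n/2 \rfloor \rfloor = m - 1$ and $a_n = 1 + a_{\lfloor n/2 \rfloor} = 1 + \lfloor \log_2 \lfloor n/2 \rfloor \rfloor = 1 + (m - 1) = m = \lfloor \log_2 n \rfloor$. Therefore it follows from the alternative form of the Principle of Mathematical Induction that $a_n = \lfloor \log_2 n \rfloor$ for all $n \in \mathbf{Z}^+$.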
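The closed form proved above can be spot-checked against the recurrence. This sketch assumes the recurrence as it is used in the proof, namely a_1 = 0 and a_n = 1 + a_⌊n/2⌋ for n ≥ 2. It compares the result with ⌊log₂ n⌋ computed exactly through `int.bit_length`, which avoids floating-point rounding.

```python
def a(n):
    """a_1 = 0 and a_n = 1 + a_(n // 2) for n >= 2."""
    return 0 if n == 1 else 1 + a(n // 2)


def floor_log2(n):
    """Exact floor of log base 2 for a positive integer."""
    return n.bit_length() - 1


print([a(n) for n in range(1, 17)])
# [0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4]
print(all(a(n) == floor_log2(n) for n in range(1, 5000)))   # True
```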
                       » (5/8)n + (3/8)

11. a) procedure LocateRepeat (n: positive integer;
                                   a_1, a_2, a_3, ..., a_n: integers)
       begin
         location := 0
         i := 2
         while i ≤ n and location = 0 do
           begin
             j := 1
             while j < i and location = 0 do
               if a_j = a_i then location := i
               else j := j + 1
             i := i + 1
           end
       end {location is the subscript of the first array entry that
            repeats a previous array entry; location is 0 if the array
            contains n distinct integers.}
    b) $O(n^2)$
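For readers who want to run the procedure, here is one possible Python rendering of LocateRepeat. It is a sketch rather than the text's own code: it takes a Python list instead of the subscripted array and returns the 1-based subscript, so the result matches the pseudocode's `location`.

```python
def locate_repeat(a):
    """Return the 1-based subscript of the first entry that repeats an
    earlier entry, or 0 if all entries are distinct."""
    for i in range(1, len(a)):        # i is the 0-based index of a_(i+1)
        for j in range(i):            # compare with every earlier entry
            if a[j] == a[i]:
                return i + 1          # convert to a 1-based subscript
    return 0


print(locate_repeat([3, 1, 4, 1, 5]))   # 4
print(locate_repeat([1, 2, 3]))         # 0
```

In the worst case (all entries distinct) the inner comparison runs 1 + 2 + ... + (n − 1) = n(n − 1)/2 times, which is the quadratic bound stated in part (b).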

Supplementary Exercises-p. 305
1. a) If either $A$ or $B$ is $\emptyset$, then $A \times B = \emptyset = A \cap B$ and the result is true. For $A$, $B$ nonempty we find that:

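$(x, y) \in (A \times B) \cap (B \times A) \Rightarrow (x, y) \in A \times B$ and $(x, y) \in B \times A \Rightarrow (x \in A$ and $y \in B)$ and $(x \in B$ and $y \in A) \Rightarrow x \in A \cap B$ and $y \in A \cap B \Rightarrow (x, y) \in (A \cap B) \times (A \cap B)$; and

$(x, y) \in (A \cap B) \times (A \cap B) \Rightarrow (x \in A$ and $x \in B)$ and $(y \in A$ and $y \in B) \Rightarrow (x, y) \in A \times B$ and $(x, y) \in B \times A \Rightarrow (x, y) \in (A \times B) \cap (B \times A)$.

Consequently, $(A \times B) \cap (B \times A) = (A \cap B) \times (A \cap B)$.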
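b) If either $A$ or $B$ is $\emptyset$, then $A \times B = \emptyset = B \times A$ and the result follows. If not, let $(x, y) \in (A \times B) \cup (B \times A)$. Then

$(x, y) \in (A \times B) \cup (B \times A) \Rightarrow (x, y) \in A \times B$ or $(x, y) \in B \times A \Rightarrow (x \in A$ and $y \in B)$ or $(x \in B$ and $y \in A) \Rightarrow (x \in A$ or $x \in B)$ and $(y \in A$ or $y \in B) \Rightarrow x, y \in A \cup B \Rightarrow (x, y) \in (A \cup B) \times (A \cup B)$.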
    3. a) f(1) = f(1 · 1) = 1 · f(1) + 1 · f(1), so f(1) = 0.        b) f(0) = 0
       c) Proof (by Mathematical Induction): When a = 0 the result is true, so consider a ≠ 0. For
       n = 1, f(a^n) = f(a) = 1 · a^0 · f(a) = n a^{n−1} f(a), so the result follows in this first case, and
       this establishes our basis step. Assume the result true for n = k (≥ 1), that is,
       f(a^k) = k a^{k−1} f(a). For n = k + 1 we have f(a^{k+1}) = f(a · a^k) = a f(a^k) + a^k f(a) =
       a k a^{k−1} f(a) + a^k f(a) = k a^k f(a) + a^k f(a) = (k + 1) a^k f(a). Consequently, the truth of the
       result for n = k + 1 follows from the truth of the result for n = k. So by the Principle of
       Mathematical Induction the result is true for all n ∈ Z^+.
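
As a numerical illustration of part (c), one can pick a concrete function with the product property f(ab) = a f(b) + b f(a) used in the induction step. The choice f(x) = x ln x on the positive reals is an assumption made here for the sketch (it is not given in the text); it satisfies the property because ln(ab) = ln a + ln b.

```python
import math

def f(x):
    """Assumed example: f(x) = x * ln(x), which satisfies f(ab) = a*f(b) + b*f(a) for x > 0."""
    return x * math.log(x)

def product_property_holds(a, b):
    return math.isclose(f(a * b), a * f(b) + b * f(a), rel_tol=1e-9, abs_tol=1e-12)

def power_rule_holds(a, n):
    """Check f(a^n) = n * a^(n-1) * f(a) up to floating-point tolerance."""
    return math.isclose(f(a ** n), n * a ** (n - 1) * f(a), rel_tol=1e-9, abs_tol=1e-12)

print(all(power_rule_holds(a, n) for a in (0.5, 2.0, 3.7) for n in range(1, 8)))
```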
    5. (x, y) ∈ (A ∩ B) × (C ∩ D) ⇔ x ∈ A ∩ B, y ∈ C ∩ D ⇔ (x ∈ A, y ∈ C) and
       (x ∈ B, y ∈ D) ⇔ (x, y) ∈ A × C and (x, y) ∈ B × D ⇔ (x, y) ∈ (A × C) ∩ (B × D)
    7. x = 1/√2 and x = √(3/2)
    9. b) Conjecture: For n ∈ Z^+, f^n(x) = a^n(x + b) − b. Proof (by Mathematical Induction): The
       formula is true for n = 1, by the definition of f(x). Hence we have our basis step. Assume the
       formula true for n = k (≥ 1), that is, f^k(x) = a^k(x + b) − b. Now consider n = k + 1. We
       find that f^{k+1}(x) = f(f^k(x)) = f(a^k(x + b) − b) = a[(a^k(x + b) − b) + b] − b =
       a^{k+1}(x + b) − b. Since the truth of the formula at n = k implies the truth of the formula at
       n = k + 1, it follows that the formula is valid for all n ∈ Z^+, by the Principle of Mathematical
       Induction.
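
The conjectured closed form can be compared against direct iteration of f(x) = a(x + b) − b. This is a minimal sketch; the helper names and the sample values a = 3, b = 5 are assumptions chosen for illustration.

```python
def iterate(f, n, x):
    """Apply f to x exactly n times, i.e. compute f^n(x)."""
    for _ in range(n):
        x = f(x)
    return x

def closed_form(a, b, n, x):
    """The conjectured formula f^n(x) = a^n (x + b) - b."""
    return a ** n * (x + b) - b

a, b = 3, 5                      # sample integer parameters (assumed)
f = lambda x: a * (x + b) - b    # f as used in the basis step

print(all(iterate(f, n, x) == closed_form(a, b, n, x)
          for n in range(1, 8) for x in range(-3, 4)))
```

Integer arithmetic keeps the comparison exact, so no tolerance is needed.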
                          11. a) (7!)/[2(7°)]
   13. For 1 ≤ i ≤ 10, let x_i be the number of letters typed on day i. Then
       x_1 + x_2 + x_3 + ··· + x_8 + x_9 + x_10 = 84, or x_3 + ··· + x_8 = 54. Suppose that
       x_1 + x_2 + x_3 < 25, x_2 + x_3 + x_4 < 25, ..., x_8 + x_9 + x_10 < 25. Then
       x_1 + 2x_2 + 3(x_3 + ··· + x_8) + 2x_9 + x_10 < 8(25) = 200, or 3(x_3 + ··· + x_8) < 160.
       Consequently, we obtain the contradiction 54 = x_3 + ··· + x_8 < 160/3 = 53 1/3.

   15. For ∏_{k=1}^{n} (k − i_k) to be odd, (k − i_k) must be odd for all 1 ≤ k ≤ n; that is, one of k, i_k must be
       even and the other odd. Since n is odd, n = 2m + 1 and in the list 1, 2, ..., n there are m even
       integers and m + 1 odd integers. Let 1, 3, 5, ..., n be the pigeons and i_1, i_3, i_5, ..., i_n the
       pigeonholes. At most m of the pigeonholes can be even integers, so (k − i_k) must be even for at
       least one k = 1, 3, 5, ..., n. Consequently, ∏_{k=1}^{n} (k − i_k) is even.
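
For small odd n the claim can be confirmed by brute force over all arrangements i_1, ..., i_n of 1, ..., n. The function name below is an assumption; the sketch simply enumerates permutations.

```python
from itertools import permutations
from math import prod

def all_products_even(n):
    """True if the product of (k - i_k), k = 1..n, is even for every permutation of 1..n."""
    return all(
        prod(k - i_k for k, i_k in enumerate(perm, start=1)) % 2 == 0
        for perm in permutations(range(1, n + 1))
    )

print([all_products_even(n) for n in (3, 5, 7)])
```

For even n the pigeonhole argument no longer applies: with n = 4 the arrangement 2, 1, 4, 3 makes every factor odd.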
   17. Let the n distinct objects be x_1, x_2, ..., x_n. Place x_n in a container. Now there are two distinct
       containers. For each of x_1, x_2, ..., x_{n−1} there are two choices, and this gives 2^{n−1} distributions.
       Among these there is one where x_1, x_2, ..., x_{n−1} are in the container with x_n, so we remove
       this distribution and find S(n, 2) = 2^{n−1} − 1.
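
The formula S(n, 2) = 2^{n−1} − 1 can be compared with values produced by the standard recurrence S(n, m) = S(n − 1, m − 1) + m S(n − 1, m) for Stirling numbers of the second kind. The function name `stirling2` is an assumption made for this sketch.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, m):
    """Stirling number of the second kind via the standard recurrence."""
    if n == m:
        return 1
    if m == 0 or m > n:
        return 0
    return stirling2(n - 1, m - 1) + m * stirling2(n - 1, m)

print(all(stirling2(n, 2) == 2 ** (n - 1) - 1 for n in range(2, 15)))
```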
                              19. a) and b) m!S(n, m)
   21. Fix m = 1. For n = 1 the result is true. Assume f ∘ f^n = f^n ∘ f and consider f ∘ f^{n+1}.
       f ∘ f^{n+1} = f ∘ (f ∘ f^n) = f ∘ (f^n ∘ f) = (f ∘ f^n) ∘ f = f^{n+1} ∘ f. Hence f ∘ f^n = f^n ∘ f
       for all n ∈ Z^+. Now assume that for some t ≥ 1, f^t ∘ f^n = f^n ∘ f^t. Then
       f^{t+1} ∘ f^n = (f ∘ f^t) ∘ f^n = f ∘ (f^t ∘ f^n) = f ∘ (f^n ∘ f^t) = (f ∘ f^n) ∘ f^t =
       (f^n ∘ f) ∘ f^t = f^n ∘ (f ∘ f^t) = f^n ∘ f^{t+1}, so f^m ∘ f^n = f^n ∘ f^m for all m, n ∈ Z^+.
   23. Proof: Let a ∈ A. Then f(a) = g(f(f(a))) = f(g(f(f(f(a))))) = f(g ∘ f^3(a)). From
       f(a) = g(f(f(a))) we have f^2(a) = (f ∘ f)(a) = f(g(f(f(a)))). So f(a) =
                                     f(go Pla) = fief ff@)) = 2?E@ = PeP?@) = fF                        FF@)))                      =
                                     f(g(f(a))) = g(a). Consequently, f = g.
   25. a) Note that 2 = 2^1, 16 = 2^4, 128 = 2^7, 1024 = 2^10, 8192 = 2^13, and 65536 = 2^16. Consider
       the exponents on 2. If four numbers are selected from {1, 4, 7, 10, 13, 16}, there is at least one
       pair whose sum is 17. Hence if four numbers are selected from S, there are two numbers whose
       product is 2^17 = 131072.
       b) Let a, b, c, d, n ∈ Z^+. Let S = {b^a, b^{a+d}, b^{a+2d}, ..., b^{a+(n−1)d}}. If ⌈n/2⌉ + 1 numbers are
       selected from S then there are at least two of them whose product is b^{2a+(n−1)d}.
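
Part (a) is small enough to check exhaustively: every selection of four numbers from S = {2, 16, 128, 1024, 8192, 65536} should contain a pair with product 2^17. The helper name below is an assumption for this sketch.

```python
from itertools import combinations

S = [2, 16, 128, 1024, 8192, 65536]   # the powers of 2 listed in part (a)
TARGET = 2 ** 17                      # 131072

def has_pair_with_product(subset, target):
    """True if some two distinct members of subset multiply to target."""
    return any(x * y == target for x, y in combinations(subset, 2))

every_four = all(has_pair_with_product(c, TARGET) for c in combinations(S, 4))
every_three = all(has_pair_with_product(c, TARGET) for c in combinations(S, 3))
print(every_four, every_three)
```

The second value shows why four selections are needed: three numbers, one from each complementary pair, can avoid the product.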
                              27.    fog ={a,2). 0,9). @ Oh go f ={@.4).0.0, @ Wh £1 = {@, 2), 0%), (YD:
                                     g!={,y), (0).               DE GOA! ={e.4), 0.2). & DES flog:
                                     g lof l=         {x z). (yy), @, x).
   29. 2^3 · 2^2 · 3^5 = 7776 functions
   31. a) (π ∘ σ)(x) = (σ ∘ π)(x) = x      b) π^n(x) = x − n; σ^n(x) = x + n (n ≥ 2)
       c) π^{−n}(x) = x + n; σ^{−n}(x) = x − n (n ≥ 2)
   33. a) S(8, 4)      b) S(n, m)
   35. a) Let m = 1 and k = 1. Then for all n ≥ k, |f(n)| ≤ 2 < 3 ≤ |g(n)| = m|g(n)|, so f ∈ O(g).
   37. First note that if log_a n = r, then n = a^r and log_b n = log_b(a^r) = r log_b a = (log_b a)(log_a n).
       Now let m = (log_b a) and k = 1. Then for all n ≥ k, |g(n)| = log_b n = (log_b a)(log_a n) =
       m|f(n)|, so g ∈ O(f). Finally, with m = (log_b a)^{−1} = log_a b and k = 1, we find that for all
       n ≥ k, |f(n)| = log_a n = (log_a b)(log_b n) = m|g(n)|. Hence f ∈ O(g).
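
The change-of-base identity log_b n = (log_b a)(log_a n) that drives this argument can be checked numerically. The function name and the sample bases are assumptions; floating-point comparison uses a tolerance.

```python
import math

def change_of_base_holds(a, b, n):
    """Check log_b(n) == log_b(a) * log_a(n) up to floating-point tolerance."""
    lhs = math.log(n, b)
    rhs = math.log(a, b) * math.log(n, a)
    return math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12)

print(all(change_of_base_holds(2, 10, n) for n in range(1, 1000)))
```

Since the two logarithms differ only by the constant factor log_b a, each function is bounded by a constant multiple of the other, which is what the two big-Oh statements express.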

Chapter 6
              Languages: Finite State Machines

Section 6.1—p. 317
    1. a) 25; 125      b) 3906          3. 12          5. 780
    7. a) {00, 11, 000, 111, 0000, 1111}      b) {0, 1}
       c) Σ* − {λ, 00, 11, 000, 111, 0000, 1111}      d) {0, 1, 00, 11}
       e) Σ*      f) Σ* − {0, 1, 00, 11} = {λ, 01, 10} ∪ {w | ||w|| ≥ 3}
    9. a) x ∈ AC ⇒ x = ac, for some a ∈ A, c ∈ C ⇒ x ∈ BD, since A ⊆ B, C ⊆ D.
       b) If A∅ ≠ ∅, let x ∈ A∅. x ∈ A∅ ⇒ x = yz, for some y ∈ A, z ∈ ∅. But z ∈ ∅ is impossible.
       Hence A∅ = ∅. [In like manner, ∅A = ∅.]
   11. For any alphabet Σ, let B ⊆ Σ. Then, if A = B*, it follows from part (f) of Theorem 6.2 that
       A* = (B*)* = B* = A.

13. a) Here A* consists of all strings x of even length where if x # A, then x starts with 0 and ends
                             with 1, and the symbols (0 and 1) alternate.
       b) In this case A* contains precisely those strings made up of 3n 0's, for n ∈ N.
       c) Here a string x ∈ A* if (and only if)
            (i) x is a string of n 0's, for n ∈ N; or
           (ii) x is a string that starts and ends with 0, and has at least one 1 and at least two 0's
                between any two 1's.
   15. Let Σ be an alphabet with ∅ ≠ A ⊆ Σ*. If |A| = 1 and x ∈ A, then xx = x since A^2 = A. But
       ||xx|| = 2||x|| = ||x|| ⇒ ||x|| = 0 ⇒ x = λ. If |A| > 1, let x ∈ A where ||x|| > 0 but ||x|| is
       minimal. Then x ∈ A^2 ⇒ x = yz, for y, z ∈ A. Since ||x|| = ||y|| + ||z||, if ||y||, ||z|| > 0, then
       one of y, z is in A with length smaller than ||x||. Consequently, one of ||y|| or ||z|| is 0, so λ ∈ A.
   17. If A = A^2, then it follows by the Principle of Mathematical Induction that A = A^n for all
       n ∈ Z^+. Hence A = A^+. By Exercise 15, A = A^2 ⇒ λ ∈ A. Hence A = A*.
   19. By Definition 6.11, AB = {ab | a ∈ A, b ∈ B}, and since it is possible to have a_1 b_1 = a_2 b_2 with
       a_1, a_2 ∈ A, a_1 ≠ a_2, and b_1, b_2 ∈ B, b_1 ≠ b_2, it follows that |AB| ≤ |A × B| = |A||B|.
                         21. a) The words 001 and 011 have length 3 and are in A. The words 00011 and 00111 have length
                             5 and they are also in A.
       b) From step (1) we know that 1 ∈ A. Then by applying step (2) three times we get
            (i) 1 ∈ A ⇒ 011 ∈ A;
           (ii) 011 ∈ A ⇒ 00111 ∈ A; and
          (iii) 00111 ∈ A ⇒ 0001111 ∈ A.
       c) If 00001111 were in A, then from step (2) we see that this word would have to be generated
       from 000111 (in A). Likewise, 000111 in A ⇒ 0011 is in A ⇒ 01 is in A. However, there are no
       words in A of length 2; in fact, there are no words of even length in A.
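
The derivation in part (b) can be mimicked mechanically. The sketch below assumes, as parts (b) and (c) use it, that step (2) sends a word x in A to 0x1; the full recursive definition (including any base words other than 1) is not reproduced here, so the seed is a parameter and the function name is hypothetical.

```python
def derive(seed, steps):
    """Apply the assumed step (2), x -> 0x1, repeatedly, returning every word along the way."""
    words = [seed]
    for _ in range(steps):
        words.append("0" + words[-1] + "1")
    return words

chain = derive("1", 3)
print(chain)
print({len(w) % 2 for w in chain})
```

Each application adds two symbols, so a seed of odd length only ever yields words of odd length, which matches the parity observation in part (c).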
   23. a)  Steps                   Reasons
           1) () is in A.          Part (1) of the recursive definition
           2) (()) is in A.        Step (1) and part (2-ii) of the definition
           3) (())() is in A.      Steps (1) and (2) and part (2-i) of the definition
       b)  Steps                   Reasons
           1) () is in A.          Part (1) of the recursive definition
           2) (()) is in A.        Step (1) and part (2-ii) of the definition
           3) (())() is in A.      Steps (1) and (2) and part (2-i) of the definition
           4) (())()() is in A.    Steps (1) and (3) and part (2-i) of the definition
   25. Length 3: C(3, 0) + C(2, 1) = 3          Length 4: C(4, 0) + C(3, 1) + C(2, 2) = 5
       Length 5: C(5, 0) + C(4, 1) + C(3, 2) = 8          Length 6: C(6, 0) + C(5, 1) + C(4, 2) + C(3, 3) = 13
       [Here the summand C(6, 0) counts the strings where there are no 0's; the summand C(5, 1) counts
       the strings where we arrange the symbols 1, 1, 1, 1, 00; the summand C(4, 2) is for the
       arrangements of 1, 1, 00, 00; and the summand C(3, 3) counts the arrangements of 00, 00, 00.]
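
The binomial sums above can be cross-checked by enumeration. The sketch assumes, based on the bracketed explanation, that the strings being counted are exactly those built from the blocks 1 and 00, that is, binary strings in which every maximal run of 0's has even length; the function names are hypothetical.

```python
from itertools import groupby, product
from math import comb

def count_by_blocks(n):
    """Sum over j of C(n - j, j): choose positions for j blocks '00' among n - j blocks."""
    return sum(comb(n - j, j) for j in range(n // 2 + 1))

def count_by_enumeration(n):
    """Count binary strings of length n whose maximal runs of 0's all have even length."""
    def ok(word):
        return all(len(list(run)) % 2 == 0 for sym, run in groupby(word) if sym == "0")
    return sum(ok(w) for w in product("01", repeat=n))

print([count_by_blocks(n) for n in range(3, 7)])
print([count_by_enumeration(n) for n in range(3, 7)])
```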
   27. A: (1) λ ∈ A.
          (2) If a ∈ A, then 0a0, 0a1, 1a0, 1a1 ∈ A.
       B: (1) 0, 1 ∈ B.
          (2) If a ∈ B, then 0a0, 0a1, 1a0, 1a1 ∈ B.

Section 6.2—p. 324
                               - a) OO1O10OL; 5;               ~b) 9000000; 5;        ~—c)_- 001000000; so

- a) 010110             ib)

5.   a)   010000; s,                   b)   (s;) 100000;        s2           c)
                                                                  (s>) 000000; s                                      »               °
                                                                  (s3) 110010; s2                                0         1/0            1

SO     SO       S|     0        0

Ss, | Ss)       Sz}    11
                                                                                                          $2 | So         S210            O
                                                                                                          S3     SO       S3     0        |
                                                                                                          $4 | 82         83     10       1
                           d) s;        e)        x = 101 (unique)
                      7, a) (i) 15                 (ii) 349               (iii) 2!>        bb) 6
                      9. a)                        y                  o

0          1/0                #1

So | Sg             Ss; | O             O
                                   Ss, | 83            Sp | 0              O
                                   S2        53        S52        0         ]

8 | 8               8 | 90              O
                                   $4 | 85             s3|0                O
                                   SS        S583                 1        0

b) There are only two possibilities: x = 1111 or x = 0000.
                           c) A= {111}{1}* U {000} {0}*
                           d) Here A = {11111}{1}* U {00000}{0}*.

Section 6.3-—p. 332

[State diagrams not reproduced.]

    5. b) (i) 011      (ii) 0101      (iii) 00001
       c) The machine outputs a 0 followed by the first n − 1 symbols of the n-symbol input string x.
       Hence the machine is a unit delay.
       d) This machine performs the same tasks as the one in Fig. 6.13 (but has only two states).
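
The behavior described in part (c) is easy to simulate. The sketch below is a generic two-state unit delay over the input alphabet {0, 1}, written as an assumption about what such a machine does; it is not a transcription of the machine in the exercise or of Fig. 6.13.

```python
def unit_delay(x):
    """Output a 0 followed by the first n - 1 symbols of the n-symbol input string x."""
    state = "0"          # the state remembers the previous input symbol; start as if it were 0
    out = []
    for symbol in x:
        out.append(state)    # emit the remembered symbol
        state = symbol       # then remember the current one
    return "".join(out)

print(unit_delay("1011"))
```

Only the most recent symbol has to be remembered, which is why two states suffice for a binary input alphabet.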
    7. a) The transient states are s_0, s_1. State s_4 is a sink state. {s_1, s_2, s_3, s_4, s_5}, {s_4}, and {s_2, s_3, s_5}
       (with the corresponding restrictions on the given function ν) constitute submachines. The
       strongly connected submachines are {s_4} and {s_2, s_3, s_5}.
       b) States s_2, s_3 are transient. The only sink state is s_4. The set {s_0, s_1, s_3, s_4} provides the states
       for a submachine; {s_0, s_1} and {s_4} provide strongly connected submachines.

Supplementary
Exercises—p. 334
    1. a) True      b) False      c) True      d) True      e) True      f) True
    3. Let x ∈ Σ and A = {x}. Then A^2 = {xx} and (A^2)* = {λ, x^2, x^4, ...}. A* = {λ, x, x^2, x^3, ...}
       and (A*)^2 = A*, so (A^2)* ≠ (A*)^2.
                                        - Oo2 = {1, OO}{O} —                     On2 = {O}{1, OO}"{O}                           On = B
                                          Coo = {1, OO}* — {A}        Cio = {1}{1, OO}* U {10}{1, 0O}*
                                        . a) By the pigeonhole principle there is a first state s that is encountered twice. Let y be the
                                          output string that resulted since s was first encountered, until we reach this state a second time.
                                          Then from that point on the output is yyy....
       b) [State diagram not reproduced.]

   11. a)
                              ν                          ω
                         0            1              0       1
         (s_0, s_3)   (s_0, s_4)   (s_1, s_3)        1       1
         (s_0, s_4)   (s_0, s_3)   (s_1, s_4)        0       1
         (s_1, s_3)   (s_1, s_3)   (s_2, s_3)        1       1
         (s_1, s_4)   (s_1, s_4)   (s_2, s_4)        1       1
         (s_2, s_3)   (s_2, s_3)   (s_0, s_4)        1       1
         (s_2, s_4)   (s_2, s_4)   (s_0, s_3)        1       0

       b) ω((s_0, s_3), 1101) = 1111; M_1 is in state s_0, and M_2 is in state s_4.

Chapter 7
             Relations: The Second Time Around

Section 7.1 —p. 343
    1. a) {(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 1), (2, 3), (3, 2)}
       b) {(1, 1), (2, 2), (3, 3), (4, 4), (1, 2)}          c) {(1, 1), (2, 2), (1, 2), (2, 1)}
    3. a) Let f_1, f_2, f_3 ∈ F with f_1(n) = n + 1, f_2(n) = 5n, and f_3(n) = 4n + 1/n.
       b) Let g_1, g_2, g_3 ∈ F with g_1(n) = 3, g_2(n) = 1/n, and g_3(n) = sin n.
    5. a) Reflexive, antisymmetric, transitive      b) Transitive
       c) Reflexive, symmetric, transitive      d) Symmetric      e) Symmetric
       f) Reflexive, symmetric, transitive      g) Reflexive, symmetric      h) Reflexive, transitive
    7. a) For all x ∈ A, (x, x) ∈ R_1, R_2, so (x, x) ∈ R_1 ∩ R_2 and R_1 ∩ R_2 is reflexive.
       b)   (i) (x, y) ∈ R_1 ∩ R_2 ⇒ (x, y) ∈ R_1, R_2 ⇒ (y, x) ∈ R_1, R_2 ⇒ (y, x) ∈ R_1 ∩ R_2 and
                R_1 ∩ R_2 is symmetric.
           (ii) (x, y), (y, x) ∈ R_1 ∩ R_2 ⇒ (x, y), (y, x) ∈ R_1, R_2. By the antisymmetry of R_1
                (or R_2), x = y and R_1 ∩ R_2 is antisymmetric.
          (iii) (x, y), (y, z) ∈ R_1 ∩ R_2 ⇒ (x, y), (y, z) ∈ R_1, R_2 ⇒ (x, z) ∈ R_1, R_2 (transitive
                property) ⇒ (x, z) ∈ R_1 ∩ R_2, so R_1 ∩ R_2 is transitive.
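
The closure of these properties under intersection can be illustrated with small relations represented as sets of ordered pairs. The predicate names and the sample relations below are assumptions made for this sketch.

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

A = {1, 2, 3}
diag = {(x, x) for x in A}
R1 = diag | {(1, 2), (2, 1)}                # reflexive, symmetric, transitive
R2 = {(x, y) for x in A for y in A}         # the full relation on A
R3 = diag | {(1, 2)}                        # reflexive, antisymmetric, transitive
R4 = diag | {(1, 2), (1, 3)}                # reflexive, antisymmetric, transitive

both = R1 & R2
print(is_reflexive(both, A), is_symmetric(both), is_transitive(both))
print(is_antisymmetric(R3 & R4), is_transitive(R3 & R4))
```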
    9. a) False: let A = {1, 2} and R = {(1, 2), (2, 1)}.
       b)   (i) Reflexive: true
           (ii) Symmetric: false. Let A = {1, 2}, R_1 = {(1, 1)}, and R_2 = {(1, 1), (1, 2)}.
          (iii) Antisymmetric and transitive: false. Let A = {1, 2}, R_1 = {(1, 2)}, and
                R_2 = {(1, 2), (2, 1)}.
       d) True.
                               11.        ad) 60CNET
                                                  ee) 81
                                                         \=QQ=9
                                                         =f) 972
                                                                                                                  vw Is 9 (PE)
                                                                                                                             = OQ =30

   13. There may exist an element a ∈ A such that for all b ∈ A, neither (a, b) nor (b, a) is in R.
   15. r − n counts the elements in R of the form (a, b), a ≠ b. Since R is symmetric, r − n is even.
                     17.    a OO)+QG)+     0G) » OG)+ 0G) + 0G)
                            d) (7) + OG) + OG) + OE)
Section 7.2—p. 354
    1. R ∘ S = {(1, 3), (1, 4)}; S ∘ R = {(1, 2), (1, 3), (1, 4), (2, 4)};
       R^2 = R^3 = {(1, 4), (2, 4), (4, 4)}; S^2 = S^3 = {(1, 1), (1, 2), (1, 3), (1, 4)}
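
Composites like these are straightforward to compute when relations are stored as sets of ordered pairs. The exercise statement is not reproduced in this answer key, so the relations R and S below are an assumption, chosen so that the computed composites agree with the answers listed; the convention that R ∘ S applies R first and then S is likewise assumed from those answers.

```python
def compose(R, S):
    """R o S with R applied first, then S: all (a, c) with (a, b) in R and (b, c) in S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

# Assumed relations on {1, 2, 3, 4}, picked to be consistent with the stated answers.
R = {(1, 2), (1, 3), (2, 4), (4, 4)}
S = {(1, 1), (1, 2), (1, 3), (2, 3), (2, 4)}

R2 = compose(R, R)
S2 = compose(S, S)
print(sorted(compose(R, S)), sorted(compose(S, R)))
print(sorted(R2), sorted(compose(R2, R)))
print(sorted(S2), sorted(compose(S2, S)))
```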
    3. (a, d) ∈ (R_1 ∘ R_2) ∘ R_3 ⇒ (a, c) ∈ R_1 ∘ R_2, (c, d) ∈ R_3 for some c ∈ C ⇒ (a, b) ∈ R_1,
       (b, c) ∈ R_2, (c, d) ∈ R_3 for some b ∈ B, c ∈ C ⇒ (a, b) ∈ R_1, (b, d) ∈ R_2 ∘ R_3 ⇒
       (a, d) ∈ R_1 ∘ (R_2 ∘ R_3), and (R_1 ∘ R_2) ∘ R_3 ⊆ R_1 ∘ (R_2 ∘ R_3).
    5. R_1 ∘ (R_2 ∩ R_3) = R_1 ∘ {(m, 3), (m, 4)} = {(1, 3), (1, 4)}
       (R_1 ∘ R_2) ∩ (R_1 ∘ R_3) = {(1, 3), (1, 4)} ∩ {(1, 3), (1, 4)} = {(1, 3), (1, 4)}
    7. This follows by the pigeonhole principle. Here the pigeons are the 2^{n^2} + 1 integers between 0
       and 2^{n^2}, inclusive, and the pigeonholes are the 2^{n^2} relations on A.
                            221

   11. Consider the entry in the ith row and jth column of M(R_1 ∘ R_2). If this entry is a 1, then there
       exists b_k ∈ B where 1 ≤ k ≤ n and (a_i, b_k) ∈ R_1, (b_k, c_j) ∈ R_2. Consequently, the entry in the
       ith row and kth column of M(R_1) is 1, and the entry in the kth row and jth column of M(R_2)
       is 1. This results in a 1 in the ith row and jth column in the product M(R_1) · M(R_2).
           Should the entry in row i and column j of M(R_1 ∘ R_2) be 0, then for each b_k, where
       1 ≤ k ≤ n, either (a_i, b_k) ∉ R_1 or (b_k, c_j) ∉ R_2. This means that in the matrices M(R_1),
       M(R_2), if the entry in the ith row and kth column of M(R_1) is 1, then the entry in the kth row
       and jth column of M(R_2) is 0. Hence the entry in the ith row and jth column of
       M(R_1) · M(R_2) is 0.
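
The statement M(R_1 ∘ R_2) = M(R_1) · M(R_2), with the product taken using Boolean addition, can be checked on a small example. The sets, relations, and function names below are assumptions chosen for illustration.

```python
def compose(R1, R2):
    """R1 o R2 with R1 applied first: all (a, c) with (a, b) in R1 and (b, c) in R2."""
    return {(a, c) for (a, b) in R1 for (b2, c) in R2 if b == b2}

def relation_matrix(R, rows, cols):
    """0-1 matrix with a 1 in row r, column c exactly when (r, c) is in R."""
    return [[1 if (r, c) in R else 0 for c in cols] for r in rows]

def bool_product(M, N):
    """Matrix product in which 1 + 1 = 1 (Boolean addition)."""
    return [[1 if any(M[i][k] and N[k][j] for k in range(len(N))) else 0
             for j in range(len(N[0]))]
            for i in range(len(M))]

A, B, C = [1, 2], ["x", "y", "z"], ["p", "q"]      # sample sets (assumed)
R1 = {(1, "x"), (2, "y"), (2, "z")}                # relation from A to B
R2 = {("x", "p"), ("z", "q")}                      # relation from B to C

lhs = relation_matrix(compose(R1, R2), A, C)
rhs = bool_product(relation_matrix(R1, A, B), relation_matrix(R2, B, C))
print(lhs, rhs)
```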
   13. d) Let s_{xy} be the entry in row (x) and column (y) of M. Then s_{yx} appears in row (x) and
       column (y) of M^tr. R is antisymmetric ⇔ (s_{xy} = s_{yx} = 1 ⇒ x = y) ⇔ M ∩ M^tr ≤ I_n.
   15. [Directed graphs for parts (a) and (b) not reproduced.]

   17. (i) R = {(a, b), (b, a), (a, e), (e, a), (b, c), (c, b), (b, d), (d, b), (b, e), (e, b), (d, e),
           (e, d), (d, f), (f, d)}:

                       (a)  (b)  (c)  (d)  (e)  (f)
                 (a)    0    1    0    0    1    0
                 (b)    1    0    1    1    1    0
         M(R) =  (c)    0    1    0    0    0    0
                 (d)    0    1    0    0    1    1
                 (e)    1    1    0    1    0    0
                 (f)    0    0    0    1    0    0

For part (ii) the rows and columns of the relation matrix are indexed as in part (i).
                            (il) R= {(a, b), (b, e), (d, b), (d.c), (e, AD):
       M(R) = [matrix entries not recoverable from the source]

19,                                                                     R3 and Ki

4

21.   a) 2°                         b) 2"
   23. a)
                 1 1 0 0 0                 1 1 1 0 0
                 1 1 0 0 0                 1 1 1 0 0
          R_1:   0 0 1 1 0          R_2:   1 1 1 0 0
                 0 0 1 1 0                 0 0 0 1 1
                 0 0 0 0 1                 0 0 0 1 1

       b) Given an equivalence relation R on a finite set A, list the elements of A so that elements in
       the same cell of the partition (see Section 7.4) are adjacent. The resulting relation matrix will
       then have square blocks of 1's along the diagonal (from upper left to lower right).
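
The block structure described in part (b) can be seen by building the relation matrix directly from a partition. The function name and the two sample partitions of {1, 2, 3, 4, 5} are assumptions; the partitions were picked so that the resulting matrices have the block shapes shown in part (a).

```python
def matrix_from_partition(cells):
    """Relation matrix of the equivalence relation whose classes are the given cells,
    with the elements listed cell by cell so that related elements are adjacent."""
    order = [x for cell in cells for x in cell]
    cell_of = {x: i for i, cell in enumerate(cells) for x in cell}
    return [[1 if cell_of[x] == cell_of[y] else 0 for y in order] for x in order]

for row in matrix_from_partition([[1, 2], [3, 4], [5]]):
    print(row)
```

Each cell of size k contributes a k-by-k block of 1's on the diagonal, and every entry outside those blocks is 0.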
                          25.   (s;)  a:
                                 (82)
                                             aOoWwveraooege

(53)
                                 (4)
                                 (ss)
                                 (56)
                                 (s7)
                                 (sg)
                          27. n = 38

Section 7.3—p. 364

    1. For all a ∈ A, b ∈ B, we have a R_1 a and b R_2 b, so (a, b) R (a, b) and R is reflexive.
       (a, b) R (c, d), (c, d) R (a, b) ⇒ a R_1 c, c R_1 a and b R_2 d, d R_2 b ⇒ a = c, b = d ⇒
       (a, b) = (c, d), so R is antisymmetric. (a, b) R (c, d), (c, d) R (e, f) ⇒ a R_1 c, c R_1 e and
       b R_2 d, d R_2 f ⇒ a R_1 e, b R_2 f ⇒ (a, b) R (e, f), and this implies that R is transitive.
    3. ∅ < {1} < {2} < {3} < {1, 2} < {1, 3} < {2, 3} < {1, 2, 3} (There are other possibilities.)
    5. a) 4      b) 3 < 2 < 1 < 4 or 3 < 1 < 2 < 4      c) 2
    7. [Hasse diagram not reproduced.]
    9. Let x, y both be least upper bounds. Then x R y, since y is an upper bound and x is a least
       upper bound. Likewise, y R x. R antisymmetric ⇒ x = y. (The proof for the glb is similar.)
   11. Let U = {1, 2}, A = P(U), and let R be the inclusion relation. Then (A, R) is a poset but not a
       total order. Let B = {∅, {1}}. Then (B × B) ∩ R is a total order.
                          13.   n+ (5)

   15. a) The n elements of A are arranged along a vertical line. For if A = {a_1, a_2, ..., a_n} where
       a_1 R a_2 R a_3 R ··· R a_n, then the diagram can be drawn as follows:

       b) n!
   17.           lub          glb
       a)    {1, 2}           ∅
       b)    {1, 2, 3}        ∅
       c)    {1, 2}           ∅
       d)    {1, 2, 3}        {1}
       e)    {1, 2, 3}        ∅
   19. For each a ∈ Z it follows that a R a because a − a = 0, an even nonnegative integer. Hence R
       is reflexive. If a, b, c ∈ Z with a R b and b R c, then

                                 a − b = 2m,    for some m ∈ N
                                 b − c = 2n,    for some n ∈ N,

       and a − c = (a − b) + (b − c) = 2(m + n), where m + n ∈ N. Therefore, a R c and R is
       transitive. Finally, suppose that a R b and b R a for some a, b ∈ Z. Then a − b and b − a are
       both nonnegative integers. Since this can only occur for a − b = b − a = 0, we find that
       [a R b ∧ b R a] ⇒ a = b, so R is antisymmetric.
           Consequently, the relation R is a partial order for Z. But it is not a total order. For example,
       2, 3 ∈ Z and we have neither 2 R 3 nor 3 R 2, because neither −1 nor 1, respectively, is a
       nonnegative even integer.
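
The three partial-order properties, and the failure of totality, can be checked on a finite window of integers. The sketch assumes, as the proof states, that a R b means a − b is a nonnegative even integer; the function name and the window −6..6 are illustrative choices.

```python
def related(a, b):
    """a R b exactly when a - b is a nonnegative even integer."""
    d = a - b
    return d >= 0 and d % 2 == 0

Z = range(-6, 7)   # finite sample of the integers

reflexive = all(related(a, a) for a in Z)
antisymmetric = all(a == b for a in Z for b in Z if related(a, b) and related(b, a))
transitive = all(related(a, c)
                 for a in Z for b in Z for c in Z
                 if related(a, b) and related(b, c))
total = all(related(a, b) or related(b, a) for a in Z for b in Z)
print(reflexive, antisymmetric, transitive, total)
```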
                     21. b) & c) Here the least element (and only minimal element) is (0, 0). The element (2, 2) is the
                         greatest element (and the only maximal element).
       d) (0, 0) R (0, 1) R (0, 2) R (1, 0) R (1, 1) R (1, 2) R (2, 0) R (2, 1) R (2, 2)
   23. a) False. Let U = {1, 2}, A = P(U), and let R be the inclusion relation. Then (A, R) is a
       lattice where for all S, T ∈ A, lub{S, T} = S ∪ T and glb{S, T} = S ∩ T. However, {1} and
       {2} are not related, so (A, R) is not a total order.
   25. a) a      b) a      c) c      d) e      e) z      f) e      g) v
       (A, R) is a lattice with z the greatest (and only maximal) element and a the least (and only
       minimal) element.
   27. a) 3      b) m      c) 17      d) m + n + 2mn      e) 133
       f) m + n + k + 2(mn + mk + nk) + 3mnk      g) 1484
       h) m + n + k + ℓ + 2(mn + mk + mℓ + nk + nℓ + kℓ) + 3(mnk + mnℓ + mkℓ + nkℓ) +
          4mnkℓ
   29. 429 = [1/(k + 2)] C(2(k + 1), k + 1), so k = 6, and there are 2 · 7 = 14 positive integer divisors of p^6 q.

Section 7.4—p. 370
    1. a) Here the collection A_1, A_2, A_3 provides a partition of A.
       b) Although A = A_1 ∪ A_2 ∪ A_3 ∪ A_4, we have A_1 ∩ A_2 ≠ ∅, so the collection A_1, A_2, A_3, A_4
       does not provide a partition for A.
    3. R = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)}
    5. R is not transitive since 1 R 2 and 2 R 3 but (1, 3) ∉ R.

a) Forall (x, y)€ Ax ty =x+y > (x, vy) R(x, y).
                               (xy, Vi)R (x2, Ya) => H+ Yr = X2 + yo Xd + 2 = Xr FV => (2, 2) RH, Vi).
                               (xy. vi) R (x2, V2), (2, Yo) R (x3, V3) => Hr +) = 2 + yr, X2 + Yr = 3 + 3, SO
                               xy + y, = x3 + y3 and (x, y)) RK (x3, v3). Since RK is reflexive, symmetric, and transitive, it is
                               an equivalence relation.
                          b)      [C,3)] = {, 3), @, 2), GB, Di 12, Y=           (0, 5), 2. 9, (3. 3), 4G 2), G. D}
                                  [d, D] = {d, D}

                         c) $A = \{(1, 1)\} \cup \{(1, 2), (2, 1)\} \cup \{(1, 3), (2, 2), (3, 1)\} \cup \{(1, 4), (2, 3), (3, 2), (4, 1)\} \cup$
                            $\{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)\} \cup \{(2, 5), (3, 4), (4, 3), (5, 2)\} \cup$
                            $\{(3, 5), (4, 4), (5, 3)\} \cup \{(4, 5), (5, 4)\} \cup \{(5, 5)\}$
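
The partition in part (c) can be reproduced mechanically. The sketch below is only an illustration: it assumes, based on the cells listed above, that A is the set of ordered pairs with both coordinates in {1, ..., 5}, and the helper name `classes_by_sum` is hypothetical.

```python
from collections import defaultdict
from itertools import product

def classes_by_sum(values):
    """Group ordered pairs (x, y) by x + y; each group is one equivalence class."""
    groups = defaultdict(list)
    for x, y in product(values, repeat=2):
        groups[x + y].append((x, y))
    # Return the cells ordered by the common coordinate sum.
    return [groups[s] for s in sorted(groups)]

# Assumption: A = {1,...,5} x {1,...,5}, inferred from the cells listed in the answer.
cells = classes_by_sum(range(1, 6))
for cell in cells:
    print(cell)
```

Two pairs are related exactly when they land in the same group, so the groups are the equivalence classes and their union is all of A.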
                      9. a) For all $(a, b) \in A$ we have $ab = ab$, so $(a, b)\,\mathcal{R}\,(a, b)$ and $\mathcal{R}$ is reflexive. To see that $\mathcal{R}$ is
                            symmetric, suppose that $(a, b), (c, d) \in A$ and that $(a, b)\,\mathcal{R}\,(c, d)$. Then $(a, b)\,\mathcal{R}\,(c, d) \Rightarrow$
                            $ad = bc \Rightarrow cb = da \Rightarrow (c, d)\,\mathcal{R}\,(a, b)$, so $\mathcal{R}$ is symmetric. Finally, let $(a, b), (c, d),$
                            $(e, f) \in A$ with $(a, b)\,\mathcal{R}\,(c, d)$ and $(c, d)\,\mathcal{R}\,(e, f)$. Then $(a, b)\,\mathcal{R}\,(c, d) \Rightarrow ad = bc$ and
                            $(c, d)\,\mathcal{R}\,(e, f) \Rightarrow cf = de$, so $adf = bcf = bde$ and since $d \neq 0$, we have $af = be$. But
                            $af = be \Rightarrow (a, b)\,\mathcal{R}\,(e, f)$, and consequently $\mathcal{R}$ is transitive. It follows from the above that $\mathcal{R}$
                            is an equivalence relation on A.
                         b) $[(2, 14)] = \{(2, 14)\}$;  $[(-3, -9)] = \{(-3, -9), (-1, -3), (4, 12)\}$;
                            $[(4, 8)] = \{(-2, -4), (1, 2), (3, 6), (4, 8)\}$
                         c) There are five cells in the partition; in fact,
                            $A = [(-4, -20)] \cup [(-3, -9)] \cup [(-2, -4)] \cup [(-1, -11)] \cup [(2, 14)]$.
                         11. 9)                 48         026          @ ()+49+29+O+O
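                         13. 300
                         15. Let $\{A_i\}_{i \in I}$ be a partition of a set A. Define $\mathcal{R}$ on A by $x\,\mathcal{R}\,y$ if for some $i \in I$, we have
                             $x, y \in A_i$. For each $x \in A$, $x, x \in A_i$ for some $i \in I$, so $x\,\mathcal{R}\,x$ and $\mathcal{R}$ is reflexive.
                             $x\,\mathcal{R}\,y \Rightarrow x, y \in A_i$ for some $i \in I \Rightarrow y, x \in A_i$ for some $i \in I \Rightarrow y\,\mathcal{R}\,x$, so $\mathcal{R}$ is symmetric. If
                             $x\,\mathcal{R}\,y$ and $y\,\mathcal{R}\,z$, then $x, y \in A_i$ and $y, z \in A_j$ for some $i, j \in I$. Since $A_i \cap A_j$ contains y and
                             $\{A_i\}_{i \in I}$ is a partition, from $A_i \cap A_j \neq \emptyset$ it follows that $A_i = A_j$, so $i = j$. Hence $x, z \in A_i$, so
                             $x\,\mathcal{R}\,z$ and $\mathcal{R}$ is transitive.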
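
The argument in Exercise 15 can be checked on a small case. The following is a minimal sketch, not part of the original solution; the function names and the sample partition {1, 2}, {3, 4}, {5} are chosen here purely for illustration.

```python
from itertools import product

def relation_from_cells(cells):
    """Relate x to y whenever some cell contains both of them."""
    return {(x, y) for cell in cells for x, y in product(cell, repeat=2)}

def is_equivalence(rel, universe):
    """Check reflexivity, symmetry, and transitivity by brute force."""
    reflexive = all((x, x) in rel for x in universe)
    symmetric = all((y, x) in rel for (x, y) in rel)
    transitive = all((x, w) in rel
                     for (x, y) in rel for (z, w) in rel if y == z)
    return reflexive and symmetric and transitive

# A genuine partition gives an equivalence relation.
partition = [{1, 2}, {3, 4}, {5}]
rel = relation_from_cells(partition)

# Overlapping cells (not a partition) break transitivity: 1 R 2, 2 R 3, but not 1 R 3.
overlapping = [{1, 2}, {2, 3}]
bad_rel = relation_from_cells(overlapping)
```

The second example shows where the proof uses disjointness: without it, the step from $A_i \cap A_j \neq \emptyset$ to $A_i = A_j$ fails.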
                         17. Proof: Since $\{B_1, B_2, B_3, \ldots, B_n\}$ is a partition of B, we have $B = B_1 \cup B_2 \cup B_3 \cup \cdots \cup B_n$.
                             Therefore $A = f^{-1}(B) = f^{-1}(B_1 \cup \cdots \cup B_n) = f^{-1}(B_1) \cup \cdots \cup f^{-1}(B_n)$ [by generalizing
                             part (b) of Theorem 5.10]. For $1 \le i < j \le n$, $f^{-1}(B_i) \cap f^{-1}(B_j) = f^{-1}(B_i \cap B_j) =$
                             $f^{-1}(\emptyset) = \emptyset$. Consequently, $\{f^{-1}(B_i) \mid 1 \le i \le n,\ f^{-1}(B_i) \neq \emptyset\}$ is a partition of A.
                             Note: Part (b) of Example 7.56 is a special case of this result.

Section 7.5—p. 376
                           . a) sy and ss are equivalent.     b) s2 and ss are equivalent.
                             €) sz and s7 are equivalent; s3 and s4 are equivalent.
                           - a)    s; and s; are equivalent; s4 and ss; are equivalent.
                             b)     @ 0000                                                                      p              @
                                   (ii) 0                                                       M::'O                1:0           4
                                   (ili) OO
                                                                                                S]        S34       S]     I       0

52        S|        AY]    1       0

53        |S        Ss; |} 1       O
                                                                                                S54       53        S4     0       0

S56       52        S]     I       0

Supplementary
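Exercises—p. 378           1. a) False. Let $A = \{1, 2\}$, $I = \{1, 2\}$, $\mathcal{R}_1 = \{(1, 1)\}$, and $\mathcal{R}_2 = \{(2, 2)\}$. Then $\bigcup_{i \in I} \mathcal{R}_i$ is
                             reflexive, but neither $\mathcal{R}_1$ nor $\mathcal{R}_2$ is reflexive. Conversely, however, if $\mathcal{R}_i$ is reflexive for all
                             (actually, at least one) $i \in I$, then $\bigcup_{i \in I} \mathcal{R}_i$ is reflexive.
                           3. $(a, c) \in \mathcal{R}_2 \circ \mathcal{R}_1 \Rightarrow$ for some $b \in A$, $(a, b) \in \mathcal{R}_2$, $(b, c) \in \mathcal{R}_1$. With $\mathcal{R}_1, \mathcal{R}_2$ symmetric,
                             $(b, a) \in \mathcal{R}_2$, $(c, b) \in \mathcal{R}_1$, so $(c, a) \in \mathcal{R}_1 \circ \mathcal{R}_2 \subseteq \mathcal{R}_2 \circ \mathcal{R}_1$. $(c, a) \in \mathcal{R}_2 \circ \mathcal{R}_1 \Rightarrow (c, d) \in \mathcal{R}_2$,
                             $(d, a) \in \mathcal{R}_1$, for some $d \in A$. Then $(d, c) \in \mathcal{R}_2$, $(a, d) \in \mathcal{R}_1$ by symmetry, and
                             $(a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2$, so $\mathcal{R}_2 \circ \mathcal{R}_1 \subseteq \mathcal{R}_1 \circ \mathcal{R}_2$ and the result follows.
                           5. $(c, a) \in (\mathcal{R}_1 \circ \mathcal{R}_2)^c \Leftrightarrow (a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2 \Leftrightarrow (a, b) \in \mathcal{R}_1$, $(b, c) \in \mathcal{R}_2$, for some $b \in B \Leftrightarrow$
                             $(b, a) \in \mathcal{R}_1^c$, $(c, b) \in \mathcal{R}_2^c$, for some $b \in B \Leftrightarrow (c, a) \in \mathcal{R}_2^c \circ \mathcal{R}_1^c$.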

                                7. Let $U = \{1, 2, 3, 4, 5\}$, $A = \mathcal{P}(U) - \{U, \emptyset\}$. Under the inclusion relation, A is a poset with the
                                   five minimal elements $\{x\}$, $1 \le x \le 5$, but no least element. Also, A has five maximal
                                   elements (the five subsets of U of size 4) but no greatest element.
                               9. n=10

11. a)       Adjacency | Index                   b)   Adjacency | Index           °)   Adjacency | Index
                                                  List         List                      List       List                     List             List

1          2     1      |]              |         2    1      1            |          2     1          1
                                              2          3     2        2            2          3   2      2            2           3     2          2
                                              3          1     3           3         3          |   3      3            3           1     3          3
                                              4          4     4      $5             4          5   4      4            4           4     4          6
                                                         )     5        6            5          4   5      5            a)                5          7
                                              6          3     6        8                           6      6            6  I              6          8
                                              7          5                                                              7           #4
                              13. b) The cells of the partition are the connected components of G.
                              15. One possible order is 10, 3, 8, 6, 7, 9, 1, 4, 5, 2, where program 10 is run first and program 2 last.
                              17. b) $[(0.3, 0.7)] = \{(0.3, 0.7)\}$    $[(0.5, 0)] = \{(0.5, 0)\}$    $[(0.4, 1)] = \{(0.4, 1)\}$
                                     $[(0, 0.6)] = \{(0, 0.6), (1, 0.6)\}$    $[(1, 0.2)] = \{(0, 0.2), (1, 0.2)\}$
                                     In general, if $0 < a < 1$, then $[(a, b)] = \{(a, b)\}$; otherwise, $[(0, b)] = \{(0, b), (1, b)\} = [(1, b)]$.
                                  c) The lateral surface of a cylinder of height 1 and base radius $1/(2\pi)$.
                              19. $4^n - 2(3^n) + 2^n$
                              21. a) (i) $B\,\mathcal{R}\,A\,\mathcal{R}\,C$    (ii) $B\,\mathcal{R}\,C\,\mathcal{R}\,E$
                                     $B\,\mathcal{R}\,A\,\mathcal{R}\,C\,\mathcal{R}\,F$ is a maximal chain. There are six such maximal chains.
                                  b) Here $11\,\mathcal{R}\,385$ is a maximal chain of length 2, while $2\,\mathcal{R}\,6\,\mathcal{R}\,12$ is one of length 3. The
                                     length of a longest chain for this poset is 3.
                                  c) (i) $\emptyset \subseteq \{1\} \subseteq \{1, 2\} \subseteq \{1, 2, 3\} \subseteq U$;  (ii) $\emptyset \subseteq \{2\} \subseteq \{2, 3\} \subseteq \{1, 2, 3\} \subseteq U$
                                     There are $4! = 24$ such maximal chains.
                                  d) $n!$
                              23. Let $a_1\,\mathcal{R}\,a_2\,\mathcal{R} \cdots \mathcal{R}\,a_{n-1}\,\mathcal{R}\,a_n$ be a longest (maximal) chain in $(A, \mathcal{R})$. Then $a_n$ is a maximal
                                  element in $(A, \mathcal{R})$ and $a_1\,\mathcal{R}\,a_2\,\mathcal{R} \cdots \mathcal{R}\,a_{n-1}$ is a maximal chain in $(B, \mathcal{R}')$. Hence the length
                                  of a longest chain in $(B, \mathcal{R}')$ is at least $n - 1$. If there is a chain $b_1\,\mathcal{R}'\,b_2\,\mathcal{R}' \cdots \mathcal{R}'\,b_n$ in
                                  $(B, \mathcal{R}')$ of length n, then this is also a chain of length n in $(A, \mathcal{R})$. But then $b_n$ must be a
                                  maximal element of $(A, \mathcal{R})$, and this contradicts $b_n \in B$.
                              25. If $n = 1$, then for all $x, y \in A$, if $x \neq y$ then $x \not\mathcal{R}\; y$ and $y \not\mathcal{R}\; x$. Hence $(A, \mathcal{R})$ is an antichain,
                                  and the result follows. Now assume the result true for $n = k \ge 1$, and let $(A, \mathcal{R})$ be a poset
                                  where the length of a longest chain is $k + 1$. If M is the set of all maximal elements in $(A, \mathcal{R})$,
                                  then $M \neq \emptyset$ and M is an antichain in $(A, \mathcal{R})$. Also, by virtue of Exercise 23, $(A - M, \mathcal{R}')$, for
                                  $\mathcal{R}' = ((A - M) \times (A - M)) \cap \mathcal{R}$, is a poset with k the length of a longest chain. So by the
                                  induction hypothesis, $A - M = C_1 \cup C_2 \cup \cdots \cup C_k$, a partition into k antichains.
                                  Consequently, $A = C_1 \cup C_2 \cup \cdots \cup C_k \cup M$, a partition into $k + 1$ antichains.
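
The induction in Exercise 25 is constructive: repeatedly strip off the set of maximal elements. Here is a minimal sketch of that idea, under the assumption of a finite poset given by a comparison function; the name `antichain_layers` and the sample poset ({1, ..., 12} under divisibility) are illustrative choices, not taken from the text.

```python
def antichain_layers(elements, leq):
    """Partition a finite poset into antichains by repeatedly removing maximal elements."""
    remaining = set(elements)
    layers = []
    while remaining:
        # a is maximal if no other remaining element lies strictly above it.
        maximal = {a for a in remaining
                   if not any(a != b and leq(a, b) for b in remaining)}
        layers.append(maximal)
        remaining -= maximal
    return layers

def divides(a, b):
    """Partial order used for the example: a <= b means a divides b."""
    return b % a == 0

layers = antichain_layers(range(1, 13), divides)
for layer in layers:
    print(sorted(layer))
```

For this example a longest chain is 1, 2, 4, 8 (length 4, counting elements as in Exercise 21), and the procedure produces the same number of antichains.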
                              27. a) $n$    b) $2^{n-1}$    c) 64

Chapter 8
          The Principle of Inclusion and Exclusion
Section 8.1—p. 396
                                1. Let $x \in S$ and let n be the number of conditions (from among $c_1, c_2, c_3, c_4$) satisfied by x.
                                    ($n = 0$):  Here x is counted once in $N(\bar{c}_2\bar{c}_3\bar{c}_4)$ and once in $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$.
                                    ($n = 1$):  If x satisfies $c_1$ (and not $c_2, c_3, c_4$), then x is counted once in $N(\bar{c}_2\bar{c}_3\bar{c}_4)$ and once in
                                    $N(c_1\bar{c}_2\bar{c}_3\bar{c}_4)$.
                                    If x satisfies $c_i$, for $i \neq 1$, then x is not counted in any of the three terms in the equation.
                                    ($n = 2, 3, 4$):  If x satisfies at least two of the four conditions, then x is not counted in any of
                                    the three terms in the equation.
                                    The preceding observations show that the two sides of the given equation count the same
                                    elements from S, and this provides a combinatorial proof for the formula $N(\bar{c}_2\bar{c}_3\bar{c}_4) =$
                                    $N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) + N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$.
                                3. a) 12    b) 3       5. a) 534    b) 458    c) 16
                         Ge

. 4,460,400 9, (27 )- (G1) + GG) — OG)
                         11.    a5) (3)    -     (2) + AO]           13. 26! — [3(23!) + 24!] + (20! + 21)
                         15.    e- (7)5® + (348 = (3)3° + (5)2* — QJ /6°
                         17.9 '/ (BD? ] -— 3 [7Y/1BY7]] + 361/39 — 3!            19. 651/7776 = 0.08372
                         21. a) 32,      ~b) 96       e) 3200      23. a) 27! ~~ b) 2" '(p- 1)
                         25. a) 1600        _—b) 4399       27. (17) = (32) = (48) = 16
                         29. If 4 divides $\phi(n)$, then one of the following must hold:
                             (1) n is divisible by 8;
                             (2) n is divisible by two (or more) distinct odd primes;
                             (3) n is divisible by an odd prime p (such as 5, 13, and 17) where 4 divides $p - 1$; or
                             (4) n is divisible by 4 (and not 8) and at least one odd prime.
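
The four cases in Exercise 29 can be spot-checked numerically. The sketch below is an illustration only: the helper names are hypothetical, the search bound of 5000 is arbitrary, and a finite search is of course not a proof.

```python
def totient(n):
    """Euler's phi function, computed from the prime factorization by trial division."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def odd_prime_divisors(n):
    """Distinct odd primes dividing n (n >= 1)."""
    while n % 2 == 0:
        n //= 2
    primes, p = [], 3
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 2
    if n > 1:
        primes.append(n)
    return primes

def one_condition_holds(n):
    """True when n meets at least one of conditions (1)-(4) above."""
    odd = odd_prime_divisors(n)
    return (n % 8 == 0                              # (1)
            or len(odd) >= 2                        # (2)
            or any(p % 4 == 1 for p in odd)         # (3)
            or (n % 4 == 0 and len(odd) >= 1))      # (4); multiples of 8 are already caught by (1)

counterexamples = [n for n in range(1, 5001)
                   if totient(n) % 4 == 0 and not one_condition_holds(n)]
print(counterexamples)
```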

Section 8.2-p. 401
                               1. $E_0 = 768$; $E_1 = 205$; $E_2 = 40$; $E_3 = 10$; $E_4 = 0$; $E_5 = 1$.    $\sum_{i=0}^{5} E_i = 1024 = N$
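                           3. a) $[14!/(2!)^5] - \binom{5}{1}[13!/(2!)^4] + \binom{5}{2}[12!/(2!)^3] - \binom{5}{3}[11!/(2!)^2] + \binom{5}{4}[10!/2!] - \binom{5}{5}[9!]$
                              b) $E_2 = \binom{5}{2}[12!/(2!)^3] - \binom{3}{1}\binom{5}{3}[11!/(2!)^2] + \binom{4}{2}\binom{5}{4}[10!/2!] - \binom{5}{3}\binom{5}{5}[9!]$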
                                 & Ls= () [1y@p']
                               ~ Ly= 6132; Lz = 6136
                                                     - (3 QAU/20+ QOL
                               a     D3 -yaen’)1/@)                      bb (ee.cpr        ag? a]           G3)
                                c)   (G9) - 3(3) (13) 1/3 )

Section 8.3—p. 403
                               1. $10! - \binom{5}{1}9! + \binom{5}{2}8! - \binom{5}{3}7! + \binom{5}{4}6! - \binom{5}{5}5!$       3. 44
                         Sou

~a) T—d,         (d= Me!)        Db) dog = (26!)e7!
                                 n= 11         9. (0Ndiy = (10!)?(e7!)
                               . a) (dio)?
                                      = (10)2e?            db) YO 84(—      (9) [10 — LF
                         13. For all $n \in \mathbf{Z}^+$, $n!$ counts the total number of permutations of $1, 2, 3, \ldots, n$. Each such
                             permutation will have k elements that are deranged (that is, there are k elements $x_1, x_2, \ldots, x_k$
                             in $\{1, 2, 3, \ldots, n\}$ where $x_1$ is not in position $x_1$, $x_2$ is not in position $x_2$, ..., and $x_k$ is not in
                             position $x_k$) and $n - k$ elements that are fixed (that is, the $n - k$ elements $y_1, y_2, \ldots, y_{n-k}$ in
                             $\{1, 2, 3, \ldots, n\} - \{x_1, x_2, \ldots, x_k\}$ are such that $y_1$ is in position $y_1$, $y_2$ is in position $y_2$, ...,
                             and $y_{n-k}$ is in position $y_{n-k}$).
                                 The $n - k$ fixed elements can be chosen in $\binom{n}{n-k}$ ways, and the remaining k elements can
                             then be permuted (that is, deranged) in $d_k$ ways. Hence there are $\binom{n}{n-k} d_k = \binom{n}{k} d_k$ permutations
                             of $1, 2, 3, \ldots, n$ with $n - k$ fixed elements (and k deranged elements). As k varies from 0 to n
                             we count all of the $n!$ permutations of $1, 2, 3, \ldots, n$ according to the number k of deranged
                             elements.
                                 Consequently,

                             $n! = \binom{n}{0} d_0 + \binom{n}{1} d_1 + \binom{n}{2} d_2 + \cdots + \binom{n}{n} d_n = \sum_{k=0}^{n} \binom{n}{k} d_k$.
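
The identity at the end of Exercise 13 is easy to confirm for small n. This is a minimal sketch; it generates the derangement numbers with the standard recurrence $d_n = (n - 1)(d_{n-1} + d_{n-2})$, which is not quoted in this solution, and the function name is hypothetical.

```python
from math import comb, factorial

def derangement_numbers(n_max):
    """Return [d_0, d_1, ..., d_n_max] using d_n = (n - 1)(d_{n-1} + d_{n-2})."""
    d = [1, 0]
    for n in range(2, n_max + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: n_max + 1]

d = derangement_numbers(10)

# Compare n! with the sum over k of C(n, k) * d_k for each n.
for n in range(11):
    total = sum(comb(n, k) * d[k] for k in range(n + 1))
    print(n, total, factorial(n))
```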

15.    (Gm — Dt — (@ —2!+ G)@—-3!—--- +)" 16,2 )OD + (12)

Sections 8.4
and 8.5—p. 410                        a) (8) + G)8x + (8B Tx? + B)(8-7- Ox? + AB-7-6-S)xt te                                + G)BY28 =
                                            5g (Q) PO, tx!
                                      b) Sieg (YP(a. dx
                                      3. a) (i) $(1 + 2x)^3$    (ii) $1 + 8x + 14x^2 + 4x^3$
                                            (iii) $1 + 9x + 25x^2 + 21x^3$    (iv) $1 + 8x + 16x^2 + 7x^3$
                                        b) If the board C consists of n steps, and each step has k blocks, then $r(C, x) = (1 + kx)^n$.
                                      . $5! - 8(4!) + 21(3!) - 20(2!) + 6(1!) = 20$       9. a) 20    b) 3/10
                                      . $(6!/2!) - 9(5!/2!) + 27(4!/2!) - 31(3!/2!) + 12 = 63$

Supplementary
Exercises—p. 413                       1343 [ays] (2) -OG)+ OQ] — & Lon Qe -o!
                                      1 o(—D* (2) (62) 0 — &)! = 1,764,651,461
                                9, Let T = (13!)/(2))°.
                                       a) ([()09/29"] — [()G)@9/25] + [6)(3) 8d) /7
                                       b) [T — (Es + Es)] /T, where Ey = [(7)(9/(2)] — [EQ]                      and Zs = QB
                               11. a) ("~")        13. 84
                               15. a) $S_1 = \{1, 5, 7, 11, 13, 17\}$      $S_2 = \{2, 4, 8, 10, 14, 16\}$
                                      $S_3 = \{3, 15\}$                   $S_6 = \{6, 12\}$
                                      $S_9 = \{9\}$                       $S_{18} = \{18\}$
                                   b) $|S_1| = 6 = \phi(18)$    $|S_3| = 2 = \phi(6)$    $|S_9| = 1 = \phi(2)$
                                      $|S_2| = 6 = \phi(9)$     $|S_6| = 2 = \phi(3)$    $|S_{18}| = 1 = \phi(1)$
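
The sets in Exercise 15 can be regenerated by grouping 1, ..., 18 by their gcd with 18. That reading of $S_d$ is an assumption inferred from the listed sets, and the function names below are illustrative.

```python
from math import gcd

def totient(n):
    """Euler's phi function by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def gcd_classes(n):
    """Map each divisor d of n to the list of k in 1..n with gcd(k, n) = d."""
    classes = {}
    for k in range(1, n + 1):
        classes.setdefault(gcd(k, n), []).append(k)
    return classes

classes = gcd_classes(18)
for d in sorted(classes):
    # Each class S_d has phi(18 / d) members.
    print(d, classes[d], totient(18 // d))
```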
                               17. a) If n is even, then by the Fundamental Theorem of Arithmetic (Theorem 4.11) we may write
                                      $n = 2^k m$, where $k \ge 1$ and m is odd. Then $2n = 2^{k+1} m$ and $\phi(2n) = (2^{k+1})(1 - \frac{1}{2})\phi(m) =$
                                      $2^k \phi(m) = 2(2^k)(\frac{1}{2})\phi(m) = 2[2^k(1 - \frac{1}{2})\phi(m)] = 2[\phi(2^k m)] = 2\phi(n)$.
                                   b) When n is odd, we find that $\phi(2n) = (2n)(1 - \frac{1}{2}) \prod_{p \mid n} (1 - \frac{1}{p})$, where the product is
                                      taken over all (odd) primes dividing n. (If $n = 1$, then $\prod_{p \mid n} (1 - \frac{1}{p})$ is 1.) But
                                      $(2n)(1 - \frac{1}{2}) \prod_{p \mid n} (1 - \frac{1}{p}) = n \prod_{p \mid n} (1 - \frac{1}{p}) = \phi(n)$.
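
Both parts of Exercise 17 can be confirmed by direct computation for small n. This is only a numerical sketch with a brute-force totient; the function names and the bound 200 are arbitrary choices.

```python
from math import gcd

def totient(n):
    """Euler's phi function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def doubling_rule_holds(n):
    """phi(2n) should equal 2*phi(n) for even n and phi(n) for odd n."""
    expected = 2 * totient(n) if n % 2 == 0 else totient(n)
    return totient(2 * n) == expected

print(all(doubling_rule_holds(n) for n in range(1, 200)))
```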
                               19,     a) dy(12!)*      bb) ({)a3(12)*    ee) da(din)*

Chapter 9
                     Generating Functions
Section 9.1—p. 417
                                     . a) The coefficient of x” in (1 +x +x?7+---+4+x’)4
                                       b)   The coefficient
                                                       of x7°     in (1 tx+tx74-         ++    x70)?   (1 txPtxt      pee     tx)? or
                                       (ltxtxrt---)? (lta? tat+---)?
                                       c) The coefficient of x* in (x? + x3 +. x4) (xe txt t--.+x3)4
                                       d) The coefficient of x°° in (1 +.x 42° +---+2°°)? (1 txr txt pe. 4x).
                                       (xt x8 tad tee tx”) or(lL¢          x tx t+---)PP (4x? 4+274+--)) (x +x8 +x 4+)
                                     . a) The coefficient of x!° in(1+x+x?+x°+---)®
                                       b) The coefficient of x” in (1 +x +x? +29 4---)"
                                     . The answer is the coefficient of x7! in the generating function
                                                              (lta tax? toh +e. Jl tx4x74---4x"),

Section 9.2—p. 431
                                     1. a) $(1 + x)^8$    b) $8(1 + x)^7$    c) $(1 + x)^{-1}$
                                        d) $6x^3/(1 + x)$    e) $(1 - x^2)^{-1}$    f) $x^2/(1 - ax)$
                                     3. a) $g(x) = f(x) - a_3 x^3 + 3x^3 = f(x) + (3 - a_3)x^3$
                                        b) $g(x) = f(x) + (3 - a_3)x^3 + (7 - a_7)x^7$
                                        c) $g(x) = 2f(x) + (1 - 2a_1)x + (3 - 2a_3)x^3$
                                        d) $g(x) = 2f(x) + [5/(1 - x)] + (1 - 2a_1 - 5)x + (3 - 2a_3 - 5)x^3 + (7 - 2a_7 - 5)x^7$
                            -a (G) bd G*)            7 (19) — 5G) +)
                            -a) 0 bY (73)     - 55)    © (18) +414) + 6(73) +403) + (i)
                         11. (5) — 4 (si) + 6(3)    13. [(is) — (17) G2) + EC) — (2) / (6)
                         15.     (1/8) [1 + (—1"] + 1/4) ("F') + 1/2)" *?)
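                         17. $(1 - x - x^2 - x^3 - x^4 - x^5 - x^6)^{-1} = [1 - (x + x^2 + \cdots + x^6)]^{-1}$
                             $= 1 + (x + x^2 + \cdots + x^6) + (x + x^2 + \cdots + x^6)^2 + (x + x^2 + \cdots + x^6)^3 + \cdots$,
                             where the successive powers account for one roll, two rolls, three rolls, and so on, and
                                where the 1 takes care of the case where the die is not rolled.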
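
The expansion in Exercise 17 says that the coefficient of $x^n$ counts the sequences of die rolls whose faces sum to n. The sketch below compares the series coefficients, obtained from the recurrence $c_n = c_{n-1} + \cdots + c_{n-6}$ implied by the denominator, with a brute-force count; the function names are hypothetical.

```python
from itertools import product

def series_coefficients(n_max):
    """Coefficients of 1 / (1 - x - x^2 - ... - x^6) up to x^n_max."""
    c = [1]
    for n in range(1, n_max + 1):
        c.append(sum(c[n - j] for j in range(1, 7) if n - j >= 0))
    return c

def count_roll_sequences(n):
    """Count ordered sequences of faces 1..6 (any number of rolls) summing to n."""
    total = 0
    for rolls in range(n + 1):
        total += sum(1 for seq in product(range(1, 7), repeat=rolls)
                     if sum(seq) == n)
    return total

coeffs = series_coefficients(7)
print(coeffs)
print([count_roll_sequences(n) for n in range(7)])
```

The brute-force count is exponential in n, so it is only meant for the first few coefficients.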
                               a) 24/27      =   1/8     b)         Qla/2l 7Qn-l       =    2!-[n/2)
                         19,

21. qlin—22)/2]  23. Qin/2)-1e Z(n/2)-1

                         25. a) $\Pr(Y = y) = (5/6)^{y-1}(1/6)$, $y = 1, 2, 3, \ldots$
                                b) $E(Y) = 6$    c) $\sigma_Y = \sqrt{30} \approx 5.477226$
                         27. 3/5
                         29. a) The differences are 2, 3, 2, 7, and 0, and these sum to 14.
                             b) $\{3, 5, 8, 15\}$    c) $\{1 + a,\ 1 + a + b,\ 1 + a + b + c,\ 1 + a + b + c + d\}$
                         31. $c_k = \sum_{i=0}^{k} i(k - i)^2 = k^2 \sum_{i=0}^{k} i - 2k \sum_{i=0}^{k} i^2 + \sum_{i=0}^{k} i^3$
                                 $= (k^2)[k(k + 1)/2] - 2k[k(k + 1)(2k + 1)/6] + [k^2(k + 1)^2/4]$
                                 $= (1/12)(k^2)(k^2 - 1)$
                         33. a) $(1 + x + x^2 + x^3 + x^4)(0 + x + 2x^2 + 3x^3 + \cdots) = \sum_{i=0}^{\infty} c_i x^i$ where $c_0 = 0$, $c_1 = 1$,
                                $c_2 = 1 + 2 = 3$, $c_3 = 1 + 2 + 3 = 6$, $c_4 = 1 + 2 + 3 + 4 = 10$, and
                                $c_n = n + (n - 1) + (n - 2) + (n - 3) + (n - 4) = 5n - 10$ for all $n \ge 5$.
                             b) $(1 - x + x^2 - x^3 + \cdots)(1 - x + x^2 - x^3 + \cdots) = 1/(1 + x)^2 = (1 + x)^{-2}$, the
                                generating function for the sequence $\binom{-2}{0}, \binom{-2}{1}, \binom{-2}{2}, \binom{-2}{3}, \ldots$. Hence the convolution of the
                                given pair of sequences is $c_0, c_1, c_2, \ldots$, where $c_n = \binom{-2}{n} = (-1)^n \binom{2 + n - 1}{n} = (-1)^n \binom{n + 1}{n} =$
                                $(-1)^n (n + 1)$, $n \in \mathbf{N}$. [This is the alternating sequence $1, -2, 3, -4, 5, -6, 7, \ldots$.]
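
Both convolutions in Exercise 33 can be computed term by term from the definition $c_k = \sum_{i=0}^{k} a_i b_{k-i}$. The following is a small sketch on truncated sequences; the truncation length N = 10 and the names are arbitrary.

```python
def convolve(a, b):
    """First min(len(a), len(b)) terms of the convolution of sequences a and b."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

N = 10

# Part (a): 1, 1, 1, 1, 1, 0, 0, ... convolved with 0, 1, 2, 3, ...
ones_then_zeros = [1] * 5 + [0] * (N - 5)
naturals = list(range(N))
part_a = convolve(ones_then_zeros, naturals)

# Part (b): 1, -1, 1, -1, ... convolved with itself.
alternating = [(-1) ** n for n in range(N)]
part_b = convolve(alternating, alternating)

print(part_a)
print(part_b)
```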

Section 9.3—p. 435
                               1. 7; 6 + 1; 5 + 2; 5 + 1 + 1; 4 + 3; 4 + 2 + 1; 4 + 1 + 1 + 1; 3 + 3 + 1; 3 + 2 + 2;
                                  3 + 2 + 1 + 1; 3 + 1 + 1 + 1 + 1; 2 + 2 + 2 + 1; 2 + 2 + 1 + 1 + 1; 2 + 1 + 1 + 1 + 1 + 1;
                                  1 + 1 + 1 + 1 + 1 + 1 + 1
                               3. The number of partitions of 6 into 1's, 2's, and 3's is 7.
                               - a) and b)

>         1
                                        (ltx?taxtgah
                                           st \ltxt prt 4e \(L¢ xox 4-0)... = ]] =
                                                                                                                                                  1=]   —       xe

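                               7. Let $f(x)$ be the generating function for the number of partitions of $n \in \mathbf{Z}^+$ where no summand
                                  appears more than twice. Then

                                  $f(x) = \prod_{i=1}^{\infty} (1 + x^i + x^{2i})$.

                                  Let $g(x)$ be the generating function for the number of partitions of n where no summand is
                                  divisible by 3. Here

                                  $g(x) = \frac{1}{1 - x} \cdot \frac{1}{1 - x^2} \cdot \frac{1}{1 - x^4} \cdot \frac{1}{1 - x^5} \cdot \frac{1}{1 - x^7} \cdots$.

                                  But

                                  $f(x) = (1 + x + x^2)(1 + x^2 + x^4)(1 + x^3 + x^6)(1 + x^4 + x^8) \cdots$
                                  $= \frac{1 - x^3}{1 - x} \cdot \frac{1 - x^6}{1 - x^2} \cdot \frac{1 - x^9}{1 - x^3} \cdot \frac{1 - x^{12}}{1 - x^4} \cdots$
                                  $= \frac{1}{1 - x} \cdot \frac{1}{1 - x^2} \cdot \frac{1}{1 - x^4} \cdot \frac{1}{1 - x^5} \cdots = g(x)$.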
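
The equality of the two generating functions in Exercise 7 means the two kinds of partitions are equinumerous for every n. A minimal numerical check follows, assuming a simple dynamic-programming counter; the helper `count_partitions` and its parameters are hypothetical names introduced here.

```python
def count_partitions(n, allowed_part, max_repeat=None):
    """Count partitions of n using parts p with allowed_part(p) true,
    each part used at most max_repeat times (unlimited when None)."""
    ways = [1] + [0] * n              # ways[t]: partitions of t using parts seen so far
    for part in range(1, n + 1):
        if not allowed_part(part):
            continue
        new = [0] * (n + 1)
        for total in range(n + 1):
            copies = 0
            while copies * part <= total and (max_repeat is None or copies <= max_repeat):
                new[total] += ways[total - copies * part]
                copies += 1
        ways = new
    return ways[n]

def at_most_twice(n):
    """Partitions of n in which no summand appears more than twice."""
    return count_partitions(n, lambda p: True, max_repeat=2)

def no_multiple_of_three(n):
    """Partitions of n in which no summand is divisible by 3."""
    return count_partitions(n, lambda p: p % 3 != 0)

print([(at_most_twice(n), no_multiple_of_three(n)) for n in range(1, 11)])
```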

9. This result follows from the one-to-one correspondence between the Ferrers graphs with
                          summands (rows) not exceeding m and the transpose graphs (also Ferrers graphs) that have m
                          summands (rows).

Section 9.4—p. 439
                       1. a) $e^{-x}$    b) $e^{2x}$    c) $e^{-ax}$    d) $e^{a^2 x}$    e) $a e^{a^2 x}$    f) $x e^{2x}$
                       3. a) $g(x) = f(x) + [(3 - a_3)/3!]x^3$
                          b) $g(x) = f(x) + [(-1 - a_3)/3!]x^3 = e^{5x} - [126x^3/3!]$
                          c) $g(x) = 2f(x) + [2 - 2a_1]x + [(4 - 2a_2)/2!]x^2$
                                      :                   ,             3        _         nx             1)             ie                  Xe

25         “3         te                      10\4
                       7. The answer is the coefficient of aI in                                aI + a peered To!

9. a) (1/2) [3° +1] / (3%) — by) (1/4) [37°43] / (3)                                                   ee) (1/2) [3° - 1] /(3°)
                            d) (1/2) [3-1]
                                        / (3°) — e) (1/2) [3 +1]
                                                               / (3°)

Section 9.5—p. 442
                       1. a) $(1 + x + x^2)/(1 - x)$    b) $(1 + x + x^2 + x^3)/(1 - x)$    c) $(1 + 2x)/(1 - x)^2$
                       5. $a_0,\ a_1 - a_0,\ a_2 - a_1,\ a_3 - a_2,\ \ldots$       7. $f(x) = e^x/(1 - x)$

Supplementary
Exercises—p. 445       1. a) $6/(1 - x) + 1/(1 - x)^2$    b) $1/(1 - ax)$    c) $1/[1 - (1 + a)x]$
                          d) $1/(1 - x) + 1/(1 - ax)$
                     3.5. [(3)  —O@)+OF
                            Let $f(x)$ be the generating function for the number of partitions of $n \in \mathbf{Z}^+$ in which no even
                            summand is repeated (an odd summand may or may not be repeated). Then

                            $f(x) = (1 + x + x^2 + x^3 + \cdots)(1 + x^2)(1 + x^3 + x^6 + \cdots)(1 + x^4)(1 + x^5 + x^{10} + \cdots)(1 + x^6) \cdots$
                            $= \frac{1}{1 - x} (1 + x^2) \frac{1}{1 - x^3} (1 + x^4) \frac{1}{1 - x^5} (1 + x^6) \cdots$.

                            Let $g(x)$ be the generating function for the number of partitions of $n \in \mathbf{Z}^+$ where no summand
                            occurs more than three times. Then

                            $g(x) = (1 + x + x^2 + x^3)(1 + x^2 + x^4 + x^6)(1 + x^3 + x^6 + x^9) \cdots$
                            $= [(1 + x)(1 + x^2)][(1 + x^2)(1 + x^4)][(1 + x^3)(1 + x^6)] \cdots$
                            $= [(1 - x^2)/(1 - x)](1 + x^2)[(1 - x^4)/(1 - x^2)](1 + x^4)[(1 - x^6)/(1 - x^3)](1 + x^6) \cdots$
                            $= (1/(1 - x))(1 + x^2)(1/(1 - x^3))(1 + x^4)(1/(1 - x^5))(1 + x^6) \cdots = f(x)$.

7.   a) 1,5, 5)(7), (M9). HMOAA,...                bi a=4,b=—-3
                       9.   n (2-1)       li. a) (2) ~~ —b) (°)° /(2)
                     13.    a) $[a + (d - a)x]/(1 - x)^2$            b) $na + (1/2)(n)(n - 1)d$
                     1S.    a) x" f(x)     b) [ f(x) — (ao Fax tax? +++ + ay—px"!)] /x"                                                                 17. (1—p)"™

Chapter 10
                      Recurrence Relations

Section 10.1—p. 455
                                  1. a) $a_n = 5a_{n-1}$, $n \ge 1$, $a_0 = 2$        b) $a_n = -3a_{n-1}$, $n \ge 1$, $a_0 = 6$
                                     c) $a_n = (2/5)a_{n-1}$, $n \ge 1$, $a_0 = 7$
                                  3. $d = \pm(3/7)$       5. 141 months      7. a) 145    b) 45
                                  9. a) 21345     b) 52143, 52134     c) 21534, 21354, 21345

Section 10.2—p. 468
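                                  1. a) $a_n = (3/7)(-1)^n + (4/7)(6)^n$, $n \ge 0$     b) $a_n = 4(1/2)^n - 2(5)^n$, $n \ge 0$
                                     c) $a_n = 3\sin(n\pi/2)$, $n \ge 0$         d) $a_n = (5 - n)3^n$, $n \ge 0$
                                     e) $a_n = (\sqrt{2})^n[\cos(3\pi n/4) + 4\sin(3\pi n/4)]$, $n \ge 0$
                                  3. $a_n = (1/10)[7^n - (-3)^n]$, $n \ge 0$
                                  5. a) $a_n = 2a_{n-1} + a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 2$
                                        $a_n = (1/(2\sqrt{2}))[(1 + \sqrt{2})^{n+1} - (1 - \sqrt{2})^{n+1}]$, $n \ge 0$
                                     b) $a_n = a_{n-1} + 3a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 1$
                                        $a_n = (1/\sqrt{13})[((1 + \sqrt{13})/2)^{n+1} - ((1 - \sqrt{13})/2)^{n+1}]$, $n \ge 0$
                                     c) $a_n = 2a_{n-1} + 3a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 2$
                                        $a_n = (3/4)(3^n) + (1/4)(-1)^n$, $n \ge 0$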
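
The three closed forms in Exercise 5 can be compared against their recurrences numerically. The sketch below uses floating-point arithmetic and rounding, so it is a sanity check for small n rather than an exact verification; all function names are hypothetical.

```python
from math import sqrt

def by_recurrence(c1, c2, a0, a1, n_max):
    """Terms a_0..a_n_max of a_n = c1*a_{n-1} + c2*a_{n-2}."""
    a = [a0, a1]
    for n in range(2, n_max + 1):
        a.append(c1 * a[n - 1] + c2 * a[n - 2])
    return a

def closed_a(n):
    r = sqrt(2)
    return ((1 + r) ** (n + 1) - (1 - r) ** (n + 1)) / (2 * r)

def closed_b(n):
    r = sqrt(13)
    return (((1 + r) / 2) ** (n + 1) - ((1 - r) / 2) ** (n + 1)) / r

def closed_c(n):
    return (3 * 3 ** n + (-1) ** n) / 4

N = 20
seq_a = by_recurrence(2, 1, 1, 2, N)   # part (a)
seq_b = by_recurrence(1, 3, 1, 1, N)   # part (b)
seq_c = by_recurrence(2, 3, 1, 2, N)   # part (c)

print(all(round(closed_a(n)) == seq_a[n] for n in range(N + 1)))
print(all(round(closed_b(n)) == seq_b[n] for n in range(N + 1)))
print(all(round(closed_c(n)) == seq_c[n] for n in range(N + 1)))
```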
                                 7. a)
                                       $F_1 = F_2 - F_0$
                                       $F_3 = F_4 - F_2$
                                       $F_5 = F_6 - F_4$
                                       $\vdots$
                                       $F_{2n-1} = F_{2n} - F_{2n-2}$

                                Conjecture: For all $n \in \mathbf{Z}^+$, $F_1 + F_3 + F_5 + \cdots + F_{2n-1} = F_{2n} - F_0 = F_{2n}$.
                                Proof (By Mathematical Induction): For $n = 1$ we have $F_1 = F_2$, and this is true since $F_1 = 1 = F_2$.
                                Consequently, the result is true in this first case (and this establishes the basis step for the proof).
                                Next we assume the result true for $n = k$ ($\ge 1$); that is, we assume

                                       $F_1 + F_3 + F_5 + \cdots + F_{2k-1} = F_{2k}$.

                                When $n = k + 1$, we then find that

                                       $F_1 + F_3 + F_5 + \cdots + F_{2k-1} + F_{2k+1}$
                                       $= (F_1 + F_3 + F_5 + \cdots + F_{2k-1}) + F_{2k+1} = F_{2k} + F_{2k+1} = F_{2k+2} = F_{2(k+1)}$.

                                Therefore, the truth for $n = k$ implies the truth at $n = k + 1$, so by the Principle of Mathematical
                                Induction it follows that for all $n \in \mathbf{Z}^+$

                                       $F_1 + F_3 + F_5 + \cdots + F_{2n-1} = F_{2n}$.
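
The identity just proved can also be checked numerically for the first several n. A minimal sketch, with $F_0 = 0$ and $F_1 = 1$ as in the solution; the function names and the bound 30 are illustrative.

```python
def fibonacci(n_max):
    """Return [F_0, F_1, ..., F_n_max] with F_0 = 0 and F_1 = 1."""
    f = [0, 1]
    for _ in range(2, n_max + 1):
        f.append(f[-1] + f[-2])
    return f

F = fibonacci(60)

def odd_index_sum(n):
    """F_1 + F_3 + ... + F_{2n-1}."""
    return sum(F[2 * i - 1] for i in range(1, n + 1))

print(all(odd_index_sum(n) == F[2 * n] for n in range(1, 31)))
```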

                                 9. $a_n = (1/\sqrt{5})[((1 + \sqrt{5})/2)^{n+1} - ((1 - \sqrt{5})/2)^{n+1}]$, $n \ge 0$
                                11. a) $a_n = a_{n-1} + a_{n-2}$, $n \ge 3$, $a_1 = 2$, $a_2 = 3$;  $a_n = F_{n+2}$, $n \ge 1$.
                                    b) $b_n = b_{n-1} + b_{n-2}$, $n \ge 3$, $b_1 = 1$, $b_2 = 3$;  $b_n = L_n$, $n \ge 1$.
                                13. $a_n = [(8 + 9\sqrt{2})/16][2 + 4\sqrt{2}]^n + [(8 - 9\sqrt{2})/16][2 - 4\sqrt{2}]^n$, $n \ge 0$
                                15. $a_n = 2^{F_n}$, where $F_n$ is the nth Fibonacci number for $n \ge 0$
                                17.   a)   Far             b)    @) F,        Git) Fy-1       Gili) Fyn                       c)     2+2:0,            24+3:1
                                    d) These results provide a combinatorial proof that F,42 = (F, + Rip te                                                           - ++   A) +1.
                                19. (a, a), (B, B)

                       21. a) Proof (By the alternative form of the Principle of Mathematical Induction):

                              $F_3 = 2 = (1 + \sqrt{9})/2 > (1 + \sqrt{5})/2 = \alpha = \alpha^{3-2}$,
                              $F_4 = 3 = (3 + \sqrt{9})/2 > (3 + \sqrt{5})/2 = \alpha^2 = \alpha^{4-2}$,

                              so the result is true for these first two cases (where $n = 3, 4$). This establishes the basis step.
                              Assuming the truth of the statement for $n = 3, 4, 5, \ldots, k$ ($\ge 4$), where k is a fixed (but
                              arbitrary) integer, we continue now with $n = k + 1$:

                              $F_{k+1} = F_k + F_{k-1}$
                              $> \alpha^{k-2} + \alpha^{(k-1)-2}$
                              $= \alpha^{k-2} + \alpha^{k-3} = \alpha^{k-3}(\alpha + 1)$
                              $= \alpha^{k-3} \cdot \alpha^2 = \alpha^{k-1} = \alpha^{(k+1)-2}$.

                              Consequently, $F_n > \alpha^{n-2}$ for all $n \ge 3$, by the alternative form of the Principle of
                              Mathematical Induction.
23. $a_n = 2a_{n-1} + a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 3$:
    $a_n = (1/2)[(1 + \sqrt{2})^{n+1} + (1 - \sqrt{2})^{n+1}]$, $n \ge 0$
25. $(7/10)(7^{10}) + (3/10)(-3)^{10} = 197{,}750{,}389$
27. $a_n = a_{n-1} + a_{n-2} + 2a_{n-3}$, $n \ge 4$, $a_1 = 1$, $a_2 = 2$, $a_3 = 5$:
    $a_n = (4/7)(2)^n + (3/7)\cos(2n\pi/3) + (\sqrt{3}/21)\sin(2n\pi/3)$, $n \ge 1$
29. $x_n = 4(2^n) - 3$, $n \ge 0$
31. $a_n = \sqrt{51(4^n) - 35}$, $n \ge 0$
33. Since $\gcd(F_1, F_0) = 1 = \gcd(F_2, F_1)$, consider $n \ge 2$. Then

$F_3 = F_2 + F_1 \ (= 1)$
$F_4 = F_3 + F_2$
$F_5 = F_4 + F_3$
$\vdots$
$F_{n+1} = F_n + F_{n-1}.$

Reversing the order of these equations, we have the steps in the Euclidean algorithm for computing the gcd of $F_{n+1}$ and $F_n$, $n \ge 2$. Since the last nonzero remainder is $F_1 = 1$, it follows that $\gcd(F_{n+1}, F_n) = 1$ for all $n \ge 2$.
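
The argument above can be checked numerically. The following is a minimal sketch, assuming the convention $F_0 = 0$, $F_1 = 1$; the helper names `fib` and `euclid_steps` are illustrative and do not come from the text. It runs the Euclidean algorithm on consecutive Fibonacci numbers and records the remainders, which are themselves Fibonacci numbers in decreasing order.

```python
def fib(n):
    """Return F_n, assuming F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def euclid_steps(a, b):
    """Run the Euclidean algorithm on (a, b).

    Returns the gcd together with the list of remainders produced,
    ending with the final remainder 0.
    """
    remainders = []
    while b:
        a, b = b, a % b
        remainders.append(b)
    return a, remainders


# Consecutive Fibonacci numbers are relatively prime.
for n in range(2, 20):
    g, _ = euclid_steps(fib(n + 1), fib(n))
    assert g == 1

print(euclid_steps(fib(8), fib(7)))
```

For $F_8 = 21$ and $F_7 = 13$ the remainders are 8, 5, 3, 2, 1, 0, so the last nonzero remainder is 1.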

Section 10.3—p. 481
1. a) $a_n = (n+1)^2$, $n \ge 0$      b) $a_n = 3 + n(n-1)^2$, $n \ge 0$
   c) $a_n = 6(2^n) - 5$, $n \ge 0$      d) $a_n = 2^n + n(2^{n-1})$, $n \ge 0$
3. a) $a_n = a_{n-1} + n$, $n \ge 1$, $a_0 = 1$:   $a_n = 1 + [n(n+1)]/2$, $n \ge 0$
   b) $b_n = b_{n-1} + 2$, $n \ge 2$, $b_1 = 2$:   $b_n = 2n$, $n \ge 1$, $b_0 = 1$
5. a) $a_n = (3/4)(-1)^n - (4/5)(-2)^n + (1/20)(3)^n$, $n \ge 0$
   b) $a_n = (2/9)(-2)^n - (5/6)(n)(-2)^n + (7/9)$, $n \ge 0$
7. $a_n = A + Bn + Cn^2 - (3/4)n^3 + (5/24)n^4$
9. P = \$117.68
11. a) $a_n = [(3/4)(3)^n - 5(2)^n + (7n/2) + (21/4)]^{1/2}$, $n \ge 0$      b) $a_n = 2$, $n \ge 0$
13. a) $t_n = 2t_{n-1} + 2^{n-1}$, $n \ge 2$, $t_1 = 2$:
       $t_n = (n+1)(2^{n-1})$, $n \ge 1$
    b) $t_n = 4t_{n-1} + 3(4^{n-1})$, $n \ge 2$, $t_1 = 4$:
       $t_n = (1 + 3n)4^{n-1}$, $n \ge 1$
                              ec) ty =(l4+—Dalr’i              n> lr =|)         21.

Section 10.4—-p. 487
1. a) $a_n = (1/2)[1 + 3^n]$, $n \ge 0$      b) $a_n = 1 + [n(n-1)(2n-1)]/6$, $n \ge 0$
   c) $a_n = 5(2^n) - 4$, $n \ge 0$      d) $a_n = 2^n$, $n \ge 0$

3. a) $a_n = 2^n(1 - 2n)$, $b_n = n(2^{n+1})$, $n \ge 0$
   b) $a_n = (-3/4) + (1/2)(n+1) + (1/4)(3^n)$,
      $b_n = (3/4) + (1/2)(n+1) - (1/4)(3^n)$, $n \ge 0$
Section 10.5—p. 493
$= (8!)/[(5!)(4!)] = 14$

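3. $\binom{2n-1}{n} - \binom{2n-1}{n-2} = \left[\frac{(2n-1)!}{n!\,(n-1)!}\right] - \left[\frac{(2n-1)!}{(n-2)!\,(n+1)!}\right]$
   $= \frac{(2n-1)!\,(n+1)}{(n+1)!\,(n-1)!} - \frac{(2n-1)!\,(n-1)}{(n-1)!\,(n+1)!}$
   $= \frac{(2n-1)!}{(n+1)!\,(n-1)!}\,[(n+1) - (n-1)]$
   $= \frac{(2n-1)!\,(2)}{(n+1)!\,(n-1)!} = \frac{(2n-1)!\,(2n)}{(n+1)!\,n!} = \frac{(2n)!}{(n+1)(n!)(n!)}$
   $= \frac{1}{n+1}\binom{2n}{n}$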
                         5. a) (1/9)(18)                   by LU/MEQ)P -o) LU/6)(8)ILA/9G)]_—                                      dd) /6)('9)
7. a) [Figure not recoverable from the source; one surviving label reads (b(cd)).]

   b) (iii) (((ab)c)d)e          (iv) (ab)(c(de))
9. $a_n = a_0 a_{n-1} + a_1 a_{n-2} + a_2 a_{n-3} + \cdots + a_{n-2} a_1 + a_{n-1} a_0$
   Since $a_0 = 1$, $a_1 = 1$, $a_2 = 2$, and $a_3 = 5$, we find that $a_n$ = the nth Catalan number.
11. a)  $x$    $f_1(x)$   $f_2(x)$   $f_3(x)$   $f_4(x)$   $f_5(x)$
        1        1          3          2          2          1
        2        2          3          2          3          3
        3        3          3          3          3          3
                              b) The functions in part (a) correspond with the following paths from (0, 0) to (3, 3).

c) The mountain ranges in Fig. 10.24 of the text.
    d) For $n \in \mathbf{Z}^+$, the number of monotone increasing functions $f: \{1, 2, 3, \ldots, n\} \to \{1, 2, 3, \ldots, n\}$, where $f(i) \ge i$ for all $1 \le i \le n$, is $b_n = (1/(n+1))\binom{2n}{n}$, the nth Catalan number. This follows from Exercise 3 in Section 1.5. There is a one-to-one correspondence between the paths described in that exercise and the functions being dealt with here.
13. $(1/(n+1))\binom{2n}{n}$, the nth Catalan number
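
The count in Exercise 11(d) can be confirmed by brute force for small n. This is a sketch only: it reads "monotone increasing" as nondecreasing (as the five functions in part (a) suggest), and the names `count_functions` and `catalan` are hypothetical helpers, not notation from the text.

```python
from itertools import product
from math import comb


def count_functions(n):
    """Count f: {1..n} -> {1..n} that are nondecreasing with f(i) >= i."""
    count = 0
    for f in product(range(1, n + 1), repeat=n):
        # f[i] holds the value f(i + 1) because tuples are 0-indexed.
        monotone = all(f[i] <= f[i + 1] for i in range(n - 1))
        on_or_above = all(f[i] >= i + 1 for i in range(n))
        if monotone and on_or_above:
            count += 1
    return count


def catalan(n):
    """nth Catalan number, (1/(n+1)) * C(2n, n)."""
    return comb(2 * n, n) // (n + 1)


for n in range(1, 6):
    print(n, count_functions(n), catalan(n))
```

For n = 3 the enumeration finds exactly the five functions tabulated in part (a).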

15. a) $E_3 = 2$      b) $E_5 = 16$
    c) For each rise/fall permutation, n cannot be in the first position (unless n = 1); n is the second component of a rise in such a permutation. Consequently, n must be at position 2 or 4 ... or $2\lfloor n/2 \rfloor$.
    d) Consider the location of n in a rise/fall permutation $x_1 x_2 x_3 \cdots x_{n-1} x_n$ of 1, 2, 3, ..., n. The number n is in position 2i for some $1 \le i \le \lfloor n/2 \rfloor$. Here there are $2i - 1$ numbers that precede n. These can be selected in $\binom{n-1}{2i-1}$ ways and give rise to $E_{2i-1}$ rise/fall permutations. The $(n - 1) - (2i - 1) = n - 2i$ numbers that follow n give rise to $E_{n-2i}$ rise/fall permutations. Consequently, $E_n = \sum_{i=1}^{\lfloor n/2 \rfloor} \binom{n-1}{2i-1} E_{2i-1} E_{n-2i}$, $n \ge 2$.
    g) From parts (d) and (f)

$E_n = \binom{n-1}{1} E_1 E_{n-2} + \binom{n-1}{3} E_3 E_{n-4} + \cdots + \binom{n-1}{2\lfloor n/2 \rfloor - 1} E_{2\lfloor n/2 \rfloor - 1} E_{n - 2\lfloor n/2 \rfloor}$

$E_n = \binom{n-1}{0} E_0 E_{n-1} + \binom{n-1}{2} E_2 E_{n-3} + \cdots + \binom{n-1}{2\lfloor (n-1)/2 \rfloor} E_{2\lfloor (n-1)/2 \rfloor} E_{n - 2\lfloor (n-1)/2 \rfloor - 1}$

Adding these equations we have

$2E_n = \sum_{i=0}^{n-1} \binom{n-1}{i} E_i E_{n-i-1}$   or   $E_n = (1/2) \sum_{i=0}^{n-1} \binom{n-1}{i} E_i E_{n-i-1}.$

    h) $E_6 = 61$, $E_7 = 272$
    i) Consider the Maclaurin series expansions $\sec x = 1 + x^2/2! + 5x^4/4! + 61x^6/6! + \cdots$ and $\tan x = x + 2x^3/3! + 16x^5/5! + 272x^7/7! + \cdots$. One finds that $\sec x + \tan x$ is the exponential generating function of the sequence 1, 1, 1, 2, 5, 16, 61, 272, ..., namely, the sequence of Euler numbers.
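
The recurrence from part (g) is easy to iterate. The sketch below assumes the initial values $E_0 = E_1 = 1$, taken from the sequence listed in part (i), and uses a hypothetical helper name `euler_numbers`; it reproduces the values quoted in parts (a), (b), and (h).

```python
from math import comb


def euler_numbers(limit):
    """Return [E_0, ..., E_limit] using
    E_n = (1/2) * sum_{i=0}^{n-1} C(n-1, i) * E_i * E_{n-i-1} for n >= 2,
    with E_0 = E_1 = 1 assumed as starting values.
    """
    E = [1, 1]
    for n in range(2, limit + 1):
        total = sum(comb(n - 1, i) * E[i] * E[n - 1 - i] for i in range(n))
        E.append(total // 2)  # the sum is always even for n >= 2
    return E[: limit + 1]


print(euler_numbers(7))
```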

Section 10.6—p. 504
1. a) $f(n) = (5/3)(4n^{\log_3 4} - 1)$ and $f \in O(n^{\log_3 4})$ for $n \in \{3^i \mid i \in \mathbf{N}\}$
   b) $f(n) = 7(\log_5 n + 1)$ and $f \in O(\log_5 n)$ for $n \in \{5^i \mid i \in \mathbf{N}\}$
3. a) $f \in O(\log_b n)$ on $\{b^k \mid k \in \mathbf{N}\}$      b) $f \in O(n^{\log_b a})$ on $\{b^k \mid k \in \mathbf{N}\}$
5. a) $f(1) = 0$, $f(n) = 2f(n/2) + 1$
   From Exercise 2(b), $f(n) = n - 1$.
   b) The equation $f(n) = f(n/2) + (n/2)$ arises as follows: There are n/2 matches played in the first round. Then there are n/2 players remaining, so we need $f(n/2)$ additional matches to determine the winner.
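
Both recurrences in Exercise 5 can be evaluated directly when n is a power of 2. In this minimal sketch the function names `matches_split` and `matches_rounds` are illustrative only; each should return n - 1.

```python
def matches_split(n):
    """f(1) = 0, f(n) = 2 f(n/2) + 1, for n a power of 2 (part a)."""
    return 0 if n == 1 else 2 * matches_split(n // 2) + 1


def matches_rounds(n):
    """f(1) = 0, f(n) = f(n/2) + n/2, for n a power of 2 (part b)."""
    return 0 if n == 1 else matches_rounds(n // 2) + n // 2


for k in range(6):
    n = 2 ** k
    print(n, matches_split(n), matches_rounds(n))
```

The two descriptions count the same matches, so the two columns agree for every power of 2.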
7. O(1)

9. a)
$f(n) \le a f(n/b) + cn$
$a f(n/b) \le a^2 f(n/b^2) + ac(n/b)$
$a^2 f(n/b^2) \le a^3 f(n/b^3) + a^2 c(n/b^2)$
$\vdots$
$a^{k-1} f(n/b^{k-1}) \le a^k f(n/b^k) + a^{k-1} c(n/b^{k-1})$

Hence $f(n) \le a^k f(n/b^k) + cn[1 + (a/b) + (a/b)^2 + \cdots + (a/b)^{k-1}] = a^k f(1) + cn[1 + (a/b) + (a/b)^2 + \cdots + (a/b)^{k-1}]$, because $n = b^k$. Since $f(1) \le c$ and $(n/b^k) = 1$, we have $f(n) \le cn[1 + (a/b) + (a/b)^2 + \cdots + (a/b)^{k-1} + (a/b)^k] = (cn)\sum_{i=0}^{k} (a/b)^i$.

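c) For $a \ne b$,

$cn \sum_{i=0}^{k} (a/b)^i = cn\left[\frac{1 - (a/b)^{k+1}}{1 - (a/b)}\right] = c\,b^k\left[\frac{1 - (a/b)^{k+1}}{1 - (a/b)}\right]$
$= c\left[\frac{b^k - (a^{k+1}/b)}{1 - (a/b)}\right] = c\left[\frac{b^{k+1} - a^{k+1}}{b - a}\right] = c\left[\frac{a^{k+1} - b^{k+1}}{a - b}\right].$

d) From part (c), $f(n) \le (c/(a-b))[a^{k+1} - b^{k+1}] = (ca/(a-b))a^k - (cb/(a-b))b^k$. But $a^k = a^{\log_b n} = n^{\log_b a}$ and $b^k = n$, so $f(n) \le (ca/(a-b))n^{\log_b a} - (cb/(a-b))n$.
   (i) When a < b, then $\log_b a < 1$, and $f \in O(n)$ on $\mathbf{Z}^+$.
   (ii) When a > b, then $\log_b a > 1$, and $f \in O(n^{\log_b a})$ on $\mathbf{Z}^+$.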

Supplementary Exercises—p. 508
1. $\binom{n}{k+1} = \frac{n!}{(k+1)!\,(n-k-1)!} = \frac{(n-k)}{(k+1)} \cdot \frac{n!}{k!\,(n-k)!} = \left(\frac{n-k}{k+1}\right)\binom{n}{k}$
3. There are two cases to consider. Case 1 (1 is a summand): Here there are $p(n-1, k-1)$ ways to partition $n - 1$ into exactly $k - 1$ summands. Case 2 (1 is not a summand): Here each summand $s_1, s_2, \ldots, s_k > 1$. For $1 \le i \le k$, let $t_i = s_i - 1 \ge 1$. Then $t_1, t_2, \ldots, t_k$ provide a partition of $n - k$ into exactly k summands. These cases are exhaustive and disjoint, so by the rule of sum, $p(n, k) = p(n-1, k-1) + p(n-k, k)$.
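
The recurrence in Exercise 3 can be compared with a direct enumeration. The sketch below assumes the boundary values p(0, 0) = 1 and p(n, k) = 0 whenever exactly one argument is 0 or an argument is negative; these conventions and the helper `brute` are assumptions made for illustration, not taken from the text.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def p(n, k):
    """Partitions of n into exactly k positive summands, via the recurrence."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0:
        return 0
    return p(n - 1, k - 1) + p(n - k, k)


def brute(n, k, smallest=1):
    """Count nondecreasing k-tuples of summands >= smallest that add up to n."""
    if k == 0:
        return 1 if n == 0 else 0
    # The smallest remaining summand can be at most n // k.
    return sum(brute(n - s, k - 1, s) for s in range(smallest, n // k + 1))


print(p(7, 3), brute(7, 3))
```

For n = 7 and k = 3 both counts are 4, matching the partitions 5+1+1, 4+2+1, 3+3+1, and 3+2+2.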
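5. b) Conjecture: For $n \in \mathbf{Z}^+$, $A^n = \begin{bmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{bmatrix}$, where $F_n$ denotes the nth Fibonacci number.
   Proof: For n = 1, $A = A^1 = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} F_2 & F_1 \\ F_1 & F_0 \end{bmatrix}$, so the result is true in this case.

   Assume the result true for $n = k \ge 1$. That is, $A^k = \begin{bmatrix} F_{k+1} & F_k \\ F_k & F_{k-1} \end{bmatrix}$. For n = k + 1,

   $A^n = A^{k+1} = A^k \cdot A = \begin{bmatrix} F_{k+1} & F_k \\ F_k & F_{k-1} \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} F_{k+1} + F_k & F_{k+1} \\ F_k + F_{k-1} & F_k \end{bmatrix} = \begin{bmatrix} F_{k+2} & F_{k+1} \\ F_{k+1} & F_k \end{bmatrix}.$
   Consequently, the result is true for all $n \in \mathbf{Z}^+$, by the Principle of Mathematical Induction.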
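
The matrix identity can be spot-checked with plain integer arithmetic. This is a sketch under the assumption that A is the 2 x 2 matrix with rows (1, 1) and (1, 0), as in the basis step; `mat_mul`, `mat_pow`, and `fib` are illustrative helper names.

```python
def mat_mul(X, Y):
    """Multiply two 2 x 2 integer matrices given as nested lists."""
    return [[sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(A, n):
    """Compute A^n for n >= 1 by repeated multiplication."""
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result


def fib(n):
    """Return F_n, assuming F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


A = [[1, 1], [1, 0]]
for n in range(1, 15):
    assert mat_pow(A, n) == [[fib(n + 1), fib(n)], [fib(n), fib(n - 1)]]

print(mat_pow(A, 5))
```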
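7. $(-1, 0)$, $(\alpha, \alpha)$, $(\beta, \beta)$
9. a) Since $\alpha^2 = \alpha + 1$, it follows that $\alpha^2 + 1 = 2 + \alpha$ and $(2 + \alpha)^2 = 4 + 4\alpha + \alpha^2 = 4(1 + \alpha) + \alpha^2 = 5\alpha^2$.

   c) $\sum_{k=0}^{2n} \binom{2n}{k} F_{2k+m} = \sum_{k=0}^{2n} \binom{2n}{k}\left[\frac{\alpha^{2k+m} - \beta^{2k+m}}{\alpha - \beta}\right]$
      $= (1/(\alpha - \beta))\left[\sum_{k=0}^{2n} \binom{2n}{k} (\alpha^2)^k \alpha^m - \sum_{k=0}^{2n} \binom{2n}{k} (\beta^2)^k \beta^m\right]$
      $= (1/(\alpha - \beta))[\alpha^m (1 + \alpha^2)^{2n} - \beta^m (1 + \beta^2)^{2n}]$
      $= (1/(\alpha - \beta))[\alpha^m (2 + \alpha)^{2n} - \beta^m (2 + \beta)^{2n}]$
      $= (1/(\alpha - \beta))[\alpha^m ((2 + \alpha)^2)^n - \beta^m ((2 + \beta)^2)^n]$
      $= (1/(\alpha - \beta))[\alpha^m (5\alpha^2)^n - \beta^m (5\beta^2)^n]$
      $= 5^n (1/(\alpha - \beta))[\alpha^{2n+m} - \beta^{2n+m}] = 5^n F_{2n+m}$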
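
The identity derived in part (c) involves only integers, so it can be tested exactly. The following sketch assumes $F_0 = 0$, $F_1 = 1$ and nonnegative m; the names `fib`, `lhs`, and `rhs` are hypothetical.

```python
from math import comb


def fib(n):
    """Return F_n, assuming F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def lhs(n, m):
    """Sum over k = 0..2n of C(2n, k) * F_{2k+m}."""
    return sum(comb(2 * n, k) * fib(2 * k + m) for k in range(2 * n + 1))


def rhs(n, m):
    """5^n * F_{2n+m}."""
    return 5 ** n * fib(2 * n + m)


for n in range(5):
    for m in range(5):
        assert lhs(n, m) == rhs(n, m)

print(lhs(1, 1), rhs(1, 1))
```

For n = 1 and m = 1 both sides equal $F_1 + 2F_3 + F_5 = 1 + 4 + 5 = 10 = 5F_3$.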
11. $c_n = F_{n+2}$, the (n + 2)-nd Fibonacci number
                         13. a)           Fri             b)    (i)   l=        (1"-3°0)                 (ii)     (n79'1)                 (iil)       (." 979)             (iv)   (,"3°3)           (v)    faust

©) Frat =       io (".°) ~        as (me)
15. a) For each derangement, 1 is placed in position i, where $2 \le i \le n$. Two things then occur. Case 1 (i is in position 1): Here the other $n - 2$ integers are deranged in $d_{n-2}$ ways. With $n - 1$ choices for i, this results in $(n-1)d_{n-2}$ such derangements. Case 2 [i is not in position 1 (or position i)]: Here we consider 1 as the new natural position for i, so there are $n - 1$ elements to derange. With $n - 1$ choices for i, we have $(n-1)d_{n-1}$ derangements. Since the two cases are exhaustive and disjoint, the result follows from the rule of sum.
    b) $d_0 = 1$      c) $d_n - nd_{n-1} = d_{n-2} - (n-2)d_{n-3}$
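
The two cases in part (a) give $d_n = (n-1)(d_{n-1} + d_{n-2})$. The sketch below iterates that recurrence from the assumed starting values $d_0 = 1$, $d_1 = 0$ and compares it with a brute-force count; the function names are illustrative.

```python
from itertools import permutations


def derangements_recurrence(n):
    """d_n from d_n = (n - 1)(d_{n-1} + d_{n-2}), with d_0 = 1, d_1 = 0."""
    d = [1, 0]
    for m in range(2, n + 1):
        d.append((m - 1) * (d[m - 1] + d[m - 2]))
    return d[n]


def derangements_brute(n):
    """Count permutations of 0..n-1 with no fixed point."""
    return sum(all(perm[i] != i for i in range(n))
               for perm in permutations(range(n)))


for n in range(8):
    print(n, derangements_recurrence(n), derangements_brute(n))
```

The same values also satisfy the relation in part (c), since $d_n - nd_{n-1}$ alternates between 1 and -1.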
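17. a) $a_n = \binom{2n}{n}$, $n \ge 0$      b) $r = 1$, $s = -4$, $t = -1/2$
    d) $b_n = (1/(2n-1))\binom{2n}{n}$, $n \ge 1$; $b_0 = 0$
19. $c = \alpha$ or $c = \beta$      21. $p = -8$
23. $a_n = a_{n-1} + a_{n-2}$, $n \ge 3$, $a_1 = 1$, $a_2 = 2$: $a_n = F_{n+1}$, $n \ge 1$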
25. a) (n = 0)  $F_1^2 - F_0F_1 - F_0^2 = 1^2 - 0 \cdot 1 - 0^2 = 1$
       (n = 1)  $F_2^2 - F_1F_2 - F_1^2 = 1^2 - 1 \cdot 1 - 1^2 = -1$
       (n = 2)  $F_3^2 - F_2F_3 - F_2^2 = 2^2 - 1 \cdot 2 - 1^2 = 1$
       (n = 3)  $F_4^2 - F_3F_4 - F_3^2 = 3^2 - 2 \cdot 3 - 2^2 = -1$
    b) Conjecture: For $n \ge 0$,

       $F_{n+1}^2 - F_nF_{n+1} - F_n^2 = \begin{cases} 1, & n \text{ even} \\ -1, & n \text{ odd.} \end{cases}$

    c) Proof: The result is true for n = 0, 1, 2, 3, by the calculations in part (a). Assume the result true for n = k ($\ge 3$). There are two cases to consider, namely, k even and k odd. We shall establish the result for k even, the proof for k odd being similar. Our induction hypothesis tells us that $F_{k+1}^2 - F_kF_{k+1} - F_k^2 = 1$. When n = k + 1 ($\ge 4$) we find that
       $F_{k+2}^2 - F_{k+1}F_{k+2} - F_{k+1}^2 = (F_{k+1} + F_k)^2 - F_{k+1}(F_{k+1} + F_k) - F_{k+1}^2$
       $= F_{k+1}^2 + 2F_{k+1}F_k + F_k^2 - F_{k+1}^2 - F_{k+1}F_k - F_{k+1}^2$
       $= F_{k+1}F_k + F_k^2 - F_{k+1}^2 = -[F_{k+1}^2 - F_kF_{k+1} - F_k^2] = -1.$
    The result follows for all $n \in \mathbf{N}$, by the Principle of Mathematical Induction.
27. a) $r(C_1, x) = 1 + x$              $r(C_4, x) = 1 + 4x + 3x^2$
       $r(C_2, x) = 1 + 2x$             $r(C_5, x) = 1 + 5x + 6x^2 + x^3$
       $r(C_3, x) = 1 + 3x + x^2$       $r(C_6, x) = 1 + 6x + 10x^2 + 4x^3$
    In general, for $n \ge 3$, $r(C_n, x) = r(C_{n-1}, x) + x\,r(C_{n-2}, x)$.
    b) $r(C_1, 1) = 2$      $r(C_3, 1) = 5$      $r(C_5, 1) = 13$
       $r(C_2, 1) = 3$      $r(C_4, 1) = 8$      $r(C_6, 1) = 21$
    [Note: For $1 \le i \le n$, if one "straightens out" the chessboard $C_i$ in Fig. 10.28, the result is a $1 \times i$ chessboard, like those studied in Exercise 26.]
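
The recurrence for the rook polynomials in Exercise 27 can be iterated on coefficient lists. This is a minimal sketch: a polynomial is stored as a list of coefficients in increasing powers of x, and `rook_polynomials` is a hypothetical helper name.

```python
def rook_polynomials(count):
    """Return {n: coefficients of r(C_n, x)} for 1 <= n <= count, using
    r(C_n, x) = r(C_{n-1}, x) + x * r(C_{n-2}, x) for n >= 3.
    Coefficients are listed from the constant term upward.
    """
    polys = {1: [1, 1], 2: [1, 2]}
    for n in range(3, count + 1):
        prev = polys[n - 1]
        shifted = [0] + polys[n - 2]  # multiplication by x
        size = max(len(prev), len(shifted))
        polys[n] = [
            (prev[i] if i < len(prev) else 0)
            + (shifted[i] if i < len(shifted) else 0)
            for i in range(size)
        ]
    return polys


polys = rook_polynomials(6)
for n in range(1, 7):
    # The coefficient sum is the value r(C_n, 1) listed in part (b).
    print(n, polys[n], sum(polys[n]))
```

The coefficient sums 2, 3, 5, 8, 13, 21 are consecutive Fibonacci numbers, as part (b) shows.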
                              29. a) The partitions counted in f(n, m) fall into two categories:
                                       (1) Partitions where m is a summand. These are counted in f(n — m, m), for m may occur
                                           more than once.
                                       (2) Partitions where m is not a summand, so that m − 1 is the largest possible summand.
                                           These partitions are counted in f(n, m − 1).
                                       Since these two categories are exhaustive and mutually disjoint, it follows that f(n, m) =
                                       f(n—m,m)+ f(n,m — 1).

Chapter 11
              An Introduction to Graph Theory
Section 11.1—p. 518
                               1. a) To represent the air routes traveled among a certain set of cities by a particular airline.
                                  b) To represent an electrical network. Here the vertices can represent switches, transistors, and
                                  so on, and an edge (x, y) indicates the existence of a wire connecting x to y.
                                  c) Let the vertices represent a set of job applicants and a set of open positions in a corporation.
                                  Draw an edge (A, b) to denote that applicant A is qualified for position b. Then all open
                                  positions can be filled if the resulting graph provides a matching between a subset of the
                                  applicants and the open positions.
                               3. 6       5. 953

r44
   b) {(g, d), (d, e), (e, a)}; {(g, b), (b, c), (c, d), (d, e), (e, a)}
   c) Two: one of {(b, c), (c, d)} and one of {(b, f), (f, g), (g, d)}
   d) No
   e) Yes. Travel the path {(c, d), (d, e), (e, a), (a, b), (b, f), (f, g)}
   f) Yes. Travel the trail {(g, b), (b, f), (f, g), (g, d), (d, b), (b, c), (c, d), (d, e), (e, a), (a, b)}.
9. If {a, b} is not part of a cycle, then its removal disconnects a and b (and G). If not, there is a path P from a to b, and P together with {a, b} provides a cycle containing {a, b}. Conversely, if the removal of {a, b} from G disconnects G, then there exist $x, y \in V$ such that the only path P from x to y contains e = {a, b}. If e were part of a cycle C, then the edges in $(P - \{e\}) \cup (C - \{e\})$ would contain a second path connecting x to y.
11. a) Yes        b) No      c) $n - 1$
13. The partition of V induced by $\mathscr{R}$ yields the (connected) components of G.
15. The number of closed v − v walks of length $n \ge 1$ is $F_{n+1}$, the (n + 1)-st Fibonacci number.

Section 11.2—p. 528
1. a) 3    b) $G_1 = \langle U \rangle$, where U = {a, b, d, f, g, h, i, j}; $G_1 = G - \{c\}$
   c) $G_2 = \langle W \rangle$, where W = {b, c, d, f, g, i, j}; $G_2 = G - \{a, h\}$
   d), e) [Figures not recoverable from the source.]

3. a) 2?=512         b)3~    oe) 2°
5. G is (or is isomorphic to) $K_n$, where n = |V|.
7. (i)
         R           Y           W           B
       B 1 Y       R 2 B       Y 3 R       W 4 W
         W           B           Y           R

   (ii) No solution

   (iii)
         W           B           Y           R
       R 1 W       W 2 B       Y 3 R       B 4 Y
         Y           R           B           W
9. a) No    b) Yes. Correspond a with u, b with w, c with x, d with y, e with v, and f with z.
11. a) If $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ are isomorphic, then there is a function $f: V_1 \to V_2$ that is one-to-one and onto and preserves adjacencies. If $x, y \in V_1$ and $\{x, y\} \notin E_1$, then $\{f(x), f(y)\} \notin E_2$. Hence the same function f preserves adjacencies for $\overline{G}_1$, $\overline{G}_2$ and can be used to define an isomorphism for $\overline{G}_1$, $\overline{G}_2$. The converse follows in a similar way.
    b) They are not isomorphic. The complement of the graph containing vertex a is a cycle of length 8. The complement of the other graph is the disjoint union of two cycles of length 4.
13. If G is the cycle with edges {a, b}, {b, c}, {c, d}, {d, e}, and {e, a}, then $\overline{G}$ is the cycle with edges {a, c}, {c, e}, {e, b}, {b, d}, and {d, a}. Hence G and $\overline{G}$ are isomorphic. Conversely, if G is a cycle on n vertices and G, $\overline{G}$ are isomorphic, then $n = \frac{1}{2}\binom{n}{2}$, or $n = \frac{1}{4}(n)(n-1)$, and n = 5.
15. a) Here f must also maintain directions. So $(a, b) \in E_1$ if and only if $(f(a), f(b)) \in E_2$.
    b) They are not isomorphic. Consider vertex a in the first graph. It is incident to one vertex and incident from two other vertices. No vertex in the other graph has this property.
17. $n^2 - 3n + 3$

Section 11.3—p. 537
1. a) |V| = 6      b) |V| = 1 or 2 or 3 or 5 or 6 or 10 or 15 or 30
   (In the first four cases, G must be a multigraph; when |V| = 30, G is disconnected.)
   c) |V| = 6

a) $|V_1| = 8 = |V_2|$; $|E_1| = 14 = |E_2|$
b) For $V_1$ we find that deg(a) = 3, deg(b) = 4, deg(c) = 4, deg(d) = 3, deg(e) = 3, deg(f) = 4, deg(g) = 4, and deg(h) = 3. For $V_2$ we have deg(s) = 3, deg(t) = 4, deg(u) = 4, deg(v) = 3, deg(w) = 4, deg(x) = 3, deg(y) = 3, deg(z) = 4. Hence each of the two graphs has four vertices of degree 3 and four of degree 4.
c) Despite the results in parts (a) and (b), the graphs $G_1$ and $G_2$ are not isomorphic.
   In the graph $G_2$ the four vertices of degree 4 (namely, t, u, w, and z) are on a cycle of length 4. For the graph $G_1$ the vertices b, c, f, and g, each of degree 4, do not lie on a cycle of length 4.
   A second way to observe that $G_1$ and $G_2$ are not isomorphic is to consider once again the vertices of degree 4 in each graph. In $G_1$ these vertices induce a disconnected subgraph consisting of the two edges {b, c} and {f, g}. The four vertices of degree 4 in graph $G_2$ induce a connected subgraph that has five edges: every possible edge except {u, z}.
                       7a)    19     by) Or,       (4)   (Note: No assumption about connectedness is made here.)
9. a) 16    b) $2^{19} = 524{,}288$
11. The number of edges in $K_n$ is $\binom{n}{2} = n(n-1)/2$. If the edges of $K_n$ can be partitioned into such cycles of length 4, then 4 divides $\binom{n}{2}$ and $\binom{n}{2} = 4t$, for some $t \in \mathbf{Z}^+$. For each vertex v that appears in a cycle, there are two edges (of $K_n$) incident to v. Consequently, each vertex v of $K_n$ has even degree, so n is odd. Therefore, $n - 1$ is even and as $4t = \binom{n}{2} = n(n-1)/2$, it follows that $8t = n(n-1)$. So 8 divides $n(n-1)$, and since n is odd, it follows (from the Fundamental Theorem of Arithmetic) that 8 divides $n - 1$. Hence $n - 1 = 8k$, or $n = 8k + 1$, for some $k \in \mathbf{Z}^+$.
13. $\delta|V| \le \sum_{v \in V} \deg(v) \le \Delta|V|$. Since $2|E| = \sum_{v \in V} \deg(v)$, it follows that $\delta|V| \le 2|E| \le \Delta|V|$, so $\delta \le 2(e/n) \le \Delta$.
15. Start with a cycle $v_1 \to v_2 \to v_3 \to \cdots \to v_{2k-1} \to v_{2k} \to v_1$. Then draw the k edges $\{v_1, v_{k+1}\}$, $\{v_2, v_{k+2}\}$, ..., $\{v_i, v_{i+k}\}$, ..., $\{v_k, v_{2k}\}$. The resulting graph has 2k vertices each of degree 3.
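
The construction in Exercise 15 can be carried out explicitly. The sketch below labels the vertices 0, 1, ..., 2k - 1 instead of $v_1, \ldots, v_{2k}$ and assumes $k \ge 2$ so that the cycle and the added edges are all distinct; `cubic_graph` and `degrees` are illustrative names.

```python
def cubic_graph(k):
    """Build the graph on 2k vertices: a 2k-cycle plus the k edges {i, i + k}.

    Assumes k >= 2. Returns (number of vertices, set of edges), where each
    edge is a frozenset of its two endpoints.
    """
    n = 2 * k
    edges = set()
    for i in range(n):
        edges.add(frozenset((i, (i + 1) % n)))  # cycle edge
    for i in range(k):
        edges.add(frozenset((i, i + k)))        # edge joining opposite vertices
    return n, edges


def degrees(n, edges):
    """Return the list of vertex degrees."""
    deg = [0] * n
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


n, edges = cubic_graph(4)
print(len(edges), degrees(n, edges))
```

For k = 2 the construction yields $K_4$; in general there are 3k edges and every vertex has degree 3.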

17. (Corollary 11.1). Let $V = V_1 \cup V_2$, where $V_1$ ($V_2$) contains all vertices of odd (even) degree. Then $2|E| - \sum_{v \in V_2} \deg(v) = \sum_{v \in V_1} \deg(v)$ is an even integer. For $|V_1|$ odd, $\sum_{v \in V_1} \deg(v)$ is odd.
    (Corollary 11.2). For the converse let G = (V, E) have an Euler trail with a, b as the starting and terminating vertices. Add the edge {a, b} to G to form the larger graph $G_1 = (V, E_1)$ where $G_1$ has an Euler circuit. Hence $G_1$ is connected and each vertex in $G_1$ has even degree. When we remove edge {a, b} from $G_1$, the vertices in G will have the same even degree except for a, b; $\deg_G(a) = \deg_{G_1}(a) - 1$, $\deg_G(b) = \deg_{G_1}(b) - 1$, so the vertices a, b have odd degree in G. Also, since the edges in G form an Euler trail, G is connected.
19. a) Let $a, b, c, x, y \in V$ with deg(a) = deg(b) = deg(c) = 1, deg(x) = 5, and deg(y) = 7. Since deg(y) = 7, y is adjacent to all of the other (seven) vertices in V. Therefore vertex x is not adjacent to any of the vertices a, b, and c. Since x cannot be adjacent to itself, unless we have loops, it follows that $\deg(x) \le 4$, and we cannot draw a graph for the given conditions.

21. n odd; n = 2          23. Yes
25. a) (i) 13 (ii) 25 (iii) 41 (iv) $2n^2 - 2n + 1$
    b) (i) 12 (ii) 24 (iii) 40 (iv) $2n^2 - 2n$
27. In any directed graph (or multigraph), $\sum_{v \in V} od(v) = |E| = \sum_{v \in V} id(v)$, so $\sum_{v \in V} [od(v) - id(v)] = 0$. For each $v \in V$, $od(v) + id(v) = n - 1$, so

$0 = (n - 1) \cdot 0 = \sum_{v \in V} (n - 1)[od(v) - id(v)]$
$= \sum_{v \in V} [od(v) + id(v)][od(v) - id(v)]$
$= \sum_{v \in V} [(od(v))^2 - (id(v))^2],$

and the result follows.
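
The identity in Exercise 27 can be tested on randomly oriented complete graphs. This sketch assumes that every pair of distinct vertices is joined by exactly one directed edge, which is what the condition $od(v) + id(v) = n - 1$ suggests; `random_tournament` is a hypothetical helper.

```python
import random


def random_tournament(n, seed=0):
    """Orient each edge of K_n at random; return (out-degrees, in-degrees)."""
    rng = random.Random(seed)
    out_deg = [0] * n
    in_deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            # Choose the direction of the single edge between u and v.
            tail, head = (u, v) if rng.random() < 0.5 else (v, u)
            out_deg[tail] += 1
            in_deg[head] += 1
    return out_deg, in_deg


od, idg = random_tournament(7, seed=1)
print(sum(d * d for d in od), sum(d * d for d in idg))
```

Whatever orientation is chosen, the two sums of squares agree.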
                   29. a) and b)

31. Let |V| =n > 2. Since G is loop-free and connected, for all x € V we have | < deg(x) <
                       n — 1. Apply the pigeonhole principle with the     vertices as the pigeons and the n — | possible
                       degrees as the pigeonholes.
                   33. a) Yes      b) Yes     ec) No
                   35. No. Let each person represent a vertex for a graph. If v, w represent two of these people, draw
                       the edge {v, w} if the two shake hands. If the situation were possible, then we would have a
                                                                                                        Solutions       $-59

graph with 15 vertices, each of degree 3. So the sum of the degrees of the vertices would be 45,
                          an odd integer. This contradicts Theorem 11.2.
                      37. Assign the Gray code {00, 01, 11, 10} to the four horizontal levels: top: 00; second (from the
                          top): 01; second (from the bottom): 11; bottom: 10. Likewise, assign the same code to the
                          four vertical levels: left (or, first): 00; second: 01; third: 11; right (or, fourth): 10. This
                          provides the labels for $p_1, p_2, \ldots, p_{16}$, where, for instance, $p_1$ has the label (00, 00), $p_2$ has
                          the label (01, 00), ..., $p_7$ has the label (11, 01), ..., $p_{11}$ has the label (11, 11), ..., $p_{15}$ has
                          the label (11, 10), and $p_{16}$ has the label (10, 10).
                              Define the function $f$ from the set of 16 vertices of this grid to the vertices of $Q_4$ by
                          $f((ab, cd)) = abcd$. Here $f((ab, cd)) = f((a_1b_1, c_1d_1)) \Rightarrow abcd = a_1b_1c_1d_1 \Rightarrow a = a_1$,
                          $b = b_1$, $c = c_1$, $d = d_1 \Rightarrow (ab, cd) = (a_1b_1, c_1d_1) \Rightarrow f$ is one-to-one. Since the domain and
                          codomain of $f$ both contain 16 vertices, it follows from Theorem 5.11 that $f$ is also onto.
                          Finally, let $\{(ab, cd), (wx, yz)\}$ be an edge in the grid. Then either $ab = wx$ and $cd$, $yz$ differ
                          in one component or $cd = yz$ and $ab$, $wx$ differ in one component. Suppose that $ab = wx$ and
                          $c = y$, but $d \ne z$. Then $\{abcd, wxyz\}$ is an edge in $Q_4$. The other cases follow in a similar way.
                          Conversely, suppose that $\{f((a_1b_1, c_1d_1)), f((w_1x_1, y_1z_1))\}$ is an edge in $Q_4$. Then $a_1b_1c_1d_1$,
                          $w_1x_1y_1z_1$ differ in exactly one component, say the first. Then in the grid, there is an edge for
                          the vertices $(0b_1, c_1d_1)$, $(1b_1, c_1d_1)$. The arguments are similar for the other three components.
                          Consequently, $f$ establishes an isomorphism between the three-by-three grid and a subgraph of
                          $Q_4$. (Note: The three-by-three grid has 24 edges while $Q_4$ has 32 edges.)
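
The labeling in Exercise 37 can be checked mechanically. The following is a minimal sketch, assuming the grid vertices are indexed by (column, row) pairs with indices 0 to 3; the names `GRAY`, `grid_edges`, `f`, and `hamming` are illustrative and do not come from the text. It confirms that the map is one-to-one, that every grid edge joins labels differing in exactly one bit, and that the edge counts are 24 and 32.

```python
from itertools import combinations, product

# 2-bit Gray code, listed in the order used along each axis of the grid.
GRAY = ["00", "01", "11", "10"]

def grid_edges():
    """Edges of the grid with 4 x 4 vertices, as pairs of (column, row) indices."""
    edges = set()
    for col, row in product(range(4), repeat=2):
        if col + 1 < 4:
            edges.add(((col, row), (col + 1, row)))
        if row + 1 < 4:
            edges.add(((col, row), (col, row + 1)))
    return edges

def f(vertex):
    """Map a grid vertex to a 4-bit string by concatenating its two Gray labels."""
    col, row = vertex
    return GRAY[col] + GRAY[row]

def hamming(s, t):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(s, t))

images = {f(v) for v in product(range(4), repeat=2)}
edges = grid_edges()

# Every grid edge must map to an edge of Q4 (labels at Hamming distance 1).
all_edges_preserved = all(hamming(f(u), f(v)) == 1 for u, v in edges)

# Q4 itself: all pairs of 4-bit strings at Hamming distance 1.
q4_edge_count = sum(1 for s, t in combinations(sorted(images), 2) if hamming(s, t) == 1)
```

Because the image uses all 16 vertices of $Q_4$ but only 24 of its 32 edges, the grid corresponds to a spanning subgraph rather than to $Q_4$ itself.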

Section 11.4—p. 553
                        1. In this situation vertex b is in the region formed by the edges {a, d}, {d, c}, {c, a}, and vertex e
                          is outside of this region. Hence the edge {b, e} will cross one of the edges {a, d}, {d, c}, or
                          {a, c} (as shown).

                        3. a)   Graph         Number of Vertices       Number of Edges
                               $K_{4,7}$             11                      28
                               $K_{7,11}$            18                      77
                               $K_{m,n}$           m + n                     mn
                          b)   m = 6
                        5. a) Bipartite     b) Bipartite     c) Not bipartite
                        7. a) (3)(3)     b) $m\binom{n}{2} + n\binom{m}{2} = (1/2)(mn)[m + n - 2]$
                          c) $(m)(n)(m - 1)(n - 1) = 4\binom{m}{2}\binom{n}{2}$
                        9. a) 6     b) $(1/2)(7)(3)(6)(2)(5)(1)(4) = 2520$     c) 50,295,168,000
                          d) $(1/2)(n)(m)(n - 1)(m - 1)(n - 2) \cdots (2)(n - (m - 1))(1)(n - m)$
                      11. Partition $V$ as $V_1 \cup V_2$ with $|V_1| = m$, $|V_2| = v - m$. If G is bipartite, then the maximum number
                          of edges that G can have is $m(v - m) = -[m - (v/2)]^2 + (v/2)^2$, a function of $m$. For a given
                          value of $v$, when $v$ is even, $m = v/2$ maximizes $m(v - m) = (v/2)[v - (v/2)] = (v/2)^2$. For $v$
                          odd, $m = (v - 1)/2$ or $m = (v + 1)/2$ maximizes $m(v - m) = [(v - 1)/2][v - ((v - 1)/2)] =
                          [(v - 1)/2][(v + 1)/2] = [(v + 1)/2][v - ((v + 1)/2)] = (v^2 - 1)/4 = \lfloor (v/2)^2 \rfloor < (v/2)^2$.
                          Hence if $|E| > (v/2)^2$, then G cannot be bipartite.

13. a) [Figure: a graph on the vertices a, b, ..., j, labeled with 2-element subsets of {1, 2, 3, 4, 5}]
       a: {1, 2}     f: {4, 5}
       b: {3, 4}     g: {2, 5}
       c: {1, 5}     h: {2, 3}
       d: {2, 4}     i: {1, 3}
       e: {3, 5}     j: {1, 4}

b) G is (isomorphic to) the Petersen graph. [See Fig. 11.52(a).]
                          15.     mn must be even
                          17.     a) There are 17 vertices, 34 edges, and 19 regions, and $v - e + r = 17 - 34 + 19 = 2$.
                                  b) Here we find 10 vertices, 24 edges, and 16 regions, and $v - e + r = 10 - 24 + 16 = 2$.
                          19.     10
                          21.     If not, $\deg(v) \ge 6$ for all $v \in V$. Then $2e = \sum_{v \in V} \deg(v) \ge 6|V|$, so $e \ge 3|V|$, contradicting
                                  $e \le 3|V| - 6$ (Corollary 11.3).
                          23.     a) $2e \ge kr = k(2 + e - v) \Rightarrow (2 - k)e \ge k(2 - v) \Rightarrow e \le [k/(k - 2)](v - 2)$     b) 4
                                  c) In K33, we have e = 9 and v = 6. [k/(k — 2)](v — 2) = (4/2)(4) = 8 < 9 =e. Since K33
                                  is connected, it must be nonplanar.
                              d) Here k = 5, v = 10, e = 15, and [k/(k — 2)](v — 2) = (5/3)(8) = (40/3) < 15 = e. The
                              Petersen graph is connected, so it must be nonplanar.
                          25. a) The dual for the tetrahedron [Fig. 11.59(b)] is the graph itself. For the graph (cube) in
                              Fig. 11.59(d) the dual is the octahedron, and vice versa. Likewise, the dual of the dodecahedron
                              is the icosahedron, and vice versa.
                              b) For $n \in \mathbf{Z}^+$, $n \ge 3$, the dual of the wheel graph $W_n$ is $W_n$ itself.
                          27. [Figure]
                          29. a) As we mentioned in the remark following Example 11.18, when $G_1$, $G_2$ are homeomorphic
                                  graphs, then they may be regarded as isomorphic except, possibly, for vertices of degree 2.
                                  Consequently, two such graphs will have the same number of vertices of odd degree.
                                  b) Now if $G_1$ has an Euler trail, then $G_1$ (is connected and) has all vertices of even
                                  degree except two, those being the vertices at the beginning and end of the Euler trail. From
                                  part (a) $G_2$ is likewise connected with all vertices of even degree, except for two of odd degree.
                                  Consequently, $G_2$ has an Euler trail. (The converse follows in a similar way.)
                                  c) If $G_1$ has an Euler circuit, then $G_1$ (is connected and) has all vertices of even degree. From
                                  part (a) $G_2$ is likewise connected with all vertices of even degree, so $G_2$ has an Euler circuit.
                                  (The converse follows in a similar manner.)

Section 11.5-—p. 562
                                1. [Figures for parts (a) through (d)]

3. a)   Hamilton cycle: a → g → k → i → h → b → c → d → j → f → e → a
   b)   Hamilton cycle: a → d → b → e → g → j → i → f → h → c → a
   c)   Hamilton cycle: a → h → e → f → g → i → d → c → b → a
   d)   Hamilton path: a → c → d → b → e → f → g
   e)   Hamilton path: a → b → c → d → e → j → i → h → g → f → k → l → m → n → o
   f)   Hamilton cycle: a → b → c → d → e → j → i → h → g → l → m → n → o → t →
        s → r → q → p → k → f → a
5. d) If we remove any one of the vertices a, b, or g, the resulting subgraph has a Hamilton cycle.
    For example, upon removing vertex a we find the Hamilton cycle b → d → c → f → g → e
    → b.
    e) The following Hamilton cycle exists if we remove vertex g: a → b → c → d → e → j → o
    → n → i → h → m → l → k → f → a. A symmetric situation results upon removing vertex i.
7. a) $(1/2)(n - 1)!$                      b) 10              c) 9
9. Let $G = (V, E)$ be a loop-free undirected graph with no odd cycles. We assume that G is
    connected; otherwise, we work with the components of G. Select any vertex $x$ in $V$, and let
    $V_1 = \{v \in V \mid d(x, v)$, the length of a shortest path between $x$ and $v$, is odd$\}$ and
    $V_2 = \{w \in V \mid d(x, w)$, the length of a shortest path between $x$ and $w$, is even$\}$. Note that
    (i) $x \in V_2$, (ii) $V = V_1 \cup V_2$, and (iii) $V_1 \cap V_2 = \emptyset$. We claim that each edge $\{a, b\}$ in $E$ has one
    vertex in $V_1$ and the other vertex in $V_2$. For suppose that $e = \{a, b\} \in E$ with $a, b \in V_1$. (The
    proof for $a, b \in V_2$ is similar.) Let $E_a = \{\{a, v_1\}, \{v_1, v_2\}, \ldots, \{v_{m-1}, x\}\}$ be the $m$ edges in a
    shortest path from $a$ to $x$, and let $E_b = \{\{b, v_1'\}, \{v_1', v_2'\}, \ldots, \{v_{n-1}', x\}\}$ be the $n$ edges in a
    shortest path from $b$ to $x$. Note that $m$ and $n$ are both odd. If $\{v_1, v_2, \ldots, v_{m-1}\} \cap \{v_1', v_2', \ldots,
    v_{n-1}'\} = \emptyset$, then the set of edges $E' = \{\{a, b\}\} \cup E_a \cup E_b$ provides an odd cycle in G.
    Otherwise, let $w$ ($\ne x$) be the first vertex where the paths come together, and let

$$E'' = \{\{a, b\}\} \cup \{\{a, v_1\}, \{v_1, v_2\}, \ldots, \{v_i, w\}\} \cup \{\{b, v_1'\}, \{v_1', v_2'\}, \ldots, \{v_j', w\}\},$$

    for some $1 \le i \le m - 1$ and $1 \le j \le n - 1$. Then either $E''$ provides an odd cycle for G or
    $E' - E''$ contains an odd cycle for G.
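
The partition used in Exercise 9 is what a breadth-first search computes. The sketch below assumes a connected graph given as an adjacency dictionary (an illustrative representation, not one used in the text); the function names are hypothetical. It splits the vertices by the parity of their distance from a chosen vertex and then checks whether every edge crosses between the two parts.

```python
from collections import deque

def parity_partition(adj, x):
    """Split the vertices of a connected graph by parity of distance from x.

    adj maps each vertex to an iterable of its neighbors.
    Returns (V1, V2): vertices at odd distance and at even distance from x.
    """
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    v1 = {v for v, d in dist.items() if d % 2 == 1}
    v2 = {v for v, d in dist.items() if d % 2 == 0}
    return v1, v2

def is_bipartition(adj, v1, v2):
    """True when every edge has one endpoint in v1 and the other in v2."""
    return all((u in v1) != (w in v1) for u in adj for w in adj[u])

def cycle(n):
    """Adjacency dictionary for the cycle on the vertices 0, 1, ..., n - 1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

For the even cycle `cycle(6)` the two parity classes form a bipartition, while for the odd cycle `cycle(5)` two adjacent vertices land in the same class, which is the situation the proof rules out when G has no odd cycles.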
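11. a) [Figures: two drawings on the vertices a, b, c]

b) [Figures: four directed graphs on the vertices a, b, c, d, with the degrees listed below]

   First graph:    id(a) = 0    id(b) = 1    id(c) = 3    id(d) = 2
   Second graph:   od(a) = 3, id(a) = 0    od(b) = 1, id(b) = 2    od(c) = 1, id(c) = 2    od(d) = 1, id(d) = 2
   Third graph:    od(a) = 1, id(a) = 2    od(b) = 1, id(b) = 2    od(c) = 2, id(c) = 1    od(d) = 2, id(d) = 1
   Fourth graph:   od(a) = 0, id(a) = 3    od(b) = 2, id(b) = 1    od(c) = 2, id(c) = 1    od(d) = 2, id(d) = 1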
13. Proof: If not, there exists a vertex $x$ such that $(v, x) \notin E$ and, for all $y \in V$, $y \ne v, x$, if
    $(v, y) \in E$, then $(y, x) \notin E$. Since $(v, x) \notin E$, we have $(x, v) \in E$, as $T$ is a tournament. Also,
    for each $y$ mentioned earlier, we also have $(x, y) \in E$. Consequently, $od(x) \ge od(v) + 1$,
    contradicting $od(v)$ being a maximum!
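
The claim behind Exercise 13 is that a vertex $v$ of maximum out-degree in a tournament reaches every other vertex by a directed path of length at most 2. As a small experiment, the sketch below builds random tournaments and checks that property; the generator and the function names are assumptions made for illustration only.

```python
import random
from itertools import combinations

def random_tournament(n, seed):
    """Orient each edge of K_n at random; returns out-neighbor sets by vertex."""
    rng = random.Random(seed)
    out = {v: set() for v in range(n)}
    for u, w in combinations(range(n), 2):
        if rng.random() < 0.5:
            out[u].add(w)
        else:
            out[w].add(u)
    return out

def reaches_all_within_two(out, v):
    """True when every vertex is v, an out-neighbor of v, or one step beyond."""
    reach = set(out[v])
    for y in out[v]:
        reach |= out[y]
    return reach | {v} == set(out)

def check(n, seed):
    """Pick a vertex of maximum out-degree and test the two-step property."""
    out = random_tournament(n, seed)
    v = max(out, key=lambda u: len(out[u]))
    return reaches_all_within_two(out, v)
```

A random search like this only supports the statement; the proof by contradiction above is what establishes it for every tournament.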
15. For the multigraph in the given figure, $|V| = 4$ and $\deg(a) = \deg(c) = \deg(d) = 2$ and
    $\deg(b) = 6$. Hence $\deg(x) + \deg(y) \ge 4 > 3 = 4 - 1$ for any nonadjacent $x, y \in V$, but the
    multigraph has no Hamilton path.

17. For $n \ge 5$, let $C_n = (V, E)$ denote the cycle on $n$ vertices. Then $C_n$ has (actually is) a Hamilton
                              cycle, but for all $v \in V$, $\deg(v) = 2 < n/2$.
                          19. This follows from Theorem 11.9, since for all (nonadjacent) $x, y \in V$,
                              $\deg(x) + \deg(y) = 12 > 11 = |V|$.
                          21. When $n = 5$, the graphs $C_5$ and $\overline{C_5}$ are isomorphic, and both are Hamilton cycles on five
                              vertices.
                                   For $n \ge 6$, let $u$, $v$ denote nonadjacent vertices in $\overline{C_n}$. Since $\deg(u) = \deg(v) = n - 3$, we
                              find that $\deg(u) + \deg(v) = 2n - 6$. Also, $2n - 6 \ge n \Leftrightarrow n \ge 6$, so it follows from Theorem
                               11.9 that the cocycle $\overline{C_n}$ contains a Hamilton cycle when $n \ge 6$.
                          23. a) The path $v \to v_1 \to v_2 \to v_3 \to \cdots \to v_{n-1}$ provides a Hamilton path for $H_n$. Since
                              $\deg(v) = 1$, the graph cannot have a Hamilton cycle.
                              b) Here $|E| = \binom{n-1}{2} + 1$. (So the number of edges required in Corollary 11.6 cannot be
                              decreased.)
                          25. a) (i) {a, c, f, h}, {a, g}     (ii) {z}, {u, w, y}       b) (i) $\beta(G) = 4$     (ii) $\beta(G) = 3$
                              c) (i) 3     (ii) 3     (iii) 3     (iv) 4     (v) 6     (vi) The maximum of m and n
                              d) The complete graph on $|I|$ vertices

Section 11.6—p. 571
                                1. Draw a vertex for each species of fish. If two species x, y must be kept in separate aquaria,
                                  draw the edge {x, y}. The smallest number of aquaria needed is then the chromatic number of
                                  the resulting graph.
                                3. a) 3        b) 5
                                5. a)   $P(G, \lambda) = \lambda(\lambda - 1)^3$
                                  b) For $G = K_{1,n}$, we find that $P(G, \lambda) = \lambda(\lambda - 1)^n$.             $\chi(K_{1,n}) = 2$
                                7. a) 2     b) 2 (n even); 3 (n odd)
                                  c) Figure 11.59(d): 2; Fig. 11.62(a): 3; Fig. 11.85(i): 2; Fig. 11.85(ii): 3     d) 2
                                9. a) (1) $\lambda(\lambda - 1)^2(\lambda - 2)^2$     (2) $\lambda(\lambda - 1)(\lambda - 2)(\lambda^2 - 2\lambda + 2)$
                                  (3) $\lambda(\lambda - 1)(\lambda - 2)(\lambda^2 - 5\lambda + 7)$
                              b) (1) 3       (2) 3    (3) 3     c) (1) 720      (2) 1020      (3) 420
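                          11. Let $e = \{v, w\}$ be the deleted edge. There are $\lambda(1)(\lambda - 1)(\lambda - 2) \cdots (\lambda - (n - 2))$ proper
                              colorings of $G_n$ where $v$, $w$ share the same color and $\lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - (n - 1))$ proper
                              colorings where $v$, $w$ are colored with different colors. Therefore, $P(G_n, \lambda) = \lambda(\lambda - 1) \cdots
                              (\lambda - n + 2) + \lambda(\lambda - 1) \cdots (\lambda - n + 1) = \lambda(\lambda - 1) \cdots (\lambda - n + 3)(\lambda - n + 2)^2$, so
                                 $\chi(G_n) = n - 1$.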
                          13.    a) $|V| = 2n$; $|E| = (1/2)\sum_{v \in V} \deg(v) = (1/2)[4(2) + (2n - 4)(3)] = (1/2)[8 + 6n - 12] =
                                 3n - 2$, $n \ge 1$.
                                 b) For $n = 1$, we find that $G = K_2$ and $P(G, \lambda) = \lambda(\lambda - 1) = \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{1-1}$, so
                                 the result is true in this first case. For $n = 2$, we have $G = C_4$, the cycle of length 4, and here
                                  $P(G, \lambda) = \lambda(\lambda - 1)^3 - \lambda(\lambda - 1)(\lambda - 2) = \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{2-1}$. So the result follows
                                 for $n = 2$. Assuming the result true for an arbitrary (but fixed) $n \ge 1$, consider the situation for
                                 $n + 1$. Write $G = G_1 \cup G_2$, where $G_1$ is $C_4$ and $G_2$ is the ladder graph for $n$ rungs. Then
                                 $G_1 \cap G_2 = K_2$, so from Theorem 11.14 we have $P(G, \lambda) = P(G_1, \lambda) \cdot P(G_2, \lambda)/P(K_2, \lambda)
                                  = [\lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)][\lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{n-1}]/[\lambda(\lambda - 1)] =
                              \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^n$. Consequently, the result is true for all $n \ge 1$, by the Principle of
                              Mathematical Induction.
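                          15. a) $\lambda(\lambda - 1)(\lambda - 2)$     b) Follows from Theorem 11.10
                       c)   Follows by the rule of product
                       d)   $P(C_n, \lambda) = P(P_{n-1}, \lambda) - P(C_{n-1}, \lambda) = \lambda(\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$
                            $= [(\lambda - 1) + 1](\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$
                            $= (\lambda - 1)^n + (\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$,
                            so $P(C_n, \lambda) - (\lambda - 1)^n = (\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$.
                       Replacing $n$ by $n - 1$ yields

$$P(C_{n-1}, \lambda) - (\lambda - 1)^{n-1} = (\lambda - 1)^{n-2} - P(C_{n-2}, \lambda).$$

                       Hence

$$P(C_n, \lambda) - (\lambda - 1)^n = P(C_{n-2}, \lambda) - (\lambda - 1)^{n-2}.$$

                       e)   Continuing from part (d),

$$P(C_n, \lambda) = (\lambda - 1)^n + (-1)^{n-3}[P(C_3, \lambda) - (\lambda - 1)^3]$$
$$= (\lambda - 1)^n + (-1)^{n-1}[\lambda(\lambda - 1)(\lambda - 2) - (\lambda - 1)^3]$$
$$= (\lambda - 1)^n + (-1)^n(\lambda - 1).$$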
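
The closed form in part (e) can be compared against a direct count for small cases. The following is a brute-force sketch, with function names chosen here for illustration; it enumerates every assignment of $k$ colors to the $n$ vertices of $C_n$ and counts the proper ones.

```python
from itertools import product

def count_proper_colorings(n, k):
    """Count proper colorings of the cycle C_n with k colors by enumeration."""
    count = 0
    for coloring in product(range(k), repeat=n):
        # Vertex i is adjacent to vertex (i + 1) mod n.
        if all(coloring[i] != coloring[(i + 1) % n] for i in range(n)):
            count += 1
    return count

def closed_form(n, k):
    """Value of (k - 1)^n + (-1)^n (k - 1), the formula from part (e)."""
    return (k - 1) ** n + (-1) ** n * (k - 1)
```

The enumeration takes $k^n$ steps, so it is only meant as a check for small $n$ and $k$, not as a way to compute the polynomial.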
                   17. From Theorem 11.13, the expansion for $P(G, \lambda)$ will contain exactly one occurrence of the
                       chromatic polynomial of $K_n$. Since no larger graph occurs, this term determines the degree as $n$
                       and the leading coefficient as 1.
                   19. a) For $n \in \mathbf{Z}^+$, $n \ge 3$, let $C_n$ denote the cycle on $n$ vertices. If $n$ is odd then $\chi(C_n) = 3$. But for
                       each $v$ in $C_n$, the subgraph $C_n - v$ is a path with $n - 1$ vertices and $\chi(C_n - v) = 2$. So for $n$
                       odd $C_n$ is color-critical.
                            However, when $n$ is even we have $\chi(C_n) = 2$, and for each $v$ in $C_n$, the subgraph $C_n - v$ is
                       still a path with $n - 1$ vertices and $\chi(C_n - v) = 2$. Consequently, cycles with an even number
                       of vertices are not color-critical.
                       b) For every complete graph $K_n$, where $n \ge 2$, we have $\chi(K_n) = n$, and for each vertex $v$ in
                       $K_n$, $K_n - v$ is (isomorphic to) $K_{n-1}$, so $\chi(K_n - v) = n - 1$. Consequently, every complete
                       graph with at least one edge is color-critical.
                       c) Suppose that G is not connected. Let $G_1$ be a component of G where $\chi(G_1) = \chi(G)$, and
                       let $G_2$ be any other component of G. Then $\chi(G_1) \ge \chi(G_2)$ and for all $v$ in $G_2$ we find that
                       $\chi(G - v) = \chi(G_1) = \chi(G)$, so G is not color-critical.

Supplementary
Exercises—p. 576     -n=17
                    3. a) Label the vertices of $K_6$ with a, b, ..., f. Of the five edges on a, at least three have the
                        same color, say red. Let these edges be {a, b}, {a, c}, {a, d}. If the edges {b, c}, {c, d}, {b, d}
                        are all blue, the result follows. If not, one of these edges, say {c, d}, is red. Then the edges
                        {a, c}, {a, d}, {c, d} yield a red triangle.
                        b) Consider the six people as vertices. If two people are friends (strangers), draw a red (blue)
                        edge connecting their respective vertices. The result then follows from part (a).
                    5. a) We can redraw $G_2$ as
                       u         w     y

v         x     Zz

b)   72
                    7. a) 1260      b) 756
                       c) (Case 1: $p$ is odd, $p = 2k + 1$ for $k \in \mathbf{N}$.) Here there are $mn$ paths of length $p = 1$ (when
                       $k = 0$) and $(m)(n)(m - 1)(n - 1) \cdots (m - k)(n - k)$ paths of length $p = 2k + 1 \ge 3$.
                       (Case 2: $p$ is even, $p = 2k$ for $k \in \mathbf{Z}^+$.) When $p < 2m$ (i.e., $k < m$) the number of paths of
                       length $p$ is $(1/2)(m)(n)(m - 1)(n - 1) \cdots (n - (k - 1))(m - k) + (1/2)(n)(m)(n - 1)
                       (m - 1) \cdots (m - (k - 1))(n - k)$. For $p = 2m$ we find $(1/2)(n)(m)(n - 1)(m - 1) \cdots
                       (m - (m - 1))(n - m)$ paths of (longest) length $2m$.
                                    9. a)    Let $I$ be independent and $\{a, b\} \in E$. If neither $a$ nor $b$ is in $V - I$, then $a, b \in I$, and since
                                      they are adjacent, $I$ is not independent. Conversely, if $I \subseteq V$ with $V - I$ a covering of G, then
                                      if $I$ is not independent there are vertices $x, y \in I$ with $\{x, y\} \in E$. But $\{x, y\} \in E \Rightarrow$ either $x$ or
                                      $y$ is in $V - I$.
                                      b) Let $I$ be a largest maximal independent set in G and $K$ a minimum covering. From part (a),
                                      $|K| \le |V - I| = |V| - |I|$ and $|I| \ge |V - K| = |V| - |K|$, or $|K| + |I| \ge |V| \ge |K| + |I|$.
                              11. $a_n = a_{n-1} + a_{n-2}$, $a_0 = a_1 = 1$;   $a_n = F_{n+1}$, the $(n + 1)$-st Fibonacci number
                              13. $a_n = a_{n-1} + 2a_{n-2}$, $a_1 = 3$, $a_2 = 5$;   $a_n = (-1/3)(-1)^n + (4/3)(2^n)$, $n \ge 1$.
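
The closed form in Exercise 13 can be checked against its recurrence for the first several terms. In this sketch the function names are illustrative, and the closed form is rewritten over the common denominator 3 as $(4 \cdot 2^n - (-1)^n)/3$ so that only integer arithmetic is needed.

```python
def a_recursive(n):
    """Compute a_n from a_n = a_(n-1) + 2 a_(n-2) with a_1 = 3, a_2 = 5."""
    a_prev, a_curr = 3, 5  # a_1 and a_2
    if n == 1:
        return a_prev
    for _ in range(n - 2):
        a_prev, a_curr = a_curr, a_curr + 2 * a_prev
    return a_curr

def a_closed(n):
    """Closed form (-1/3)(-1)^n + (4/3) 2^n, written over the denominator 3."""
    return (4 * 2**n - (-1) ** n) // 3
```

The numerator $4 \cdot 2^n - (-1)^n$ is always a multiple of 3, so the integer division is exact.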
                              15. a) $\gamma(G) = 2$; $\beta(G) = 3$; $\chi(G) = 4$
                                  b) G has neither an Euler trail nor an Euler circuit; G does have a Hamiltonian cycle.
                                  c) G is not bipartite, but it is planar.
                              17. a) $\chi(G) \ge \omega(G)$.     b) They are equal.
                              19. a) The constant term is 3, not 0. This contradicts Theorem 11.11.
                                  b) The leading coefficient is 3, not 1. This contradicts the result in Exercise 17 of Section 11.6.
                                  c) The sum of the coefficients is —1, not 0. This contradicts Theorem 11.12.
                              21. a) $a_n = F_{n+2}$, the $(n + 2)$-nd Fibonacci number.
                                      c)    Ai: 34+ Fe           Ay:34+ F;        A323   +   Frys        d)   2)?—l+m

Chapter 12
                           Trees

Section 12.1—p. 585

                               1. a) [Figure]
                                  b) 5
                               3. a) 47     b) 11
                               5. Paths
                               7. [Figure]

9. If there is a unique path between each pair of vertices in G, then G is connected. If G contains a
                                   cycle, then there is a pair of vertices x, y with two distinct paths connecting x and y. Hence, G
                                   is a loop-free connected undirected graph with no cycles, so G is a tree.
                              11. n   (5)
                              13. In part (i) of the given figure we find the complete bipartite graph $K_{2,3}$. Parts (ii) and (iii)
                                   provide two nonisomorphic spanning trees for $K_{2,3}$. Up to isomorphism these are the only
                                   spanning trees for $K_{2,3}$.

                                   [Figure: parts (i), (ii), and (iii)]

15. (1) 6      (2) 36
                              17. a) $n \ge m + 1$
                                  b) Let $k$ be the number of pendant vertices in $T$. From Theorems 11.2 and 12.3 we have
                                      $2(n - 1) = 2|E| = \sum_{v \in V} \deg(v) \ge k + m(n - k)$. Consequently,
                                      $[2(n - 1) \ge k + m(n - k)] \Rightarrow [2n - 2 \ge k + mn - mk]$
                                      $\Rightarrow [k(m - 1) \ge 2 - 2n + mn = 2 + (m - 2)n \ge 2 + (m - 2)(m + 1)$
                                      $= 2 + m^2 - m - 2 = m^2 - m = m(m - 1)]$,
                                      so $k \ge m$.

19. a) If the complement of T contains a cut-set, then the removal of these edges disconnects G,
                          and there are vertices x, y with no path connecting them. Hence T is not a spanning tree for G.
                          b) If the complement of C contains a spanning tree, then every pair of vertices in G has a path
                          connecting them, and this path includes no edges of C. Hence the removal of the edges in C
                           from G does not disconnect G, so C is not a cut-set for G.
                      21. a) (i) 3, 4, 6, 3, 8, 4       (ii) 3, 4, 6, 6, 8, 4
                          b) No pendant vertex of the given tree appears in the sequence, so the result is true for these
                          vertices. When an edge {x, y} is removed and y is a pendant vertex (of the tree or one of the
                          resulting subtrees), the deg(x) is decreased by 1 and x is placed in the sequence. As the process
                          continues, either (i) this vertex x becomes a pendant vertex in a subtree and is removed but not
                          recorded again in the sequence, or (ii) the vertex x is left as one of the last two vertices of an
                          edge. In either case, x has been listed in the sequence [deg(x) — 1] times.
                          c) [Figure: a tree on the vertices 1, 2, ..., 8]
                           d) Input: The given Prüfer code $x_1, x_2, \ldots, x_{n-2}$
                              Output: The unique tree T with n vertices labeled with 1, 2, ..., n. (This tree has the Prüfer
                                              code $x_1, x_2, \ldots, x_{n-2}$.)

                           C := [x_1, x_2, ..., x_{n-2}]        {Initializes C as a list (ordered set)}
                           L := [1, 2, ..., n]                  {Initializes L as a list (ordered set)}
                           T := ∅                               {T starts with no edges}

                           for i := 1 to n - 2 do
                               v := smallest element in L not in C
                               w := first entry in C
                               T := T ∪ {{v, w}}        {Add the new edge {v, w} to the present forest.}
                               delete v from L
                               delete the first occurrence of w from C
                           T := T ∪ {{y, z}}            {The vertices y, z are the last two remaining entries in L.}
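
The pseudocode in part (d) translates almost line for line into a short program. The version below is a sketch rather than the book's own code: the function name `prufer_to_tree` and the choice to return the edges as a list of pairs are assumptions made for illustration.

```python
def prufer_to_tree(code):
    """Rebuild the labeled tree on 1..n from its Prufer code (n = len(code) + 2).

    Returns the n - 1 edges in the order in which the algorithm adds them.
    """
    n = len(code) + 2
    remaining_code = list(code)       # the list C in the pseudocode
    labels = list(range(1, n + 1))    # the list L in the pseudocode
    edges = []                        # the edge set T, initially empty
    for _ in range(n - 2):
        # v: smallest label still in L that does not occur in C
        v = min(x for x in labels if x not in remaining_code)
        w = remaining_code[0]         # first entry in C
        edges.append((v, w))
        labels.remove(v)
        remaining_code.pop(0)         # delete the first occurrence of w from C
    y, z = labels                     # exactly two labels remain
    edges.append((y, z))
    return edges
```

Each pass scans both lists, so this direct translation takes time quadratic in n; that is adequate for hand-sized examples such as the codes in part (a).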
                      23. a) If the tree contains $n + 1$ vertices, then it is (isomorphic to) the complete bipartite graph
                          $K_{1,n}$, often called the star graph.
                          b) If the tree contains $n$ vertices, then it is (isomorphic to) a path on $n$ vertices.
                      25. Let E, = {{a, b}, {b, c}, {c, d}, {d, e}, {b, h}, {d, i}, (Ff, i}, fg, }} and
                                 Ey = {{a, h}, {b, i}, {he i}, (g. A}, Cf 8}, fe. t), fd. fh, fe, Fh.
Section 12.2—p. 603
                        1. a) f, h, k, p, q, s, t    b) a     c) d
                          d) e, f, j, q, s, t   e) q, t    f) 2              g) k, p, q, s, t
                        . a) /+w-xy*eartz23        b) 04
                      Ge

5. Preorder:     r, j, h, g, e, d, b, a, c, f, i, k, m, p, s, n, q, t, v, w, u
   Inorder:      h, e, a, b, d, c, g, f, j, i, r, m, s, p, k, n, v, t, w, q, u
   Postorder:    a, b, c, d, e, f, g, h, i, j, s, p, m, v, w, t, u, q, n, k, r
7. a) (i) and (iii)
   b) [Figures for parts (i), (ii), and (iii)]

9.     G is connected.
                         11. Theorem 12.6
                             a) Each internal vertex has $m$ children, so there are $mi$ vertices that are the children of some
                             other vertex. This accounts for all vertices in the tree except the root. Hence $n = mi + 1$.
                             b) $\ell + i = n = mi + 1 \Rightarrow \ell = (m - 1)i + 1$
                             c) $\ell = (m - 1)i + 1 \Rightarrow i = (\ell - 1)/(m - 1)$
                                $n = mi + 1 \Rightarrow i = (n - 1)/m$
                                 Corollary 12.1
                                 Since the tree is balanced, $m^{h-1} < \ell \le m^h$ by Theorem 12.7.

$$m^{h-1} < \ell \le m^h \Rightarrow \log_m(m^{h-1}) < \log_m(\ell) \le \log_m(m^h) \Rightarrow (h - 1) < \log_m \ell \le h \Rightarrow h = \lceil \log_m \ell \rceil$$

13. a)         102; 69
                         15. a) [Figure]     b) 9; 5     c) $h(m - 1)$; $(h - 1) + (m - 1)$
                         17. 21845; $1 + m + m^2 + \cdots + m^{h-1} = (m^h - 1)/(m - 1)$
                         19. [Figure: a tree whose root is labeled {1, 2, 3, 4} - {9, 10, 11, 12} - {5, 6, 7, 8}, with
                             second-level vertices {1, 2} - {3, 4}, {9, 10} - {11, 12}, and {5, 6} - {7, 8}, continuing
                             down to the singletons {1}, {2}, ..., {12}]
                         21.     (6) (3) (Ga) (3) = 204,204                                                         (zz) (in) (5) G@) = 235,144
                         23. a) 1,2,5, 11, 12, 13, 14, 3, 6, 7, 4, 8, 9, 10, 15, 16, 17
                             b) The preorder traversal of the rooted tree

Section 12.3—p. 609
                               1. a)     $L_1$: 1, 3, 5, 7, 9     $L_2$: 2, 4, 6, 8, 10
                                 b)    $L_1$: 1, 3, 5, 7, ..., $2m - 3$, $m + n$
                                       $L_2$: 2, 4, 6, 8, ..., $2m - 2$, $2m - 1$, $2m$, $2m + 1$, ..., $m + n - 1$

3. a)                                                       {-1, 0,2, -2, 3, 6, -3, 5, 1, 4}
                                                                                            {0, 2,3, 6,5, 1,4]

Section 12.4—p. 614
                      1. a) tear    b) tatener    c) rant
                      3. a: 111          c: 0110          e: 10          g: 11011          i: 00
                         b: 110101       d: 1100          f: 0111        h: 010            j: 110100
                      5. 55,987

                      7. [Figures: two Huffman trees, each with root weight 30]

                      9. Amend part (a) of step (2) for the Huffman tree algorithm as follows. If there are $n$ ($> 2$) such
                            trees with smallest root weights $w$ and $w'$, then
                             (i) if $w < w'$ and $n - 1$ of these trees have root weight $w'$, select a tree (of root weight $w'$)
                                     with smallest height; and
                            (ii) if $w = w'$ (and all $n$ trees have the same smallest root weight), select two trees (of root
                                     weight $w$) of smallest height.

Section 12.5—p. 621
                      1. The articulation points are b, e, f, h, j, k. The biconnected components are
                             $B_1$: {{a, b}}; $B_2$: {{d, e}}; $B_3$: {{b, c}, {c, f}, {f, e}, {e, b}}; $B_4$: {{f, g}, {g, h}, {h, f}};
                             $B_5$: {{h, i}, {i, j}, {j, h}}; $B_6$: {{j, k}}; $B_7$: {{k, p}, {p, n}, {n, m}, {m, k}, {p, m}}.
                      3. a) T can have as few as one or as many as n − 2 articulation points. If T contains a vertex of
                             degree (n − 1), then this vertex is the only articulation point. If T is a path with n vertices and
                             n − 1 edges, then the n − 2 vertices of degree 2 are all articulation points.
                             b) In all cases, a tree on n vertices has n − 1 biconnected components. Each edge is a
                             biconnected component.
                      5. χ(G) = max{χ(B_i) | 1 ≤ i ≤ k}.
                      7. Proof: Suppose that G has a pendant vertex, say x, and that {w, x} is the (unique) edge in E
                             incident with x. Since |V| ≥ 3, we know that deg(w) ≥ 2 and that κ(G − w) ≥ 2 > 1 = κ(G).
                             Consequently, w is an articulation point of G.
                      9. a) The first tree provides the depth-first spanning tree T for G where the order prescribed for
                             the vertices is reverse alphabetical and the root is c.
                             b) The second tree provides (low’(v), low(v)) for each vertex v of G (and T). These results
                             follow from step (2) of the algorithm.
                                 For the third tree, we find (dfi(v), low(v)) for each vertex v. Applying step (3) of the
                             algorithm, we find the articulation points d, f, and g, and the four biconnected components.

[Figure: three copies of the depth-first spanning tree T rooted at c, with the vertices labeled by dfi(v), by (low'(v), low(v)), and by (dfi(v), low(v)), respectively.]

11. We always have low(x_2) = low(x_1) = 1. (Note: Vertices x_2 and x_1 are always in the same
                              biconnected component.)
                          13. If not, let v ∈ V where v is an articulation point of G. Then κ(G − v) > κ(G) = 1. (From
                              Exercise 19 of Section 11.6 we know that G is connected.) Now G − v is disconnected with
                              components H_1, H_2, ..., H_t, for t ≥ 2. For 1 ≤ i ≤ t, let v_i ∈ H_i. Then H_i + v is a subgraph of
                              G − v_{i+1}, and χ(H_i + v) ≤ χ(G − v_{i+1}) < χ(G). (Here v_{t+1} = v_1.) Now let χ(G) = n and let
                              {c_1, c_2, ..., c_n} be a set of n colors. For each subgraph H_i + v, 1 ≤ i ≤ t, we can properly
                              color the vertices of H_i + v with at most n − 1 colors, and can use c_1 to color vertex v for all
                              of these t subgraphs. Then we can join these t subgraphs together at vertex v and obtain a
                              proper coloring for the vertices of G where we use fewer than n (= χ(G)) colors.

Supplementary Exercises—p. 625
                           1. If G is a tree, consider G a rooted tree. Then there are λ choices for coloring the root of G and
                              (λ − 1) choices for coloring each of its descendants. The result then follows by the rule of
                              product.
                                  Conversely, if P(G, λ) = λ(λ − 1)^(n−1), then since the factor λ occurs only once, the graph G
                              is connected. P(G, λ) = λ(λ − 1)^(n−1) = λ^n − (n − 1)λ^(n−1) + ··· + (−1)^(n−1) λ ⟹ G has n
                              vertices and (n − 1) edges. Hence G is a tree [by part (d) of Theorem 12.5].
                           3. a) 1011001010100
                              b) (i) [Figure: complete binary tree.]    (ii) [Figure: complete binary tree.]

c) Since the last two vertices visited in a preorder traversal are leaves, the last two symbols in
                              the characteristic sequence of a complete binary tree are 00.
                           5. We assume that G = (V, E) is connected; otherwise we work with a component of G. Since
                              G is connected, and deg(v) ≥ 2 for all v ∈ V, it follows from Theorem 12.4 that G is not a tree.
                              But every loop-free connected undirected graph that is not a tree must contain a cycle.
                           7. For 1 ≤ i (< n), let x_i = the number of vertices v where deg(v) = i. Then x_1 + x_2 + ··· +
                              x_{n−1} = |V| = |E| + 1, so 2|E| = 2(−1 + x_1 + x_2 + ··· + x_{n−1}). But 2|E| = Σ_{v∈V} deg(v) =
                              (x_1 + 2x_2 + 3x_3 + ··· + (n − 1)x_{n−1}). Solving 2(−1 + x_1 + x_2 + ··· + x_{n−1}) = x_1 + 2x_2 +
                              ··· + (n − 1)x_{n−1} for x_1, we find that x_1 = 2 + x_3 + 2x_4 + 3x_5 + ··· + (n − 3)x_{n−1} =
                              2 + Σ_{deg(v_i) ≥ 3} [deg(v_i) − 2].
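
The leaf-count identity of Exercise 7 can be spot-checked numerically. The sketch below is only an illustration: the helper names `random_tree_degrees` and `leaf_identity_holds` are hypothetical, and the random-attachment construction is just one convenient way to produce trees.

```python
import random

def random_tree_degrees(n, seed=0):
    """Build a random tree on n >= 2 vertices and return its list of degrees."""
    rng = random.Random(seed)
    degree = [0] * n
    for v in range(1, n):
        parent = rng.randrange(v)   # attach v to some earlier vertex
        degree[v] += 1
        degree[parent] += 1
    return degree

def leaf_identity_holds(degree):
    """Check x_1 = 2 + sum of (deg(v) - 2) over vertices with deg(v) >= 3."""
    leaves = sum(1 for d in degree if d == 1)
    surplus = sum(d - 2 for d in degree if d >= 3)
    return leaves == 2 + surplus

print(all(leaf_identity_holds(random_tree_degrees(n, s))
          for n in range(2, 40) for s in range(5)))
```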

                           9. a) G^2 is isomorphic to K_5.                      b) G^2 is isomorphic to K_4.
                              c) G^2 is isomorphic to K_{n+1}, so the number of new edges is C(n + 1, 2) − n = C(n, 2).
                              d) If G^2 has an articulation point x, then there exist u, v ∈ V such that every path (in G^2) from
                              u to v passes through x. (This follows from Exercise 2 of Section 12.5.) Since G is connected,
                              there exists a path P (in G) from u to v. If x is not on this path (which is also a path in G^2), then
                              we contradict x being an articulation point in G^2. Hence the path P (in G) passes through x,
                              and we can write P: u → u_1 → ··· → u_{n−1} → u_n → x → v_m → v_{m−1} → ··· → v_1 → v. But
                              then in G^2 we add the edge {u_n, v_m}, and the path P' (in G^2) given by P': u → u_1 → ··· →
                              u_{n−1} → u_n → v_m → v_{m−1} → ··· → v_1 → v does not pass through x. So x is not an articulation
                              point of G^2, and G^2 has no articulation points.
11. a) ℓ_n = ℓ_{n−1} + ℓ_{n−2}, for n ≥ 3 and ℓ_1 = ℓ_2 = 1. Since this is precisely the Fibonacci
    recurrence relation, we have ℓ_n = F_n, the nth Fibonacci number, for n ≥ 1.
    b) i_n = i_{n−1} + i_{n−2} + 1, n ≥ 3, i_1 = i_2 = 0
         i_n = (1/√5)α^n − (1/√5)β^n − 1 = F_n − 1, n ≥ 1   [α = (1 + √5)/2, β = (1 − √5)/2]
13. a) For the spanning trees of G there are two mutually exclusive and exhaustive cases: (i) The
    edge {x_1, y_1} is in the spanning tree: These spanning trees are counted in b_n. (ii) The edge
    {x_1, y_1} is not in the spanning tree: In this case the edges {x_1, x_2}, {y_1, y_2} are both in the
    spanning tree. Upon removing the edges {x_1, x_2}, {y_1, y_2}, and {x_1, y_1} from the original ladder
    graph, we now need a spanning tree for the resulting smaller ladder graph with n − 1 rungs.
    There are a_{n−1} spanning trees in this case.
    b) b_n = b_{n−1} + 2a_{n−1}, n ≥ 2
    c) a_n − 4a_{n−1} + a_{n−2} = 0, n ≥ 2
         a_n = (1/(2√3))[(2 + √3)^n − (2 − √3)^n], n ≥ 0
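
As a sanity check on part (c), the spanning trees of small ladder graphs can be counted by brute force and compared with the closed form. This is a minimal sketch; the vertex naming and the functions `ladder_edges`, `count_spanning_trees`, and `closed_form` are illustrative choices, not part of the exercise.

```python
from itertools import combinations

def ladder_edges(n):
    """Edges of the ladder graph with n rungs {x_i, y_i}."""
    edges = [(("x", i), ("y", i)) for i in range(1, n + 1)]
    for i in range(1, n):
        edges.append((("x", i), ("x", i + 1)))
        edges.append((("y", i), ("y", i + 1)))
    return edges

def count_spanning_trees(n):
    """Count edge subsets of size |V| - 1 that contain no cycle."""
    edges = ladder_edges(n)
    vertices = {v for e in edges for v in e}
    count = 0
    for subset in combinations(edges, len(vertices) - 1):
        parent = {v: v for v in vertices}

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        acyclic = True
        for a, b in subset:
            ra, rb = find(a), find(b)
            if ra == rb:          # this edge would close a cycle
                acyclic = False
                break
            parent[ra] = rb
        count += acyclic
    return count

def closed_form(n):
    r = 3 ** 0.5
    return round(((2 + r) ** n - (2 - r) ** n) / (2 * r))

print([count_spanning_trees(n) for n in range(1, 5)])
print([closed_form(n) for n in range(1, 5)])
```

Both lists should agree, and consecutive values should satisfy a_n = 4a_{n−1} − a_{n−2}.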
15. a)   (i) 3     (ii) 5
    b)   a_n = a_{n−1} + a_{n−2},     n ≥ 5, a_3 = 2, a_4 = 3
         a_n = F_{n+1}, the (n + 1)-st Fibonacci number
17. Here the input consists of
    (a) the k (≥ 3) vertices of the spine, ordered from left to right as v_1, v_2, ..., v_k;
    (b) deg(v_i), in the caterpillar, for all 1 ≤ i ≤ k; and
    (c) n, the number of vertices in the caterpillar, with n ≥ 3.
        If k = 3, the caterpillar is the complete bipartite graph (or star) K_{1,n−1}, for some n ≥ 3.
        We label v_2 with 1 and the remaining vertices with 2, 3, ..., n. This provides the edge
        labels (the absolute value of the difference of the vertex labels) 1, 2, 3, ..., n − 1, a
        graceful labeling.
        For k > 3 we consider the following.
         ℓ := 2               {ℓ is the largest low label}
         h := n − 1           {h is the smallest high label}
         label v_1 with 1
         label v_2 with n
         for i := 2 to k − 1 do
             if 2⌊i/2⌋ = i then           {i is even}
                begin
                   if v_i has unlabeled leaves that are not on the spine then
                        assign the deg(v_i) − 2 labels from ℓ to ℓ + deg(v_i) − 3
                        to these leaves of v_i
                   assign the label ℓ + deg(v_i) − 2 to v_{i+1}
                   ℓ := ℓ + deg(v_i) − 1
                end
             else
                begin
                   if v_i has unlabeled leaves that are not on the spine then
                        assign the deg(v_i) − 2 labels from h − [deg(v_i) − 3] to
                        h to these leaves of v_i
                   assign the label h − deg(v_i) + 2 to v_{i+1}
                   h := h − deg(v_i) + 1
                end
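
The labeling procedure above can be tried out directly. The following is a rough Python sketch under stated assumptions: the spine is taken to include its two end leaves (so deg(v_1) = deg(v_k) = 1), the input is the list of spine degrees, and the names `graceful_caterpillar` and `is_graceful` are hypothetical helpers introduced only for this illustration.

```python
def graceful_caterpillar(spine_degrees):
    """spine_degrees[i-1] = deg(v_i) for the spine v_1, ..., v_k (v_1, v_k are leaves)."""
    k = len(spine_degrees)
    deg = {i + 1: d for i, d in enumerate(spine_degrees)}
    n = k + sum(deg[i] - 2 for i in range(2, k))       # total number of vertices
    label = {("v", 1): 1, ("v", 2): n}
    edges = [(("v", i), ("v", i + 1)) for i in range(1, k)]
    low, high = 2, n - 1
    for i in range(2, k):                              # i = 2, ..., k - 1
        leaves = deg[i] - 2                            # leaves of v_i off the spine
        if i % 2 == 0:                                 # i even: hand out low labels
            for j in range(leaves):
                label[("leaf", i, j)] = low + j
            label[("v", i + 1)] = low + deg[i] - 2
            low += deg[i] - 1
        else:                                          # i odd: hand out high labels
            for j in range(leaves):
                label[("leaf", i, j)] = high - j
            label[("v", i + 1)] = high - deg[i] + 2
            high -= deg[i] - 1
        edges += [(("v", i), ("leaf", i, j)) for j in range(leaves)]
    return n, label, edges

def is_graceful(n, label, edges):
    """Vertex labels are 1..n and edge labels |difference| are exactly 1..n-1."""
    vertex_ok = sorted(label.values()) == list(range(1, n + 1))
    diffs = sorted(abs(label[a] - label[b]) for a, b in edges)
    return vertex_ok and diffs == list(range(1, n))

print(is_graceful(*graceful_caterpillar([1, 4, 3, 2, 1])))
```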
19. a)   1, −1, 1, 1, −1, −1       1, 1, −1, 1, −1, −1       1, −1, 1, −1, 1, −1
    b) [Figure: ordered rooted trees on five vertices.]

In total there are 14 ordered rooted trees on five vertices.
                                    c) This is another example where the Catalan numbers arise. There are (1/(n + 1)) C(2n, n) ordered
                                  rooted trees on n + 1 vertices.
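
The Catalan count can be confirmed for small n. The sketch below assumes the usual encoding of an ordered rooted tree on n + 1 vertices as a sequence of n 1's and n −1's (1 for a step away from the root, −1 for a step back) whose partial sums never go negative; the function names are illustrative.

```python
from itertools import product
from math import comb

def count_balanced(n):
    """Count sequences of n 1's and n -1's with all partial sums >= 0."""
    count = 0
    for seq in product((1, -1), repeat=2 * n):
        total = 0
        for step in seq:
            total += step
            if total < 0:
                break
        else:
            count += (total == 0)   # the walk must return to the root
    return count

def catalan(n):
    return comb(2 * n, n) // (n + 1)

print(count_balanced(4), catalan(4))
```

For n = 4 both values are 14, matching the count of ordered rooted trees on five vertices.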
                              21. a) 8     b) 8      c) 4.83     d) 2(4-8*)     e) 2(n8")

Chapter 13
                 Optimization and Matching
Section 13.1-p. 638
                               1. a) If not, let v_i ∈ S̄, where 1 ≤ i ≤ m and i is the smallest such subscript. Then d(v_0, v_i) <
                                  d(v_0, v_{m+1}), and we contradict the choice of v_{m+1} as a vertex v in S̄ for which d(v_0, v) is a
                                  minimum.
                                  b) Suppose there is a shorter directed path (in G) from v_0 to v_k. If this path passes through a
                                  vertex in S̄, then from part (a) we have a contradiction. Otherwise, we have a shorter directed
                                  path P'' from v_0 to v_k, and P'' only passes through vertices in S. But then P'' ∪ {(v_k, v_{k+1}),
                                  (v_{k+1}, v_{k+2}), ..., (v_{m−1}, v_m), (v_m, v_{m+1})} is a directed path (in G) from v_0 to v_{m+1}, and it is
                                  shorter than path P.
                               3. a) d(a, b) = 5;    d(a, c) = 6;    d(a, f) = 12;    d(a, g) = 16;    d(a, h) = 12
                                  b) f: (a, c), (c, f)          g: (a, b), (b, h), (h, g)          h: (a, b), (b, h)
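
For readers who want to experiment, here is a minimal sketch of Dijkstra's shortest-path algorithm. The graph of the exercise is not reproduced in this answer key, so the edge weights below are hypothetical: they were chosen only so that the distances from a agree with part (a) and the shortest paths with part (b). The function name `dijkstra` is likewise an illustrative choice.

```python
import heapq

def dijkstra(graph, source):
    """Return (dist, pred) for shortest directed paths from source."""
    dist = {source: 0}
    pred = {}
    done = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                     # stale heap entry
        done.add(u)
        for v, w in graph.get(u, []):
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, pred

# Hypothetical weighted digraph (NOT the graph from the exercise).
graph = {
    "a": [("b", 5), ("c", 6)],
    "b": [("h", 7)],
    "c": [("f", 6)],
    "h": [("g", 4)],
}

dist, pred = dijkstra(graph, "a")
print(dist)
```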
                               5. False. Consider the following weighted graph.
                                  [Figure: weighted graph counterexample.]

Section 13.2—p. 643
                               1. Kruskal’s algorithm generates the following sequence (of forests), which terminates in a
                                  minimal spanning tree 7 of weight 18.

(1) F_1 = {{e, h}}               (2) F_2 = F_1 ∪ {{a, b}}         (3) F_3 = F_2 ∪ {{b, c}}
(4) F_4 = F_3 ∪ {{d, e}}         (5) F_5 = F_4 ∪ {{e, f}}         (6) F_6 = F_5 ∪ {{a, e}}
(7) F_7 = F_6 ∪ {{d, g}}         (8) F_8 = T = F_7 ∪ {{f, i}}

(This answer is not unique.)
                               3. No! Consider the following counterexample:
                                  [Figure: weighted graph on the vertices v, x, w.]
                                  Here V = {v, x, w}, E = {{v, x}, {x, w}, {v, w}}, and E' = {{v, x}, {x, w}}.
                               5. a) Evansville–Indianapolis (168); Bloomington–Indianapolis (51); South Bend–Gary (58);
                                  Terre Haute–Bloomington (58); South Bend–Fort Wayne (79); Indianapolis–Fort Wayne (121).
                                  b) Fort Wayne–Gary (132); Evansville–Indianapolis (168); Bloomington–Indianapolis (51);
                                  Gary–South Bend (58); Terre Haute–Bloomington (58); Indianapolis–Fort Wayne (121).
                               7. a) To determine an optimal tree of maximal weight, replace the two occurrences of “small” in
                                  Kruskal’s algorithm by “large.”
                                  b) Use the edges: South Bend–Evansville (303); Fort Wayne–Evansville (290);
                                  Gary–Evansville (277); Fort Wayne–Terre Haute (201); Gary–Bloomington (198);
                                  Indianapolis–Evansville (168).
                               9. When the weights of the edges are all distinct, in each step of Kruskal’s algorithm a unique edge
                                  is selected.

Section 13.3—p. 658
                      1. a) 4     b) 18
                         c) (i) P = {a, b, h, d, g, i}; P̄ = {z}     (ii) P = {a, b, h, d, g}; P̄ = {i, z}
                            (iii) P = {a, h}; P̄ = {b, d, g, i, z}
                      3. [Figure: transport networks (1) and (2), with each edge labeled by its capacity and flow.]
                         (1) The maximum flow is 32, which is c(P, P̄) for P = {a, b, d, g, h} and P̄ = {i, z}.
                         (2) The maximum flow is 23, which is c(P, P̄) for P = {a} and P̄ = {b, g, i, j, d, h, k, z}.

                      5. Here c(e) is a positive integer for each e ∈ E, and the initial flow is defined as f(e) = 0 for all
                         e ∈ E. The result follows because Δ_p is a positive integer for each application of the
                         Edmonds–Karp algorithm and in the Ford–Fulkerson algorithm, f(e) − Δ_p will not be negative
                         for a backward edge.
                      7. [Figure: transport network with each edge labeled by its capacity and flow.]

Section 13.4—p. 665
                      1. 5/C(8, 4) = 1/14
                      3. Let the committees be represented as c_1, c_2, ..., c_6, according to the way they are listed in the
                         exercise.
                         a) Select the members as follows: c_1 – A; c_2 – G; c_3 – M; c_4 – N; c_5 – K; c_6 – R.
                         b) Select the nonmembers as follows: c_1 – K; c_2 – A; c_3 – G; c_4 – S; c_5 – M; c_6 – P.
                      5. a) A one-factor for a graph G = (V, E) consists of edges that have no common vertex. So the
                         one-factor contains an even number of vertices, and since it spans G, we must have |V| even.
                         b) Consider the Petersen graph as shown in Fig. 11.52(a). The edges

                             {e, a}     {b, c}     {d, i}     {g, j}     {f, h}

                         provide a one-factor for this graph.
                         c) There are (5)(3) = 15 one-factors for K_6.
                         d) Label the vertices of K_{2n} with 1, 2, 3, ..., 2n − 1, 2n. We can pair vertex 1 with any of the
                         other 2n − 1 vertices, and we are then confronted, in the case where n ≥ 2, with finding a
                         one-factor for the graph K_{2n−2}. Consequently,

                             a_n = (2n − 1)a_{n−1},     a_1 = 1.

                         We find that

                             a_n = (2n − 1)a_{n−1} = (2n − 1)(2n − 3)a_{n−2} = (2n − 1)(2n − 3)(2n − 5)a_{n−3} = ···
                                 = (2n − 1)(2n − 3)(2n − 5) ··· (5)(3)(1)
                                 = [(2n)(2n − 1)(2n − 2)(2n − 3) ··· (4)(3)(2)(1)] / [(2n)(2n − 2) ··· (4)(2)]
                                 = (2n)! / (2^n (n!)).
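
The recurrence and the closed form for the number of one-factors of K_{2n} can be compared by direct enumeration. This is a small sketch; `count_one_factors` and `closed_form` are names made up for the illustration.

```python
from math import factorial

def count_one_factors(vertices):
    """Count perfect matchings of the complete graph on the given vertex list."""
    if not vertices:
        return 1
    rest = vertices[1:]
    total = 0
    for i in range(len(rest)):
        # Pair the first vertex with rest[i], then match the remaining vertices.
        remaining = rest[:i] + rest[i + 1:]
        total += count_one_factors(remaining)
    return total

def closed_form(n):
    return factorial(2 * n) // (2 ** n * factorial(n))

print([count_one_factors(list(range(2 * n))) for n in range(1, 6)])
print([closed_form(n) for n in range(1, 6)])
```

The value for n = 3 is 15, in agreement with part (c).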
                      7. Yes, such an assignment can be made by Fritz. Let X be the set of student applicants and Y the
                         set of part-time jobs. Then for all x € X, y € Y, draw the edge (x, y) if applicant x is qualified
                         for part-time job y. Then deg(x) ≥ 4 ≥ deg(y) for all x ∈ X, y ∈ Y, and the result follows from
                         Corollary 13.6.

                               9. a) (i) Select i from A_i for 1 ≤ i ≤ 4.
                                       (ii) Select i + 1 from A_i for 1 ≤ i ≤ 3, and 1 from A_4.
                                  b) 2
                              11. For each subset A of X, let G_A be the subgraph of G induced by the vertices in A ∪ R(A). If e
                                  is the number of edges in G_A, then e ≥ 4|A| because deg(a) ≥ 4 for all a ∈ A. Likewise,
                                  e ≤ 5|R(A)| because deg(b) ≤ 5 for all b ∈ R(A). So 5|R(A)| ≥ 4|A| and δ(A) = |A| − |R(A)|
                                  ≤ |A| − (4/5)|A| = (1/5)|A| ≤ (1/5)|X| = 2. Then since δ(G) = max{δ(A) | A ⊆ X}, we have
                                  δ(G) ≤ 2.
                              13. a) δ(G) = 1. A maximal matching of X into Y is given by {{x_1, y_4}, {x_2, y_2}, {x_3, y_1},
                                      {x_5, y_3}}.
                                      b) If δ(G) = 0, there is a complete matching of X into Y, and β(G) = |Y|, or |Y| =
                                      β(G) − δ(G). If δ(G) = k > 0, let A ⊆ X where |A| − |R(A)| = k. Then A ∪ (Y − R(A))
                                      is a largest maximal independent set in G and β(G) = |A| + |Y − R(A)| =
                                      |Y| + (|A| − |R(A)|) = |Y| + δ(G), so |Y| = β(G) − δ(G).
                                      c) Fig. 13.30(a): {x_1, x_2, x_3, y_2, y_4, y_5}; Fig. 13.32: {x_3, x_4, y_2, y_3, y_4}.

Supplementary Exercises—p. 669
                               1. d(a, b) = 5        d(a, c) = 11       d(a, d) = 7        d(a, e) = 8
                                  d(a, f) = 19       d(a, g) = 9        d(a, h) = 14
                                  [Note that the loop at vertex g and the edges (c, a) of weight 9 and (f, e) of weight 5 are of no
                                  significance.]
                               3. a) The edge e_1 will always be selected in the first step of Kruskal's algorithm.
                                  b) Again using Kruskal's algorithm, edge e_2 will be selected in the first application of step (2)
                                  unless each of the edges e_1, e_2 is incident with the same two vertices; that is, the edges e_1, e_2
                                  form a circuit and G is a multigraph.
                               5. There are d_n, the number of derangements of {1, 2, 3, ..., n}.
                               7. The vertices [in the line graph L(G)] determined by E' form a maximal independent set.

Chapter 14
                Rings and Modular Arithmetic
Section 14.1—p. 678
                               1. (Example 14.5): −a = a, −b = e, −c = d, −d = c, −e = b
                                  (Example 14.6): −s = s, −t = y, −v = x, −w = w, −x = v, −y = t
                               3. a) (a + b) + c = (b + a) + c              Commutative Law of +
                                                 = b + (a + c)              Associative Law of +
                                                 = b + (c + a)              Commutative Law of +
                                  b) d + a(b + c) = d + (ab + ac)           Distributive Law of · over +
                                                  = (d + ab) + ac           Associative Law of +
                                                  = (ab + d) + ac           Commutative Law of +
                                                  = ab + (d + ac)           Associative Law of +
                                  c) c(d + b) + ab = ab + c(d + b)          Commutative Law of +
                                                   = ab + (cd + cb)         Distributive Law of · over +
                                                   = ab + (cb + cd)         Commutative Law of +
                                                   = (ab + cb) + cd         Associative Law of +
                                                   = (a + c)b + cd          Distributive Law of · over +

5. a) (i) The closed binary operation ⊕ is associative. For all a, b, c ∈ Z we find that

          (a ⊕ b) ⊕ c = (a + b − 1) ⊕ c = (a + b − 1) + c − 1 = a + b + c − 2,

      and

          a ⊕ (b ⊕ c) = a ⊕ (b + c − 1) = a + (b + c − 1) − 1 = a + b + c − 2.

      (ii) For the closed binary operation ⊙ and all a, b, c ∈ Z, we have

          (a ⊙ b) ⊙ c = (a + b − ab) ⊙ c = (a + b − ab) + c − (a + b − ab)c
                      = a + b − ab + c − ac − bc + abc = a + b + c − ab − ac − bc + abc,
      and
          a ⊙ (b ⊙ c) = a ⊙ (b + c − bc) = a + (b + c − bc) − a(b + c − bc)
                      = a + b + c − bc − ab − ac + abc = a + b + c − ab − ac − bc + abc.

      Consequently, the closed binary operation ⊙ is also associative.
      (iii) Given any integers a, b, c, we find that

          (b ⊕ c) ⊙ a = (b + c − 1) ⊙ a = (b + c − 1) + a − (b + c − 1)a
                      = b + c − 1 + a − ba − ca + a = a + a + b + c − 1 − ba − ca,
      and
          (b ⊙ a) ⊕ (c ⊙ a) = (b + a − ba) ⊕ (c + a − ca)
                            = (b + a − ba) + (c + a − ca) − 1 = a + a + b + c − 1 − ba − ca.

      Therefore, the second distributive law holds.
   c) Aside from 0 the only other unit is 2, since 2 ⊙ 2 = 2 + 2 − (2·2) = 0, the unity for
   (Z, ⊕, ⊙).
   d) This ring is an integral domain, but not a field. For all a, b ∈ Z we see that a ⊙ b = 1 (the
   zero element) ⟹ a + b − ab = 1 ⟹ a(1 − b) = (1 − b) ⟹ (a − 1)(1 − b) = 0 ⟹ a = 1 or
   b = 1, so there are no proper divisors of zero in (Z, ⊕, ⊙).
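
The claims of Exercise 5 can be spot-checked over a finite window of integers. The sketch below reads the two operations off the computations above (a ⊕ b = a + b − 1 and a ⊙ b = a + b − ab); the function names `oplus` and `odot` and the sampled ranges are arbitrary choices, and a finite check of this kind is of course not a proof.

```python
from itertools import product

def oplus(a, b):
    return a + b - 1          # zero element is 1

def odot(a, b):
    return a + b - a * b      # unity is 0

sample = range(-6, 7)
triples = list(product(sample, repeat=3))

assoc_plus = all(oplus(oplus(a, b), c) == oplus(a, oplus(b, c)) for a, b, c in triples)
assoc_dot = all(odot(odot(a, b), c) == odot(a, odot(b, c)) for a, b, c in triples)
distributive = all(odot(oplus(b, c), a) == oplus(odot(b, a), odot(c, a)) for a, b, c in triples)

# Units: a is a unit when a (.) b equals the unity 0 for some b.
window = range(-50, 51)
units = [a for a in window if any(odot(a, b) == 0 for b in window)]

# Proper zero divisors: a, b both different from the zero element 1 with a (.) b = 1.
zero_divisor_pairs = [(a, b) for a in window for b in window
                      if a != 1 and b != 1 and odot(a, b) == 1]

print(assoc_plus, assoc_dot, distributive, units, zero_divisor_pairs)
```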
7. From the previous exercise we know that we need to determine the condition(s) on k, m for
   which the distributive laws will hold. Since ⊙ is commutative we can focus on just one of these
   laws.
        If x, y, z ∈ Z, then

          x ⊙ (y ⊕ z) = (x ⊙ y) ⊕ (x ⊙ z)
                  ⟹ x ⊙ (y + z − k) = (x + y − mxy) ⊕ (x + z − mxz)
                  ⟹ x + (y + z − k) − mx(y + z − k) = (x + y − mxy) + (x + z − mxz) − k
                  ⟹ x + y + z − k − mxy − mxz + mkx = x + y − mxy + x + z − mxz − k
                  ⟹ mkx = x ⟹ mk = 1 ⟹ m = k = 1 or m = k = −1, since m, k ∈ Z.

9. a) We shall verify one of the distributive laws. If a, b, c ∈ Q, then

          a ⊙ (b ⊕ c) = a ⊙ (b + c + 7)
                      = a + (b + c + 7) + [a(b + c + 7)]/7
                      = a + b + c + 7 + (ab/7) + (ac/7) + a,
   while
          (a ⊙ b) ⊕ (a ⊙ c) = (a ⊙ b) + (a ⊙ c) + 7
                            = a + b + (ab/7) + a + c + (ac/7) + 7
                            = a + b + c + 7 + (ab/7) + (ac/7) + a.
   Also, the rational number −7 is the zero element, and the additive inverse of each rational
   number a is −14 − a.

   c) For each a ∈ Q, a = a ⊙ u = a + u + (au/7) ⟹ u[1 + (a/7)] = 0 ⟹ u = 0, because a is
                             arbitrary. Hence the rational number 0 is the unity for this ring. Now let a ∈ Q, where a ≠ −7,
                             the zero element of the ring. Can we find b ∈ Q so that a ⊙ b = 0, that is, so that a + b +
                             (ab/7) = 0? It follows that a + b + (ab/7) = 0 ⟹ b(1 + (a/7)) = −a ⟹
                             b = (−a)/[1 + (a/7)]. Hence every rational number, other than −7, is a unit.
                         11. b) 1, −1, i, −i

13. [[a, b], [c, d]]⁻¹ = (1/(ad − bc)) [[d, −b], [−c, a]],     ad − bc ≠ 0
                         15. a) xx = x(t + y) = xt + xy = t + y = x
                                yt = (x + t)t = xt + tt = t + t = s
                                yy = y(t + x) = yt + yx = s + s = s
                                tx = (y + x)x = yx + xx = s + x = x
                                ty = (y + x)y = yy + xy = s + y = y
                                b) Since tx = x ≠ t = xt, this ring is not commutative.
                                c) There is no unity and, consequently, no units.
                                d) The ring is neither an integral domain nor a field.

Section 14.2—p. 684
                               1. Theorem 14.10(a). If (S, +, ·) is a subring of R, then a − b, ab ∈ S for all a, b ∈ S.
                                 Conversely, since S ≠ ∅, let a ∈ S. Then a − a = z ∈ S and z − a = −a ∈ S. Also, if b ∈ S,
                                 then −b ∈ S, so a − (−b) = a + b ∈ S, and S is a subring by Theorem 14.9.
                               3. a) (ab)(b⁻¹a⁻¹) = a(bb⁻¹)a⁻¹ = aua⁻¹ = aa⁻¹ = u and (b⁻¹a⁻¹)(ab) = b⁻¹(a⁻¹a)b =
                                 b⁻¹ub = b⁻¹b = u, so ab is a unit. Since the multiplicative inverse of a unit is unique,
                                 (ab)⁻¹ = b⁻¹a⁻¹.
                                           ff 2-7                        , fa.          -2                        fo 4 -15
                                bat=[                      1             B=    |)            |              By   =|      2
                                           -lo        16       —39             -la-le            4   —15~
                                    (BA)         [i                  |        pea        [_s           st
                               5. (−a)⁻¹ = −(a⁻¹)
                               7. z ∈ S, T ⟹ z ∈ S ∩ T ⟹ S ∩ T ≠ ∅. a, b ∈ S ∩ T ⟹ a, b ∈ S and a, b ∈ T ⟹ a + b, ab ∈ S
                                 and a + b, ab ∈ T ⟹ a + b, ab ∈ S ∩ T. a ∈ S ∩ T ⟹ a ∈ S and a ∈ T ⟹
                                 −a ∈ S and −a ∈ T ⟹ −a ∈ S ∩ T. So S ∩ T is a subring of R.
                               9. If not, there exist a, b ∈ S with a ∈ T_1, a ∉ T_2 and b ∈ T_2, b ∉ T_1. Since S is a subring of R, it
                                 follows that a + b ∈ S. Hence a + b ∈ T_1 or a + b ∈ T_2.
                                     Assume without loss of generality that a + b ∈ T_1. Since a ∈ T_1, we have −a ∈ T_1, so by the
                                 closure under addition in T_1 we now find that (−a) + (a + b) = (−a + a) + b = b ∈ T_1, a
                                 contradiction. Therefore, S ⊆ T_1 ∪ T_2 ⟹ S ⊆ T_1 or S ⊆ T_2.
                         11.
                                    loi] 9 [oo
                             d) S is an integral domain, while R is a noncommutative ring with unity.
                         13. Since za = z, it follows that z ∈ N(a) and N(a) ≠ ∅. If r_1, r_2 ∈ N(a), then (r_1 − r_2)a =
                             r_1a − r_2a = z − z = z, so r_1 − r_2 ∈ N(a). Finally, if r ∈ N(a) and s ∈ R, then (rs)a =
                             (sr)a = s(ra) = sz = z, so rs, sr ∈ N(a). Hence N(a) is an ideal, by Definition 14.6.
                         15. 2
                         17. a) a = au ∈ aR since u ∈ R, so aR ≠ ∅. If ar_1, ar_2 ∈ aR, then ar_1 − ar_2 = a(r_1 − r_2) ∈ aR.
                             Also, for ar_1 ∈ aR and r ∈ R, we have r(ar_1) = (ar_1)r = a(r_1r) ∈ aR. Hence aR is an ideal
                             of R.
                             b) Let a ∈ R, a ≠ z. Then a = au ∈ aR so aR = R. Since u ∈ R = aR, u = ar for some
                                 r ∈ R, and r = a⁻¹. Hence R is a field.
                         19. a) (5)(49)     b) 7^4     c) Yes, the element (u, u, u, u)     d) 44
                         21. b) If R has a unity u, define a^0 = u, for a ∈ R, a ≠ z. If a is a unit of R, define a^(−n) as (a⁻¹)^n,
                                 for n ∈ Z⁺.

Section 14.3—p. 696
                       1. a) (i) Yes     (ii) No      (iii) Yes     b) (i) No      (ii) Yes     (iii) Yes
                       3. a) −6, 1, 8, 15          b) −9, 2, 13, 24      c) −7, 10, 27, 44
                       5. Since a ≡ b (mod n), we may write a = b + kn for some k ∈ Z. And m|n ⟹ n = ℓm for some
                              ℓ ∈ Z. Consequently, a = b + kn = b + (kℓ)m and a ≡ b (mod m).
                       7. Let a = 8, b = 2, m = 6, and n = 2. Then gcd(m, n) = gcd(6, 2) = 2 > 1, a ≡ b (mod m) and
                              a ≡ b (mod n). But a − b = 8 − 2 = 6 ≠ k(12) = k(mn), for any k ∈ Z. Hence
                              a ≢ b (mod mn).
                       9. For n odd consider the n − 1 numbers 1, 2, 3, ..., n − 3, n − 2, n − 1 as (n − 1)/2 pairs: 1
                                 and (n − 1), 2 and (n − 2), 3 and (n − 3), ..., n − ((n − 1)/2) − 1 and n − ((n − 1)/2). The sum of
                                 each pair is n which is congruent to 0 modulo n. Hence Σ_{i=1}^{n−1} i ≡ 0 (mod n). When n is even
                                 we consider the n − 1 numbers 1, 2, 3, ..., (n/2) − 1, (n/2), (n/2) + 1, ..., n − 3, n − 2,
                                 n − 1 as (n/2) − 1 pairs, namely, 1 and n − 1, 2 and n − 2, 3 and n − 3, ..., (n/2) − 1 and
                                 (n/2) + 1, and the single number (n/2). For each pair the sum is n, or 0 modulo n, so
                                 Σ_{i=1}^{n−1} i ≡ (n/2) (mod n).
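
The two cases of Exercise 9 are easy to confirm numerically. In this short sketch the function name `sum_mod` is an illustrative choice.

```python
def sum_mod(n):
    """Return (1 + 2 + ... + (n - 1)) mod n."""
    return sum(range(1, n)) % n

# Odd n should give 0; even n should give n/2.
odd_ok = all(sum_mod(n) == 0 for n in range(1, 200, 2))
even_ok = all(sum_mod(n) == n // 2 for n in range(2, 200, 2))
print(odd_ok, even_ok)
```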
                      11. b) No, 2 ℛ 3 and 3 ℛ 5, but 5 ℛ̸ 8. Also, 2 ℛ 3 and 2 ℛ 5, but 4 ℛ̸ 15.
                      13. a) [17]⁻¹ = [831]      b) [100]⁻¹ = [111]      c) [777]⁻¹ = [735]
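
The inverses in Exercise 13 can be recomputed with the extended Euclidean algorithm. The modulus is not restated in this answer key; the value 1009 used below is an assumption, adopted because each of the three listed products leaves remainder 1 on division by 1009. The helper names are illustrative.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def inverse_mod(a, n):
    """Multiplicative inverse of [a] in Z_n, if a is a unit."""
    g, s, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a is not a unit modulo n")
    return s % n

MODULUS = 1009   # assumed modulus, see the note above
print([inverse_mod(a, MODULUS) for a in (17, 100, 777)])
```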
                      15. a) 16 units, 0 proper zero divisors     b) 72 units, 44 proper zero divisors
                          c) 1116 units, 0 proper zero divisors
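
Counts of this kind follow from the fact that a nonzero [a] in Z_n is a unit when gcd(a, n) = 1 and a proper zero divisor otherwise. The moduli are not restated here; 17, 117, and 1117 below are assumptions inferred from the totals (units + proper zero divisors + 1 = n), and the function name is hypothetical.

```python
from math import gcd

def units_and_zero_divisors(n):
    """Return (number of units, number of proper zero divisors) in Z_n."""
    units = sum(1 for a in range(1, n) if gcd(a, n) == 1)
    zero_divisors = (n - 1) - units   # nonzero elements that are not units
    return units, zero_divisors

for n in (17, 117, 1117):             # assumed moduli
    print(n, units_and_zero_divisors(n))
```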
                      17.        [e) + 2(33) + eeyery'|/           (1900)
                      19. a) For n = 0 we have 10^0 = 1 = 1(-1)^0, so 10^0 ≡ (-1)^0 (mod 11). [Since 10 - (-1) = 11,
                          10 ≡ (-1) (mod 11), or 10^1 ≡ (-1)^1 (mod 11). Hence the result is also true for n = 1.] Assume
                          the result true for n = k ≥ 1 and consider the case for k + 1. Then, since 10^k ≡ (-1)^k (mod 11)
                          and 10 ≡ (-1) (mod 11), we have 10^(k+1) = 10^k · 10 ≡ (-1)^k (-1) = (-1)^(k+1) (mod 11).
                          The result now follows for all n ∈ N, by the Principle of Mathematical Induction.
                          b) If x_n x_(n-1) ··· x_2 x_1 x_0 = x_n · 10^n + x_(n-1) · 10^(n-1) + ··· + x_2 · 10^2 + x_1 · 10 + x_0 denotes an
                          (n + 1)-digit integer, then

                              x_n x_(n-1) ··· x_2 x_1 x_0 ≡ (-1)^n x_n + (-1)^(n-1) x_(n-1) + ··· + x_2 - x_1 + x_0 (mod 11).

                          Proof:

                              x_n x_(n-1) ··· x_2 x_1 x_0 = x_n · 10^n + x_(n-1) · 10^(n-1) + ··· + x_2 · 10^2 + x_1 · 10 + x_0
                                  ≡ x_n (-1)^n + x_(n-1) (-1)^(n-1) + ··· + x_2 (-1)^2 + x_1 (-1) + x_0
                                  = (-1)^n x_n + (-1)^(n-1) x_(n-1) + ··· + x_2 - x_1 + x_0 (mod 11).
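The congruence in part (b) is the usual divisibility test for 11. A short sketch, with the hypothetical helper `alternating_digit_sum`, compares the alternating digit sum with the integer itself modulo 11:

```python
def alternating_digit_sum(x: int) -> int:
    """Return x_0 - x_1 + x_2 - ... for the decimal digits x_0 (units), x_1, ..."""
    digits = [int(d) for d in reversed(str(x))]
    return sum(d if i % 2 == 0 else -d for i, d in enumerate(digits))


# Python's % always returns a non-negative residue, so the two sides compare directly.
for x in range(0, 20000):
    assert x % 11 == alternating_digit_sum(x) % 11
```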
                      21. Let g = gcd(a, n), h = gcd(b, n). [a ≡ b (mod n)] => [a = b + kn, for some k ∈ Z] => [g|b
                          and h|a]. [g|b and g|n] => g|h; [h|a and h|n] => h|g. Since g, h > 0, it follows that g = h.
                      23. (1) Plaintext    a   l   l   g   a   u   l   i   s   d   i   v   i   d   e   d
                          (2)              0  11  11   6   0  20  11   8  18   3   8  21   8   3   4   3
                          (3)              3  14  14   9   3  23  14  11  21   6  11  24  11   6   7   6
                          (4) Ciphertext   D   O   O   J   D   X   O   L   V   G   L   Y   L   G   H   G

                          (1) Plaintext    i   n   t   o   t   h   r   e   e   p   a   r   t   s
                          (2)              8  13  19  14  19   7  17   4   4  15   0  17  19  18
                          (3)             11  16  22  17  22  10  20   7   7  18   3  20  22  21
                          (4) Ciphertext   L   Q   W   R   W   K   U   H   H   S   D   U   W   V

                          For each θ in row (2), the corresponding result below it in row (3) is (θ + 3) mod 26.
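The table follows the shift rule θ -> (θ + 3) mod 26. As a minimal sketch (the function name `caesar_encrypt` is an illustrative choice, not from the text), the same encryption can be written as:

```python
def caesar_encrypt(plaintext: str, shift: int = 3) -> str:
    """Shift each letter a..z by `shift` positions (mod 26); output is upper case."""
    out = []
    for ch in plaintext.lower():
        if "a" <= ch <= "z":
            theta = ord(ch) - ord("a")            # row (2): a -> 0, ..., z -> 25
            shifted = (theta + shift) % 26        # row (3)
            out.append(chr(shifted + ord("A")))   # row (4)
        # characters other than a..z (spaces, punctuation) are dropped
    return "".join(out)


print(caesar_encrypt("all gaul is divided into three parts"))
```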
                      25. a) (24)(8) = 192      b) (25)(20) = 500      c) (27)(18) = 486      d) (30)(8) = 240
                      27. a) 9      b) 10, 15, 2, 13, 11, 1, 8, 5, 9
                      29. Proof (by Mathematical Induction):
                                 [Note that for n ≥ 1, (a^n - 1)/(a - 1) = a^(n-1) + a^(n-2) + ··· + 1, which can be computed in the
                                 ring (Z, +, ·).]
                                 When n = 0, a^0 x_0 + c[(a^0 - 1)/(a - 1)] = x_0 + c[0/(a - 1)] ≡ x_0 (mod m), so the
                                 formula is true in this first basis (n = 0) case. Assuming the result for n (≥ 0) we have
                                 x_n ≡ a^n x_0 + c[(a^n - 1)/(a - 1)] (mod m), 0 ≤ x_n < m. Continuing to the next case, we learn
                                 that

                                     x_(n+1) ≡ a x_n + c (mod m)
                                             ≡ a[a^n x_0 + c[(a^n - 1)/(a - 1)]] + c (mod m)
                                             ≡ a^(n+1) x_0 + ac[(a^n - 1)/(a - 1)] + c(a - 1)/(a - 1) (mod m)
                                             ≡ a^(n+1) x_0 + c[(a^(n+1) - a + a - 1)/(a - 1)] (mod m)
                                             ≡ a^(n+1) x_0 + c[(a^(n+1) - 1)/(a - 1)] (mod m)

                                 and we select x_(n+1) so that 0 ≤ x_(n+1) < m. It now follows by the Principle of Mathematical
                                 Induction that

                                     x_n ≡ a^n x_0 + c[(a^n - 1)/(a - 1)] (mod m),     0 ≤ x_n < m.
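The closed form can be compared with direct iteration of the recurrence. This is only a sketch under stated assumptions: the parameter values are arbitrary sample choices, the function names are hypothetical, and a ≥ 2 is assumed so that the integer quotient (a^n - 1)/(a - 1) is defined.

```python
def lcg_iterate(a: int, c: int, m: int, x0: int, n: int) -> int:
    """Apply x -> (a*x + c) mod m exactly n times, starting from x0."""
    x = x0 % m
    for _ in range(n):
        x = (a * x + c) % m
    return x


def lcg_closed_form(a: int, c: int, m: int, x0: int, n: int) -> int:
    """Evaluate a^n x0 + c (a^n - 1)/(a - 1) in Z, then reduce mod m (needs a >= 2)."""
    geometric = (a**n - 1) // (a - 1)  # exact: equals a^(n-1) + ... + a + 1
    return (a**n * x0 + c * geometric) % m


# Sample parameters chosen for illustration only.
for n in range(0, 30):
    assert lcg_iterate(5, 3, 16, 7, n) == lcg_closed_form(5, 3, 16, 7, n)
```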

31. Proof: Let n, n + 1, and n + 2 be three consecutive integers. Then n^3 + (n + 1)^3 + (n + 2)^3 =
                             n^3 + (n^3 + 3n^2 + 3n + 1) + (n^3 + 6n^2 + 12n + 8) = (3n^3 + 15n) + 9(n^2 + 1). So we
                             consider 3n^3 + 15n = 3n(n^2 + 5). If 3|n, then we are finished. If not, then n ≡ 1 (mod 3) or
                                n ≡ 2 (mod 3). If n ≡ 1 (mod 3), then n^2 + 5 ≡ 1 + 5 ≡ 0 (mod 3), so 3|(n^2 + 5). If
                                n ≡ 2 (mod 3), then n^2 + 5 ≡ 9 ≡ 0 (mod 3), and 3|(n^2 + 5). All cases are now covered, so we
                                have 3|[n(n^2 + 5)]. Hence 9|[3n(n^2 + 5)] and, consequently, 9 divides (3n^3 + 15n) +
                                9(n^2 + 1) = n^3 + (n + 1)^3 + (n + 2)^3.
                         33.    Σ_(k=0)^(n-1) p(k(n + 1), n, n) = (1/(n + 1)) C(2n, n), the nth Catalan number
                         35. a)        112      b) 031-43-3464
                         37. a)        1, 28, 14, 34, 2, 3 (= 2 + 1), 15 (= 14 + 1), 4 (= 3 + 1)      b) 1, 2, 3, 4, 5

Section 14.4—p. 704
                               . s -> 0, t -> 1, v -> 2, w -> 3, x -> 4, y -> 5
                          3. Let (R, +, ·), (S, ⊕, ⊙), and (T, +', ·') be the rings. For all a, b ∈ R, (g ∘ f)(a + b) =
                                g(f(a + b)) = g(f(a) ⊕ f(b)) = g(f(a)) +' g(f(b)) = (g ∘ f)(a) +' (g ∘ f)(b). Also,
                                (g ∘ f)(a · b) = g(f(a · b)) = g(f(a) ⊙ f(b)) = g(f(a)) ·' g(f(b)) =
                                 (g ∘ f)(a) ·' (g ∘ f)(b). Hence g ∘ f is a ring homomorphism.
                               . a) Since f(z_R) = z_S, it follows that z_R ∈ K and K ≠ ∅. If x, y ∈ K, then f(x - y) =
                                f(x + (-y)) = f(x) ⊕ f(-y) = f(x) ⊖ f(y) = z_S ⊖ z_S = z_S, so x - y ∈ K. Finally, if
                                 x ∈ K and r ∈ R, then f(rx) = f(r) ⊙ f(x) = f(r) ⊙ z_S = z_S, and f(xr) = f(x) ⊙ f(r) =
                                 z_S ⊙ f(r) = z_S, so rx, xr ∈ K. Consequently, K is an ideal of R.
                                 b) The kernel is {6n | n ∈ Z}.
                               . a)
                                         x (in Z_20) | f(x) (in Z_4 × Z_5) | x (in Z_20) | f(x) (in Z_4 × Z_5)
                                              0      |       (0, 0)        |     10      |       (2, 0)
                                              1      |       (1, 1)        |     11      |       (3, 1)
                                              2      |       (2, 2)        |     12      |       (0, 2)
                                              3      |       (3, 3)        |     13      |       (1, 3)
                                              4      |       (0, 4)        |     14      |       (2, 4)
                                              5      |       (1, 0)        |     15      |       (3, 0)
                                              6      |       (2, 1)        |     16      |       (0, 1)
                                              7      |       (3, 2)        |     17      |       (1, 2)
                                              8      |       (0, 3)        |     18      |       (2, 3)
                                              9      |       (1, 4)        |     19      |       (3, 4)

b) (i) f((17)(19) + (12)(14)) = (1, 2)(3, 4) + (0, 2)(2, 4) = (3, 3) + (0, 3) = (3, 1), and
                                          f^(-1)(3, 1) = 11.
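The table and the computation in part (b)(i) are consistent with the map f(x) = (x mod 4, x mod 5). The sketch below assumes that map (it is read off from the table entries, not quoted from the exercise) and checks that computing in Z_4 × Z_5 and pulling back agrees with computing directly in Z_20.

```python
def f(x: int) -> tuple[int, int]:
    """Assumed map Z_20 -> Z_4 x Z_5, matching the table entries."""
    return (x % 4, x % 5)


def add(u, v):
    """Componentwise addition in Z_4 x Z_5."""
    return ((u[0] + v[0]) % 4, (u[1] + v[1]) % 5)


def mul(u, v):
    """Componentwise multiplication in Z_4 x Z_5."""
    return ((u[0] * v[0]) % 4, (u[1] * v[1]) % 5)


f_inverse = {f(x): x for x in range(20)}  # f is one-to-one and onto, so this is well defined

image = add(mul(f(17), f(19)), mul(f(12), f(14)))
print(image, f_inverse[image], (17 * 19 + 12 * 14) % 20)
```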
                                9. a) 4    b) 1    c) No
                              11. No. Z, has two units, while the ring in Example 14.4 has only one unit.
                              13. 397 + k(648), k ∈ Z              15. 173 + k(210), k ∈ Z

Supplementary
Exercises—p. 708               1. a) False. Let R = Zand$=Z*.                  b) False. Let R = Zand S = {2x|x € Z}.
                                     c) False. Let R = M2(Z) and S =          ls    00 jaca].      d) True.

e) False. The ring (Z, +, ·) is a subring (but not a field) in (Q, +, ·).
                                     f) False. For any prime p, {a/(p^n) | a, n ∈ Z, n ≥ 0} is a subring in (Q, +, ·).
                                     g) False. Consider the field in Table 14.6.    h) True.
                                   . a)   [a + a = (a + a)^2 = a^2 + a^2 + a^2 + a^2 = (a + a) + (a + a)] => [a + a = 2a = z].
                                     Hence -a = a.
                                     b) For each a ∈ R, a + a = z => a = -a. For a, b ∈ R, (a + b) = (a + b)^2 =
                                     a^2 + ab + ba + b^2 = a + ab + ba + b => ab + ba = z => ab = -ba = ba, so R is
                                     commutative.
                                   . Since az = z = za for all a ∈ R, we have z ∈ C and C ≠ ∅. If x, y ∈ C, then
                                     (x + y)a = xa + ya = ax + ay = a(x + y), (xy)a = x(ya) = x(ay) = (xa)y = (ax)y =
                                     a(xy), and (-x)a = -(xa) = -(ax) = a(-x), for all a ∈ R, so x + y, xy, -x ∈ C.
                                     Consequently, C is a subring of R.
                                   . b)   Since m, n are relatively prime, we can write 1 = ms + nt where s, t ∈ Z. With m, n > 0 it
                                     follows that one of s, t must be positive, and the other negative. Assume (without any loss of
                                     generality) that s is negative so that 1 - ms = nt > 0.
                                          Then a^n = b^n => (a^n)^t = (b^n)^t => a^(nt) = b^(nt) => a^(1-ms) = b^(1-ms) => a(a^m)^(-s) = b(b^m)^(-s). But
                                     with -s > 0 and a^m = b^m, we have (a^m)^(-s) = (b^m)^(-s). Consequently,

                                          [(a^m)^(-s) = (b^m)^(-s) ≠ z and a(a^m)^(-s) = b(b^m)^(-s)] => a = b,

                                     since we may use the Cancellation Law of Multiplication in an integral domain.
                                   . Let x = a_1 + b_1, y = a_2 + b_2, for a_1, a_2 ∈ A and b_1, b_2 ∈ B. Then x - y = (a_1 - a_2) +
                                  (b_1 - b_2) ∈ A + B. If r ∈ R and a + b ∈ A + B, with a ∈ A and b ∈ B, then ra ∈ A, rb ∈ B,
                                  and r(a + b) ∈ A + B. Similarly, (a + b)r ∈ A + B, and A + B is an ideal of R.
                              11. Consider the numbers x_1, x_1 + x_2, x_1 + x_2 + x_3, ..., x_1 + x_2 + x_3 + ··· + x_n. If one of these
                                     numbers is congruent to 0 modulo n, the result follows. If not, there exist 1 ≤ i < j ≤ n with
                                     (x_1 + x_2 + ··· + x_i) ≡ (x_1 + ··· + x_i + x_(i+1) + ··· + x_j) (mod n). Hence n divides
                                     (x_(i+1) + ··· + x_j).
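The pigeonhole argument is constructive, and a short sketch can locate such a block of consecutive terms. The function name `zero_block` is hypothetical; treating the empty prefix as having residue 0 merges the two cases of the proof into one lookup.

```python
def zero_block(xs: list[int]) -> tuple[int, int]:
    """For a non-empty list xs of n integers, return (i, j) with 0 <= i < j <= n
    such that xs[i] + ... + xs[j-1] is divisible by n."""
    n = len(xs)
    seen = {0: 0}  # residue of a prefix sum -> number of terms in that prefix
    total = 0
    for j, x in enumerate(xs, start=1):
        total = (total + x) % n
        if total in seen:      # two prefix sums agree mod n
            return seen[total], j
        seen[total] = j
    raise AssertionError("unreachable for non-empty input, by the pigeonhole principle")


xs = [3, 7, 5, 9, 4]
i, j = zero_block(xs)
print(xs[i:j], sum(xs[i:j]))
```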
                              13. a) 1875      b) 2914      c) 3/16
                              15. Proof: For all n ∈ Z we find that n^2 ≡ 0 (mod 5) (when 5|n), or n^2 ≡ 1 (mod 5), or
                                     n^2 ≡ 4 (mod 5). Suppose that 5 does not divide any of a, b, or c. Then
                                       (i) a^2 + b^2 + c^2 ≡ 3 (mod 5), when a^2 ≡ b^2 ≡ c^2 ≡ 1 (mod 5);
                                      (ii) a^2 + b^2 + c^2 ≡ 1 (mod 5), when each of two of a^2, b^2, c^2 is congruent to 1 modulo 5
                                            and the other square is congruent to 4 modulo 5;
                                     (iii) a^2 + b^2 + c^2 ≡ 4 (mod 5), when one of a^2, b^2, c^2 is congruent to 1 modulo 5 and each of
                                            the other two squares is congruent to 4 modulo 5; or
                                      (iv) a^2 + b^2 + c^2 ≡ 2 (mod 5), when a^2 ≡ b^2 ≡ c^2 ≡ 4 (mod 5).
                              17. (e_1 + 1)(e_2 + 1) ··· (e_k + 1) - 1

Chapter 15
          Boolean Algebra and Switching Functions
Section 15.1—p. 718
                               1. a) 1    b) 1    c) 1    d) 1       3. a) 2^n                     b)   2^(2^n)
                              5,  a) dnf.   XYZ tXVZAXVZT
                                                        + XYZ + XYZ
                                          c.nf.    @&+yt+2e+y+Da+y¥t+zZ)
b) f = Σ m(2, 4, 5, 6, 7) = Π M(0, 1, 3)
                           . a) 2%             =) 2°)               28
                           ~mtk=2"                       ll. a)          y+uz%                   Db x+y     Cc)       wxetz
                         13. a)         (i)
                                               f  g  h | fg | f̄h | gh | fg + f̄h + gh | fg + f̄h
                                               0  0  0 |  0 |  0 |  0 |      0       |    0
                                               0  0  1 |  0 |  1 |  0 |      1       |    1
                                               0  1  0 |  0 |  0 |  0 |      0       |    0
                                               0  1  1 |  0 |  1 |  1 |      1       |    1
                                               1  0  0 |  0 |  0 |  0 |      0       |    0
                                               1  0  1 |  0 |  0 |  0 |      0       |    0
                                               1  1  0 |  1 |  0 |  0 |      1       |    1
                                               1  1  1 |  1 |  0 |  1 |      1       |    1

Alternatively, fg + f̄h = (fg + f̄)(fg + h) = (f + f̄)(g + f̄)(fg + h) =
                               1(g + f̄)(fg + h) = fgg + gh + f̄fg + f̄h = fg + gh + 0g + f̄h = fg + gh + f̄h.
                               (ii) fg + fḡ + f̄g + f̄ḡ = f(g + ḡ) + f̄(g + ḡ) = f · 1 + f̄ · 1 = f + f̄ = 1
                             b @ (f+a(f+herh)=(f+af         +h)
                                       Gi) (f+ 9               +OF+9F +H =0
                         15. a         fef=0;             fOof=1                                fel=f;      feo-f
                             b) @ f@g=0                            fE+ fg =0= fE= fg =0.[f = land fz=0|=>¢=1.
                                              [f =Oand fg = 0] =                       g = 0. Hence f = g.
                                       ai) fegs=fe+fs=fetfe=fe+fe=feeg
                                       (iv) This is the only result that is not true. When f has value 1, g has value 0 and h value 1
                                            (or g has value 1 and h value 0), then f ⊕ gh has value 1 but (f ⊕ g)(f ⊕ h) has
                                            value 0.
                                       (v) fe ® fh = fefh+ fefh =(F+afh+ feF +h) =
                                              ffA+ feht+ ffet fah= feh+ feh= fh + gh) = f(g @h)
                                       (vi) fOg=fer+fs=fs+fe=feog
                                            fes=fst+fe=(f+stf+a=fe+fe=f
                                                                      eg
Section 15.2—p. 727
                           -a)x@y=(e+y)GyY)                                      y——
                                                                                                                              x@y
                                                                             x   —

b) xy                 TD)             [>                      wy     Oxy         ;           >   oy

» f(w, x,y, 2) = wxyzt(w+xt+y)z
[Figure: the simpler equivalent networks, parts (a) and (b)]

a) The output is (x + y)(x + ȳ) + y. This simplifies to x + (yȳ) + y = x + 0 + y = x + y,
                             and provides us with the simpler equivalent network in part (a) of the figure.
b) Here the output is (x + ¥) + @      ¥+ y), which simplifies tox ¥+X7+y =
                                xXy+xyt+y=xt+y)+y=x0)+y =X +4 vy. This accounts for the simpler
                                equivalent network in part (b) of the figure.
                            ~ a) f(w.x,yy=xy+xy       Db) f(w.x.y)=x    c) f(w, x,y,z) = xXZ4-
                                                                                             XZ
                               d) f(w, x,y,z) =wyZ+xyztwyzt+xyzZ     e) f(w,x, ¥,z) = wy + wxz  + xyz
                          f) f(Qv, wi. x,y, 2) = VWXYZ+ vwXZ
                                                         + UXVZ + WZ + UWy + Vz
                      11. a})2     b>3      cf 4   ak 1
                      13. a) |f^(-1)(0)| = |f^(-1)(1)| = 8      b) |f^(-1)(0)| = 12, |f^(-1)(1)| = 4
                                c) |f^(-1)(0)| = 14, |f^(-1)(1)| = 2      d) |f^(-1)(0)| = 4, |f^(-1)(1)| = 12

Section 15.3—p. 733
                            © HUE          WY     TUXZ        +UVZ         + WZ
                      —"

a) f(w,x,y,z)=z                             b) f(w,x,y, 2) =xXVT+xy2+
                                                                                            + xy
                                                                                                      xyz
                                                              Z) =   vyZ+Wxyz                 + DWE
                                Cc)   fr,       wi x,y,

.    {b, d},    {c,   d},   {d,   th,    {a,    8}.   {e,   th,    {b,   e},   {c,   e},   {a,   th.   {b,   2},    {e,   g}

Section 15.4—p. 741
                            . a)      30      b) 30      c) 1      d) 21      e) 30      f) 70
                            . a)      w ≤ 0 => w · 0 = w. But w · 0 = 0, by part (a) of Theorem 15.3.
                                c) y ≤ z => yz = y, and y ≤ z̄ => yz̄ = y. Therefore, y = yz = (yz̄)z = y(z̄z) = y · 0 = 0.
                            ~VYS<Kx
                            . From Theorem 15.5(a), with x_1, x_2 distinct atoms, if x_1 x_2 ≠ 0, then x_1 = x_1 x_2 = x_2 x_1 = x_2,
                                a contradiction.
                      11. a) f(0) = f(xx) for eachx € B). f(xx) = fx) fH) = f@)f)                                                                     = 0.
                                b)    Follows from part (a) by duality.
                                Cc) x<yeexyaxs                             fay) = fas                      fas)              = fa) =            fa) < fQ)
                      13.       a) faxy)= f+ = fF                                             =fOTIO =F@-FH =
                                f£®)- fH =f): £6)
                          b) Let 28), 2B» be Boolean algebras with f: 9%, —* %> one-to-one and onto. Then f is an
                          isomorphism if f(x) = f(x) and f(xy) = f(x) f(Q) for all x, y € B,. [Follows from part (a)
                          by duality.]
                      15. For all 1 <i <n, (x; +x. +-++ 4-4,)x, = HX, + XOX, ee tH, X, £ HX, +X 1X, +
                                 seo tx,x,        =04+04+---4+04%;                            +0+---+0                 = x,, by part
                                                                                                                                   (b) of Theorem                   15.5.
                                Consequently, it follows from Theorem                                15.7 that (x; + x2 +--+                   +.24%,)x = x forall x eR.
                                Since the one element is unique (from Exercise 10), we conclude that 1 = x; +x. +---+X,.

Supplementary
Exercises —p. 743           . a) Whenzn   = 2, x; + x» denotes the Boolean sum of x, and x». For n > 2, we define
                              Xy + XQ +++ +4 Xn + Xn41 recursively by (x; + x2 +--+ +X,) + Xn41. (A similar definition
                              can be given for the Boolean product.) For n = 2, x; + x» = X)X2 is true; this is one of the
                              DeMorgan Laws. Assume the result for n = k (> 2) and consider the case of nm =k 4+ 1.

(xp     x2 te             + Xe + Xe41) = CH +2 +                                 + Xe) + XH
                                                                                                                 = (KX) + Xo ++
                                                                                                                              + XK) XG
                                                                                                                 = Xp X20
                                                                                                                        XK XK
                              Consequently, the result follows for all n > 2, by the Principle of Mathematical Induction.
                              b) Follows from part (a) by duality.
                            . She can invite only Nettie and Cathy.
                            . If x ≤ z and y ≤ z, then from Exercise 6(b) of Section 15.4 we have x + y ≤ z + z. And by the
                              Idempotent Law we have z + z = z. Conversely, suppose that x + y ≤ z. We find that
                              x ≤ x + y, because x(x + y) = x + xy (by the Idempotent Law) = x (by the Absorption Law).
                              Since x ≤ x + y and x + y ≤ z, we have x ≤ z, because a partial order is transitive. (The proof
                              that y ≤ z follows in a similar way.)
~-apx<yox4¢x<y4exe>o>l<yt+x>o>yv4+X=x4+y
                                                                = 1. Conversely,
                                X¥+ty=lox«®t+y)=x-laxx(=ODt+xy
                                                     =x >SpxyH=xoax<y.
                                b) x <V¥oxVH=x
                                            SB xy = OY)y = xy) = x -0=0. Conversely,
                                xy=O5x=x-l=x(yt+y)=H=xytxy=xyands       =xy ox <y.
                             9% a) f(w,x,y,z)=wxt+xy                   Dd) gv, w, x,y,z) = VwWyz+xz+wyzT+xyZ
                            1. a) 22°)              24; 2"!
                            13. a) If n = 60, there are 12 divisors, and no Boolean algebra contains 12 elements since 12 is not
                                a power of 2.
                                b) If n = 120, there are 16 divisors. However, if x = 4, then x̄ = 30 and x · x̄ = gcd(x, x̄) =
                                gcd(4, 30) = 2, which is not the zero element. So the Inverse Laws are not satisfied.

Chapter 16
Groups, Coding Theory, and Polya’s Method of Enumeration
Section 16.1 —p. 751
                              . a) Yes. The identity is 1 and each element is its own inverse.
                                b) No. The set is not closed under addition and there is no identity.
                                c) No. The set is not closed under addition.
                                d) Yes. The identity is 0; the inverse of 10n is 10(-n) or -10n.
                                e) Yes. The identity is 1_A and the inverse of g: A → A is g^(-1): A → A.
                                f) Yes. The identity is 0; the inverse of a/(2^n) is (-a)/(2^n).
                              . Subtraction is not an associative (closed) binary operation for Z. For example, (3 - 2) - 4 =
                                 -3 ≠ 5 = 3 - (2 - 4).
                              . Since x, y ∈ Z => x + y + 1 ∈ Z, the operation is a closed binary operation (or Z is closed
                                under ∘). For all w, x, y ∈ Z, w ∘ (x ∘ y) = w ∘ (x + y + 1) = w + (x + y + 1) + 1 =
                                (w + x + 1) + y + 1 = (w ∘ x) ∘ y, so the binary operation is associative. Furthermore,
                                x ∘ y = x + y + 1 = y + x + 1 = y ∘ x, for all x, y ∈ Z, so ∘ is also commutative. If x ∈ Z,
                                then x ∘ (-1) = x + (-1) + 1 = x [= (-1) ∘ x], so -1 is the identity element for ∘. And
                                finally, for each x ∈ Z, we have -x - 2 ∈ Z and x ∘ (-x - 2) = x + (-x - 2) + 1 =
                                -1 [= (-x - 2) ∘ x], so -x - 2 is the inverse for x under ∘. Consequently, (Z, ∘) is an abelian
                                group.
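The group axioms for x ∘ y = x + y + 1 can be spot-checked on a finite sample of integers. This is only a numerical sanity check on a sample range chosen here, not a substitute for the proof; the names `op`, `inv`, and `IDENTITY` are illustrative.

```python
from itertools import product


def op(x: int, y: int) -> int:
    """The binary operation x ∘ y = x + y + 1 on Z."""
    return x + y + 1


IDENTITY = -1


def inv(x: int) -> int:
    """Claimed inverse of x under ∘."""
    return -x - 2


sample = range(-10, 11)  # arbitrary finite sample of Z
assert all(op(w, op(x, y)) == op(op(w, x), y) for w, x, y in product(sample, repeat=3))
assert all(op(x, y) == op(y, x) for x, y in product(sample, repeat=2))
assert all(op(x, IDENTITY) == x == op(IDENTITY, x) for x in sample)
assert all(op(x, inv(x)) == IDENTITY == op(inv(x), x) for x in sample)
```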
                              . U_20 = {1, 3, 7, 9, 11, 13, 17, 19}    U_24 = {1, 5, 7, 11, 13, 17, 19, 23}
                              . a) The result follows from Theorem 16.1(b) because both (a^(-1))^(-1) and a are inverses of a^(-1).
                                b)
                                                         (b^(-1)a^(-1))(ab) = b^(-1)(a^(-1)a)b = b^(-1)(e)b = b^(-1)b = e and
                                                         (ab)(b^(-1)a^(-1)) = a(bb^(-1))a^(-1) = a(e)a^(-1) = aa^(-1) = e.

                                So b^(-1)a^(-1) is an inverse of ab, and by Theorem 16.1(b), (ab)^(-1) = b^(-1)a^(-1).
                            11. a)   {0}; {0, 6}; {0, 4, 8}; {0, 3, 6, 9}; {0, 2, 4, 6, 8, 10}; Z_12
                                b) {1}; {1, 10}; {1, 3, 4, 5, 9}; Z_11^*
                                c) {π_0}; {π_0, π_1, π_2}; {π_0, r_1}; {π_0, r_2}; {π_0, r_3}; S_3
                            13. a) There are 10: five rotations through i(72°), 0 ≤ i ≤ 4, and five reflections about lines
                                containing a vertex and the midpoint of the opposite side.
                                b) For a regular n-gon (n ≥ 3) there are 2n rigid motions. There are the n rotations through
                                i(360°/n), 0 ≤ i ≤ n - 1. There are n reflections. For n odd, each reflection is about a line
                                through a vertex and the midpoint of the opposite side. For n even, there are n/2 reflections
                                about lines through opposite vertices and n/2 reflections about lines through the midpoints of
                                opposite sides.
                            15. Since eg = ge for all g ∈ G, it follows that e ∈ H and H ≠ ∅. If x, y ∈ H, then xg = gx and
                                yg = gy for all g ∈ G. Consequently, (xy)g = x(yg) = x(gy) = (xg)y = (gx)y = g(xy) for
                                all g ∈ G, and we have xy ∈ H. Finally, for each x ∈ H, g ∈ G, xg^(-1) = g^(-1)x. So
                                (xg^(-1))^(-1) = (g^(-1)x)^(-1), or gx^(-1) = x^(-1)g, and x^(-1) ∈ H. Therefore, H is a subgroup of G.
                            17. b)    (i) 216
                                   (ii)   A_1 = {(x, 0, 0) | x ∈ Z_6} is a subgroup of order 6
                                          A_2 = {(x, y, 0) | x, y ∈ Z_6, y = 0, 3} is a subgroup of order 12
                                          A_3 = {(x, y, 0) | x, y ∈ Z_6} has order 36
                                   (iii) —(2, 3, 4) = (4, 3, 2): -(4, 0, 2) = (2, 0, 4); -G, 1, 2) =       5,4)
                      19. a) x = 1, x = 4             b) x = 1, x = 10
                          c) x = x^(-1) => x^2 ≡ 1 (mod p) => x^2 - 1 ≡ 0 (mod p) => (x - 1)(x + 1) ≡ 0 (mod p) =>
                          x - 1 ≡ 0 (mod p) or x + 1 ≡ 0 (mod p) => x ≡ 1 (mod p) or x ≡ -1 ≡ p - 1 (mod p).
                          d) The result is true for p = 2, since (2 - 1)! = 1! ≡ -1 (mod 2). For p ≥ 3, consider the
                          elements 1, 2, ..., p - 1 in (Z_p^*, ·). The elements 2, 3, ..., p - 2 yield (p - 3)/2 pairs of the
                          form x, x^(-1). (For example, when p = 11 we find that 2, 3, 4, ..., 9 yield the four pairs 2, 6;
                              3, 4; 5, 9; 7, 8.) Consequently, (p - 1)! ≡ (1)(1)^((p-3)/2)(p - 1) ≡ p - 1 ≡ -1 (mod p).
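Parts (a) through (d) can be reproduced for small primes. The sketch below uses Python's three-argument `pow` with exponent -1 (available from Python 3.8) to compute inverses; the helper names are hypothetical.

```python
from math import factorial


def self_inverse(p: int) -> list[int]:
    """Elements x of Z_p^* with x = x^(-1), i.e. x^2 ≡ 1 (mod p)."""
    return [x for x in range(1, p) if (x * x) % p == 1]


def inverse_pairs(p: int) -> list[tuple[int, int]]:
    """Pair each x in 2..p-2 with its inverse, listing each pair once."""
    pairs = []
    for x in range(2, p - 1):
        y = pow(x, -1, p)
        if x < y:
            pairs.append((x, y))
    return pairs


print(self_inverse(5), self_inverse(11))
print(inverse_pairs(11))

# Wilson's Theorem: (p - 1)! ≡ -1 (mod p) for each prime p.
for p in (2, 3, 5, 7, 11, 13, 17, 19, 23):
    assert factorial(p - 1) % p == p - 1
```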

Section 16.2—p. 756
                            . b) f(a^(-1)) · f(a) = f(a^(-1) · a) = f(e_G) = e_H and f(a) · f(a^(-1)) = f(a · a^(-1)) = f(e_G) = e_H,
                             so f(a^(-1)) is an inverse of f(a). By the uniqueness of inverses (Theorem 16.1b), it follows that
                             f(a^(-1)) = [f(a)]^(-1).
                            . f(0) = (0, 0)      f(1) = (1, 1)      f(2) = (2, 0)
                              f(3) = (0, 1)      f(4) = (1, 0)      f(5) = (2, 1)
                            » £(4, 6) = Sg)      + 32>
                       Wn

. a) o(π_0) = 1, o(π_1) = o(π_2) = 3, o(r_1) = o(r_2) = o(r_3) = 2
                              b) (See Fig. 16.6) o(π_0) = 1, o(π_1) = o(π_3) = 4, o(π_2) = o(r_1) = o(r_2) = o(r_3) = o(r_4) = 2
                            . a) The elements of order 10 are 4, 12, 28, and 36.
                      11.    Z_5^* = ⟨2⟩ = ⟨3⟩;        Z_7^* = ⟨3⟩ = ⟨5⟩;     Z_11^* = ⟨2⟩ = ⟨6⟩ = ⟨7⟩ = ⟨8⟩
                      13.    Let (G, +), (H, *), (K, ·) be the given groups. For all x, y ∈ G, (g ∘ f)(x + y) =
                             g(f(x + y)) = g(f(x) * f(y)) = (g(f(x))) · (g(f(y))) = ((g ∘ f)(x)) · ((g ∘ f)(y)), since
                             f, g are homomorphisms. Hence, g ∘ f: G → K is a group homomorphism.
                      15. a)    (Z_12, +) = ⟨1⟩ = ⟨5⟩ = ⟨7⟩ = ⟨11⟩
                                (Z_16, +) = ⟨1⟩ = ⟨3⟩ = ⟨5⟩ = ⟨7⟩ = ⟨9⟩ = ⟨11⟩ = ⟨13⟩ = ⟨15⟩
                                (Z_24, +) = ⟨1⟩ = ⟨5⟩ = ⟨7⟩ = ⟨11⟩ = ⟨13⟩ = ⟨17⟩ = ⟨19⟩ = ⟨23⟩
                             b) Let G = ⟨a^k⟩. Since G = ⟨a⟩, we have a = (a^k)^s for some s ∈ Z. Then a^(1-ks) = e, so
                             1 - ks = tn for some t ∈ Z, since o(a) = n. 1 - ks = tn => 1 = ks + tn => gcd(k, n) = 1. Conversely, let
                             G = ⟨a⟩ where a^k ∈ G and gcd(k, n) = 1. Then ⟨a^k⟩ ⊆ G. gcd(k, n) = 1 => 1 = ks + tn, for
                             some s, t ∈ Z => a = a^1 = a^(ks+tn) = (a^k)^s (a^n)^t = (a^k)^s (e)^t = (a^k)^s ∈ ⟨a^k⟩. Hence G ⊆ ⟨a^k⟩. So
                             G = ⟨a^k⟩, or a^k generates G.
                             c) φ(n).
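For the additive groups (Z_n, +) in part (a), the criterion of part (b) says that k generates Z_n exactly when gcd(k, n) = 1. A brute-force sketch (function names chosen here for illustration) compares the two descriptions:

```python
from math import gcd


def generators_brute(n: int) -> list[int]:
    """All k in Z_n whose multiples 0, k, 2k, ... (mod n) fill Z_n."""
    return [k for k in range(n) if len({(k * i) % n for i in range(n)}) == n]


def generators_gcd(n: int) -> list[int]:
    """All k in Z_n with gcd(k, n) = 1; there are φ(n) of them."""
    return [k for k in range(n) if gcd(k, n) == 1]


for n in range(1, 60):
    assert generators_brute(n) == generators_gcd(n)

print(generators_brute(12), generators_brute(24))
```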

Section 16.3—p. 758
                       lea) {(2           3 3 76         9 7 3).G     7 3 3)0         3 3 DO}
                             biG          i 4 )F={G         3 7 0         33     70      33     9.6    7 2 DI
                                 G23           DF={(0       3 4 )0        97 9673 9G                   3 3 D}
                                 (24           )47={0       3 7 6         42 2G 7 3 90                 3 3 DJ
                                 (393          J¥={G        93 1).G       7 4 )0 3 7 9.0               3 3 OD}
                                (G42           J4={0        7 3 3.63976                  3 7 9-0       4 3 9}
                                (| 3 3 jH=H4
                      3. 12
                      5. From Lagrange's Theorem we know that |K| = 66 (= 2 · 3 · 11) divides |H| and that |H|
                         divides |G| = 660 (= 2^2 · 3 · 5 · 11). Consequently, since K ≠ H and H ≠ G, it follows that
                         |H| is 2(2 · 3 · 11) = 132 or 5(2 · 3 · 11) = 330.
                        .a) Lete=({             3 3 d,a=(6          7 3 9.6=()         2 3 4d ands=()             3 3 4),
It follows from Theorem 16.3 that H is a subgroup of G. And since the entries in the
                                  accompanying table are symmetric about the diagonal from the upper left to the lower right, we
                                  have H an abelian subgroup of G.
                                  b) Since |G| = 4! = 24 and |H| = 4, there are 24/4 = 6 left cosets of H in G.
                                  c) Consider the function f: H → Z_2 × Z_2 defined by

                                      f(e) = (0, 0),        f(α) = (1, 0),         f(β) = (0, 1),     f(δ) = (1, 1).

                                  This function f is one-to-one and onto, and for all x, y ∈ H we find that

                                      f(xy) = f(x) ⊕ f(y).
                                  Consequently, f is an isomorphism.
                                  (Note: There are other possible answers that can be given here. In fact, there are six possible
                                  isomorphisms that one can define here.)
                                . a) If H is a proper subgroup of G, then by Lagrange's Theorem, |H| is 2 or p. If |H| = 2, then
                                  H = {e, x} where x^2 = e, so H = ⟨x⟩. If |H| = p, let y ∈ H, y ≠ e. Then o(y) = p, so
                                  H = ⟨y⟩.
                                  b) Let x ∈ G, x ≠ e. Then o(x) = p or o(x) = p^2. If o(x) = p, then |⟨x⟩| = p. If o(x) = p^2,
                                  then G = ⟨x⟩ and ⟨x^p⟩ is a subgroup of G of order p.
                          11. b) Let x ∈ H ∩ K. If the order of x is r, then r must divide both m and n. Since gcd(m, n) = 1,
                              it follows that r = 1, so x = e and H ∩ K = {e}.
                          13. a) In (Z_p^*, ·) there are p - 1 elements, so by Exercise 8, for each [x] ∈ (Z_p^*, ·), [x]^(p-1) = [1],
                              or x^(p-1) ≡ 1 (mod p), or x^p ≡ x (mod p). For all a ∈ Z, if p | a, then a ≡ 0 (mod p) and
                              a^p ≡ 0 ≡ a (mod p). If p ∤ a, then a ≡ b (mod p) where 1 ≤ b ≤ p - 1, and
                                  a^p ≡ b^p ≡ b ≡ a (mod p).
                              b) In the group G of units of Z_n, there are φ(n) elements. If a ∈ Z and gcd(a, n) = 1, then
                              [a] ∈ G and [a]^φ(n) = [1], or a^φ(n) ≡ 1 (mod n).
                              c) and d) These results follow from Exercises 6 and 8. They are special cases of Exercise 8.
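
The two congruences in Exercise 13 can be spot-checked numerically. The following is a minimal sketch, not part of the original solution; the helper names `phi`, `fermat_holds`, and `euler_holds` are illustrative.

```python
from math import gcd

def phi(n):
    """Euler's phi function: the number of units in Z_n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def fermat_holds(p):
    """Check a^p ≡ a (mod p) for every residue a in Z_p."""
    return all(pow(a, p, p) == a % p for a in range(p))

def euler_holds(n):
    """Check a^phi(n) ≡ 1 (mod n) for every a with gcd(a, n) = 1."""
    exponent = phi(n)
    return all(pow(a, exponent, n) == 1
               for a in range(1, n) if gcd(a, n) == 1)

# Part (a) for a few primes, part (b) for every modulus from 2 to 49.
print(all(fermat_holds(p) for p in (2, 3, 5, 7, 11, 13)))
print(all(euler_holds(n) for n in range(2, 50)))
```

A finite check of this kind illustrates the statements but does not replace the group-theoretic argument above.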

Section 16.4—p. 761
                                . 0462 0170 1809 0462 1809 1981 0305
                                . DRIVESAFELYX              5. p = 157, q = 773

Section 16.5—p. 765
                                . a) e = 0001001      b) r = 1111011      c) c = 0101000
                                . a) (i) D(111101100) = 101      (ii) D(000100011) = 000
                                     (iii) D(010011111) = 011
                                  b) 000000000, 000000001, 100000000      c) 64
Sections 16.6 and 16.7-
p. 772                          . S(101010, 1) = {101010, 001010, 111010, 100010, 101110, 101000, 101011}
                                  S(111111, 1) = {111111, 011111, 101111, 110111, 111011, 111101, 111110}
                                . a) |S(x, 1)| = 11; |S(x, 2)| = 56; |S(x, 3)| = 176
                                  b) |S(x, k)| = 1 + C(n, 1) + C(n, 2) + ··· + C(n, k) = Σ_{i=0}^{k} C(n, i)
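
The sphere sizes above can be reproduced by direct enumeration. This is a sketch under one assumption: the word length n = 10 is inferred from |S(x, 1)| = 11 = 1 + n and is not stated in this answer. The function names are illustrative.

```python
from itertools import combinations
from math import comb

def sphere(x, k):
    """All binary words within Hamming distance k of the word x."""
    words = set()
    for weight in range(k + 1):
        for positions in combinations(range(len(x)), weight):
            bits = list(x)
            for i in positions:
                bits[i] = '1' if bits[i] == '0' else '0'  # flip bit i
            words.add(''.join(bits))
    return words

def sphere_size(n, k):
    """Closed form: the sum of C(n, i) for 0 <= i <= k."""
    return sum(comb(n, i) for i in range(k + 1))

print(sorted(sphere("101010", 1)))
# Assumed word length n = 10 (inferred, see the note above).
print([len(sphere("0" * 10, k)) for k in (1, 2, 3)])
print([sphere_size(10, k) for k in (1, 2, 3)])
```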
                                . a) The minimum distance between code words is 3. The code can detect all errors of weight ≤ 2
                                  or correct all single errors.
                                  b) The minimum distance between code words is 5. The code can detect all errors of weight ≤ 4
                                  or correct all errors of weight ≤ 2.

c) The minimum distance is 2. The code detects all single errors but has no correction
                              capability.
                           7. a) C = {00000, 10110, 01011, 11101}. The minimum distance between code words is 3, so the
                              code can detect all errors of weight ≤ 2 or correct all single errors.

                                       | 1  0  1  0  0 |
                              b) H =   | 1  1  0  1  0 |
                                       | 0  1  0  0  1 |

                              c) (i) 01      (ii) 11      (v) 11      (vi) 10
                              For (iii) and (iv) the syndrome is (111)^tr, which is not a column of H. Assuming a double error,
                              if (111)^tr = (110)^tr + (001)^tr, then the decoded received word is 01 [for (iii)] and 10 [for (iv)]. If
                              (111)^tr = (011)^tr + (100)^tr, we get 10 [for (iii)] and 01 [for (iv)].
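
As a check on parts (a) and (b), the sketch below computes the syndrome H·w^tr for each code word and the minimum distance of C. The matrix rows are read from part (b); the function names are illustrative.

```python
from itertools import combinations

# Parity-check matrix from part (b) and the code from part (a).
H = [[1, 0, 1, 0, 0],
     [1, 1, 0, 1, 0],
     [0, 1, 0, 0, 1]]
CODE = ["00000", "10110", "01011", "11101"]

def syndrome(word):
    """Compute H * word^tr over Z_2, returned as a tuple of bits."""
    return tuple(sum(h * int(b) for h, b in zip(row, word)) % 2
                 for row in H)

def distance(u, v):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(u, v))

min_distance = min(distance(u, v) for u, v in combinations(CODE, 2))

print([syndrome(c) for c in CODE])  # every code word has zero syndrome
print(min_distance)
print(syndrome("10000"))            # a single error gives a column of H
```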
                           9. G = [I_8 | A], where I_8 is the 8 × 8 multiplicative identity matrix and A is a column of eight 1's;
                              H = [A^tr | 1] = [11111111 | 1].
                          11. Compare the generator (parity-check) matrix in Exercise 9 with the parity-check (generator)
                              matrix in Exercise 10.

Sections 16.8 and 16.9-
p. 779                     1, (75°); 255
                           3. a)     Syndrome      Coset Leader
                                        000           00000       10110     01011     11101
                                        110           10000       00110     11011     01101
                                        011           01000       11110     00011     10101
                                        100           00100       10010     01111     11001
                                        010           00010       10100     01001     11111
                                        001           00001       10111     01010     11100
                                        101           11000       01110     10011     00101
                                        111           01100       11010     00111     10001
                                (The last two rows are not unique.)
                              b)     Received Word      Code Word      Decoded Message
                                        11110              10110              10
                                        11101              11101              11
                                        11011              01011              01
                                        10100              10110              10
                                        10011              01011              01
                                        10101              11101              11
                                        11111              11101              11
                                        01100              00000              00
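
The decoding table in part (b) follows mechanically from the coset leaders in part (a). The sketch below assumes the parity-check matrix H = [1 0 1 0 0; 1 1 0 1 0; 0 1 0 0 1] of the code {00000, 10110, 01011, 11101} and that the message is the first two bits of the code word; both are read from the neighbouring answers rather than stated here.

```python
H = [[1, 0, 1, 0, 0],
     [1, 1, 0, 1, 0],
     [0, 1, 0, 0, 1]]

# Syndrome -> coset leader, copied from the table in part (a).
LEADERS = {"000": "00000", "110": "10000", "011": "01000", "100": "00100",
           "010": "00010", "001": "00001", "101": "11000", "111": "01100"}

def syndrome(word):
    """H * word^tr over Z_2, as a string of three bits."""
    return ''.join(str(sum(h * int(b) for h, b in zip(row, word)) % 2)
                   for row in H)

def decode(received):
    """Subtract (XOR) the coset leader, then keep the two message bits."""
    leader = LEADERS[syndrome(received)]
    code_word = ''.join(str(int(a) ^ int(b))
                        for a, b in zip(received, leader))
    return code_word, code_word[:2]

for r in ["11110", "11101", "11011", "10100",
          "10011", "10101", "11111", "01100"]:
    print(r, *decode(r))
```

Because the last two coset leaders are not unique, a different choice would change the decoding of words with syndrome 101 or 111.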
                            . a) G is 57 × 63; H is 6 × 63      b) The rate is 57/63.
                            . a) (0.99)^7 + (7)(0.99)^6(0.01)      b) [(0.99)^7 + (7)(0.99)^6(0.01)]^5

Section 16.10—p. 784
                                a) a*=          Ci          Cy        C3        Cy        Cy         Cy        Cr        Cg        Cy         Cro         Cu        Cire          Cis         Cia         Cis         Cie
                            "         2         Cy          Cy        Cs        Cz        Cz         Cg        Co        Co        Cp         Cro         Cu        Cia           Cis         Cir         C13         Cie
                                     pe         {Ol         Cp        C3        Cy        Cs         Co        C7        Cg         Co        Cro         Cu        Ciz           C13         Cia         Cis         Cie
                                      4         C;          Cy        C3        Co        Cs         Co        Cg        Cz         Co        Cro         Cu        Ciz2          Cis         Cia         C13         Cis
                                b) (w)*=                   Cy         Cy        C3        Ca        Cys        Co        Cr        Cg         Co     Cro        Cn           Cre          C13        Cia         Cis         Cre
                                                           Cy)        Cs        Cy        Cz        Cy         Co        Co        C7         Cg     Cu         Cio          Cis          Ci2        Cis          Cia        Cie

= (at)
                                ¢)   merk   =         C;         Co        C3        Cy        C5         Co        C7        Cg         Co         Cio        Cu          Cir2         Ci3         Cia         Cis ci)
                                      a4              Ci         Cs        Cy        C3        Cy         Cy        Co        Cg         Cp         Cu         Cw          C3           Cr          Cis         Cra         Cio
                                            = (m3r4)*

                          . a) o(α) = 7;   o(β) = 12;   o(γ) = 3;   o(δ) = 6
                            b) Let α ∈ S_n, with α = c_1 c_2 ··· c_k, a product of disjoint cycles. Then o(α) is the lcm of
                            ℓ(c_1), ℓ(c_2), ..., ℓ(c_k), where ℓ(c_i) = length of c_i, for 1 ≤ i ≤ k.
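
Part (b) can be illustrated by comparing the lcm of the cycle lengths with the order found by repeated composition. In this sketch a permutation is a list in which `perm[i]` is the image of i (0-based), and the sample permutation is a made-up example, not one of the permutations in the exercise.

```python
from math import gcd

def cycle_lengths(perm):
    """Lengths of the disjoint cycles of perm (perm[i] is the image of i)."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            length, i = 0, start
            while not seen[i]:
                seen[i] = True
                i = perm[i]
                length += 1
            lengths.append(length)
    return lengths

def order(perm):
    """Order of perm as the lcm of its cycle lengths."""
    result = 1
    for length in cycle_lengths(perm):
        result = result * length // gcd(result, length)
    return result

def order_by_powers(perm):
    """Order of perm found by composing it with itself until the identity."""
    identity = list(range(len(perm)))
    power, k = list(perm), 1
    while power != identity:
        power = [perm[i] for i in power]
        k += 1
    return k

# Hypothetical example: cycles (0 1 2)(3 4)(5 6 7 8), so lcm(3, 2, 4) = 12.
example = [1, 2, 0, 4, 3, 6, 7, 8, 5]
print(sorted(cycle_lengths(example)), order(example), order_by_powers(example))
```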
                          . a) 8      b) 39        7. a) 70      b) 55
                          . Triangular figure: a) 8     b) 8      Square figure: a) 12      b) 12
                          . a) 140      b) 102       13. 315

Section 16.11 —p. 788
                          . a) 165                   b) 120
                          . Triangular figure: a) 96     b) 80
                            Square figure: a) 280     b) 220
                            Hexagonal figure: a) 131,584        b) 70,144
                          . a) 2635      b) 1505
                             c) [Figure omitted: two arrangements labeled with the colors R, B, Y, G, W.]
                          . a) 21    b) 954
                            c) No: k = 21 and m = 21, so km = 441 ≠ 954 = n. Here the location of a certain edge must
                             be considered relative to the location of the vertices. For example, [figure] is not equivalent to
                             [figure], even though [figure] is equivalent to [figure] and [figure] is equivalent to [figure].
                             [Figures omitted: colorings of the vertices and one edge, using the colors R, W, B.]
Section 16.12—p. 793
                           . a) (i) and (ii) r^4 + w^4 + r^3w + 2r^2w^2 + rw^3
                             b) (i) (1/4)[(r + b + w)^4 + 2(r^4 + b^4 + w^4) + (r^2 + b^2 + w^2)^2]
                                (ii) (1/8)[(r + b + w)^4 + 2(r^4 + b^4 + w^4) + 3(r^2 + b^2 + w^2)^2
                                           + 2(r + b + w)^2(r^2 + b^2 + w^2)]
                           . a) 10
                             b) (1/24)[(r + w)^6 + 6(r + w)^2(r^4 + w^4) + 3(r + w)^2(r^2 + w^2)^2 + 6(r^2 + w^2)^3
                                       + 8(r^3 + w^3)^2]
                             c) 2
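
The cycle structure in part (b) (one identity, six quarter turns, three half turns about face axes, six half turns about edge axes, eight rotations about diagonals) matches the 24 rotations of a cube acting on its six faces. Treating the exercise that way is an assumption, since the exercise statement is not reproduced here. Under that assumption the brute-force sketch below recovers the count in part (a) and finds two inequivalent colorings with three red and three white faces, which agrees with the value in part (c) if that part asks for the coefficient of r^3w^3. The face labeling and generator names are illustrative.

```python
from itertools import product

# Faces: 0 top, 1 bottom, 2 front, 3 back, 4 left, 5 right.
# Each tuple sends face i to face p[i].
TURN_VERTICAL = (0, 1, 5, 4, 2, 3)  # front -> right -> back -> left
TURN_SIDE = (2, 3, 1, 0, 4, 5)      # top -> front -> bottom -> back

def compose(a, b):
    """The permutation 'first b, then a'."""
    return tuple(a[b[i]] for i in range(6))

def generate(generators):
    """Close a set of generators under composition."""
    identity = tuple(range(6))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def cycle_count(p):
    """Number of disjoint cycles of p, fixed points included."""
    seen, count = set(), 0
    for start in range(6):
        if start not in seen:
            count += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return count

GROUP = generate([TURN_VERTICAL, TURN_SIDE])

# Burnside: average the number of 2-colorings fixed by each rotation.
two_colorings = sum(2 ** cycle_count(g) for g in GROUP) // len(GROUP)

def canonical(coloring):
    """Smallest representative of the orbit of a coloring."""
    return min(tuple(coloring[g[i]] for i in range(6)) for g in GROUP)

three_red = {canonical(c) for c in product("rw", repeat=6)
             if c.count("r") == 3}

print(len(GROUP), two_colorings, len(three_red))
```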
                           . Let g = green and y = gold.
                             Triangular figure: (1/6)[(g + y)^4 + 2(g + y)(g^3 + y^3) + 3(g + y)^2(g^2 + y^2)]
                             Square figure: (1/8)[(g + y)^5 + 2(g + y)(g^4 + y^4) + 3(g + y)(g^2 + y^2)^2
                                                  + 2(g + y)^3(g^2 + y^2)]
                             Hexagonal figure: (1/4)[(g + y)^9 + 2(g + y)(g^2 + y^2)^4 + (g + y)^5(g^2 + y^2)^2]
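
Setting every color variable to 1 in a pattern inventory turns each factor with j variables raised to a power into m, so each term contributes m^(number of cycles). The sketch below does this for the three inventories read as cycle types (4), (1,3), (2,1) for the triangular figure, (5), (1,4), (1,2,2), (3,2) for the square figure, and (9), (1,2^4), (5,2^2) for the hexagonal figure. With m = 4 colors, an inference rather than something stated in this answer, the totals agree with the part (b) values 80, 220, and 70,144 listed for Section 16.11.

```python
def count_colorings(terms, group_order, m):
    """Number of inequivalent colorings with m colors.

    terms is a list of (coefficient, number_of_cycles) pairs, one per
    term of the pattern inventory.
    """
    total = sum(coeff * m ** cycles for coeff, cycles in terms)
    assert total % group_order == 0  # Burnside's count is an integer
    return total // group_order

# (coefficient, cycles) read from the inventories above.
TRIANGULAR = ([(1, 4), (2, 2), (3, 3)], 6)
SQUARE = ([(1, 5), (2, 2), (3, 3), (2, 4)], 8)
HEXAGONAL = ([(1, 9), (2, 5), (1, 7)], 4)

for name, (terms, order) in [("triangular", TRIANGULAR),
                             ("square", SQUARE),
                             ("hexagonal", HEXAGONAL)]:
    # m = 4 is an assumed number of colors; m = 2 is green and gold.
    print(name, count_colorings(terms, order, 4),
          count_colorings(terms, order, 2))
```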
                           . a) 136      b) (1/2)[(r + w)^8 + (r^2 + w^2)^4]      c) 38; 16         9. (m+n)

Supplementary
Exercises—p. 797          1. a) Since f(e_G) = e_H, it follows that e_G ∈ K and K ≠ ∅. If x, y ∈ K, then f(x) = f(y) = e_H
                             and f(xy) = f(x)f(y) = e_H e_H = e_H, so xy ∈ K. Also, for x ∈ K, f(x^-1) = [f(x)]^-1 =
                             e_H^-1 = e_H, so x^-1 ∈ K. Hence K is a subgroup of G.
                             b) If x ∈ K, then f(x) = e_H. For all g ∈ G,

                                  f(gxg^-1) = f(g)f(x)f(g^-1) = f(g)e_H f(g^-1) = f(g)f(g^-1) = f(gg^-1) = f(e_G) = e_H.

                             Hence, for all x ∈ K, g ∈ G, we find that gxg^-1 ∈ K.
                           . Let a, b ∈ G. Then a^2b^2 = ee = e = (ab)^2 = abab. But a^2b^2 = abab ⇒ aabb = abab ⇒
                             ab = ba, so G is abelian.
                           . Let G = ⟨g⟩ and let h = f(g). If h_1 ∈ H, then h_1 = f(g^n) for some n ∈ Z, since f is onto and
                             G is cyclic. Therefore, h_1 = f(g^n) = [f(g)]^n = h^n, and H = ⟨h⟩.

                           7. For all a, b ∈ G,

                                  (a ∘ a^-1) ∘ b^-1 ∘ b = b ∘ b^-1 ∘ (a^-1 ∘ a) ⇒
                                  a ∘ a^-1 ∘ b = b ∘ a^-1 ∘ a ⇒ a ∘ b = b ∘ a,

                              and so it follows that (G, ∘) is an abelian group.
                                  . a) Consider a permutation σ that is counted in P(n + 1, k). If (n + 1) is a cycle (of length 1)
                                    in σ, then σ (restricted to {1, 2, 3, ..., n}) is counted in P(n, k - 1). Otherwise, consider each
                                    permutation τ that is counted in P(n, k). For each cycle of τ, say (a_1 a_2 ··· a_r), there are r
                                    locations in which to place n + 1: (1) between a_1 and a_2; (2) between a_2 and a_3; ...; (r - 1)
                                    between a_(r-1) and a_r; and (r) between a_r and a_1. Hence there are n locations, in total, to locate
                                    n + 1 in τ. Consequently, P(n + 1, k) = P(n, k - 1) + nP(n, k).
                                    b) Σ_{k=1}^{n} P(n, k) counts all of the permutations in S_n, which has n! elements.
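
Both parts can be checked for small n by counting cycles directly. The following is a sketch; the boundary conventions P(0, 0) = 1 and P(n, k) = 0 for k outside 1..n are assumptions made to start the recursion.

```python
from functools import lru_cache
from itertools import permutations
from math import factorial

@lru_cache(maxsize=None)
def P(n, k):
    """Permutations of n symbols with exactly k cycles, via the recurrence."""
    if n == 0:
        return 1 if k == 0 else 0   # assumed base case
    if k <= 0 or k > n:
        return 0
    # P(n, k) = P(n - 1, k - 1) + (n - 1) * P(n - 1, k)
    return P(n - 1, k - 1) + (n - 1) * P(n - 1, k)

def cycle_count(perm):
    """Number of cycles of perm, where perm[i] is the image of i."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = perm[i]
    return count

def brute_force(n, k):
    """Count permutations of n symbols with k cycles by enumeration."""
    return sum(1 for p in permutations(range(n)) if cycle_count(p) == k)

for n in range(1, 7):
    row = [P(n, k) for k in range(1, n + 1)]
    print(n, row, sum(row) == factorial(n))
```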
                               11. a) Suppose that n is composite. We consider two cases.
                                        (1) n = m·r, where 1 < m < r < n: Here (n - 1)! = 1·2···(m - 1)·m·(m + 1)···
                                            (r - 1)·r·(r + 1)···(n - 1) ≡ 0 (mod n). Hence (n - 1)! ≢ -1 (mod n).
                                        (2) n = q^2, where q is a prime: If (n - 1)! ≡ -1 (mod n) then 0 ≡ q(n - 1)! ≡ q(-1) ≡
                                            n - q ≢ 0 (mod n). So in this case we also have (n - 1)! ≢ -1 (mod n).
                                     b) From Wilson's Theorem, when p is an odd prime, we find that

                                        -1 ≡ (p - 1)! ≡ (p - 3)!(p - 2)(p - 1) ≡ (p - 3)!(p^2 - 3p + 2) ≡ 2(p - 3)! (mod p).
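
A numerical spot check of both parts, offered as a sketch: (n - 1)! ≡ -1 (mod n) should hold exactly when n is prime, and 2(p - 3)! ≡ -1 (mod p) should hold for each odd prime p. The bound 200 is arbitrary.

```python
from math import factorial

def is_prime(n):
    """Trial division, adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def wilson_congruence(n):
    """True when (n - 1)! ≡ -1 (mod n)."""
    return factorial(n - 1) % n == n - 1

# Part (a) together with Wilson's Theorem: the congruence detects primes.
print(all(wilson_congruence(n) == is_prime(n) for n in range(2, 200)))

# Part (b): 2 (p - 3)! ≡ -1 (mod p) for odd primes p.
print(all((2 * factorial(p - 3)) % p == p - 1
          for p in range(3, 200) if is_prime(p)))
```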

Chapter 17
           Finite Fields and Combinatorial Designs
Section 17.1—p. 806
                                    . f(x) + g(x) = 2x^4 + 5x^3 + x^2 + 5
                                      f(x)g(x) = 6x^7 + 2x^6 + 3x^5 + 4x^4 + 2x^3 + x^2 + 4x + 4
                                      (10)(11)^2; (10)(11)^3; (10)(11)^4; (10)(11)^n
                                      a) and b) f(x) = (x^2 + 4)(x - 2)(x + 2); the roots are ±2.
                                     c)   f(x) = (x + 2i)(x - 2i)(x - 2)(x + 2); the roots are ±2, ±2i.
                                     d) (a) f(x) = (x^2 - 5)(x^2 + 5); there are no rational roots.
                                          (b) f(x) = (x - √5)(x + √5)(x^2 + 5); the roots are ±√5.
                                          (c) f(x) = (x - √5)(x + √5)(x - √5 i)(x + √5 i); the roots are ±√5, ±i√5.
                                    . a) f(3)= 8060      b) f=1l     oc) f(-9) = f2) =6
                               11. 4,6; p—1
                               13. Let f(x) = Σ_{i=0}^{m} a_i x^i and h(x) = Σ_{i=0}^{k} b_i x^i, where a_i ∈ R for 0 ≤ i ≤ m, b_i ∈ R for
                                   0 ≤ i ≤ k, and m ≤ k. Then f(x) + h(x) = Σ_{i=0}^{k} (a_i + b_i)x^i, where a_(m+1) = a_(m+2) = ··· =
                                   a_k = z, the zero of R, so

                                      G(f(x) + h(x)) = G(Σ_{i=0}^{k} (a_i + b_i)x^i) = Σ_{i=0}^{k} g(a_i + b_i)x^i
                                                     = Σ_{i=0}^{k} [g(a_i) + g(b_i)]x^i = Σ_{i=0}^{k} g(a_i)x^i + Σ_{i=0}^{k} g(b_i)x^i
                                                     = G(f(x)) + G(h(x)).

                                   Also, f(x)h(x) = Σ_{i=0}^{m+k} c_i x^i, where c_i = a_i b_0 + a_(i-1) b_1 + ··· + a_1 b_(i-1) + a_0 b_i, and

                                      G(f(x)h(x)) = G(Σ_{i=0}^{m+k} c_i x^i) = Σ_{i=0}^{m+k} g(c_i)x^i.

                                   Since g(c_i) = g(a_i)g(b_0) + g(a_(i-1))g(b_1) + ··· + g(a_1)g(b_(i-1)) + g(a_0)g(b_i), it follows that

                                      Σ_{i=0}^{m+k} g(c_i)x^i = (Σ_{i=0}^{m} g(a_i)x^i)(Σ_{i=0}^{k} g(b_i)x^i) = G(f(x)) · G(h(x)).

                                   Consequently, G: R[x] → S[x] is a ring homomorphism.
                               15. In Z_4[x], (2x + 1)(2x + 1) = 1, so (2x + 1) is a unit. This does not contradict Exercise 14
                                   because (Z_4, +, ·) is not an integral domain.
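
The product in Exercise 15 is quick to confirm with coefficient arithmetic modulo 4. In this sketch a polynomial is a list of coefficients, constant term first; the function name is illustrative.

```python
def poly_mul_mod(f, g, n):
    """Multiply two polynomials with coefficients in Z_n.

    Polynomials are coefficient lists, constant term first.
    """
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % n
    while len(out) > 1 and out[-1] == 0:  # drop leading zero coefficients
        out.pop()
    return out

# (2x + 1)(2x + 1) = 4x^2 + 4x + 1, which reduces to 1 in Z_4[x].
print(poly_mul_mod([1, 2], [1, 2], 4))

# Over the field Z_5 the same product keeps degree 2.
print(poly_mul_mod([1, 2], [1, 2], 5))
```

The contrast between the two moduli shows why the degree argument behind Exercise 14 needs an integral domain: in Z_4 the leading coefficients 2·2 multiply to zero.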
                               17. First note that for f(x) = a_n x^n + a_(n-1) x^(n-1) + ··· + a_2 x^2 + a_1 x + a_0, we have a_n + a_(n-1)
                                   + ··· + a_2 + a_1 + a_0 = 0 if and only if f(1) = 0. Since the zero polynomial is in S, the set S is
                                   not empty. With f(x) as given here, let g(x) = b_m x^m + b_(m-1) x^(m-1) + ··· + b_2 x^2 + b_1 x +
                                   b_0 ∈ S. (Here m ≤ n, and for m < n we have b_(m+1) = b_(m+2) = ··· = b_n = 0.) Then
                                   f(1) - g(1) = 0 - 0 = 0, so f(x) - g(x) ∈ S.
                                      Now consider h(x) = Σ_{i=0}^{k} r_i x^i ∈ F[x]. Here h(x)f(x) ∈ F[x] and h(1)f(1) =
                                   h(1)·0 = 0, so h(x)f(x) ∈ S.
                                      Consequently, S is an ideal in F[x].

Section 17.2—p. 813
                             . a) x^2 + 3x - 1 is irreducible over Q. Over R, C,

                                     x^2 + 3x - 1 = [x - ((-3 + √13)/2)][x - ((-3 - √13)/2)].

                                 b) x^4 - 2 is irreducible over Q.
                                       Over R, x^4 - 2 = (x - ⁴√2)(x + ⁴√2)(x^2 + √2);
                                       x^4 - 2 = (x - ⁴√2)(x + ⁴√2)(x - ⁴√2 i)(x + ⁴√2 i) over C.
                                 c) x^2 + x + 1 = (x + 2)(x + 2) over Z_3. Over Z_5, x^2 + x + 1 is irreducible; x^2 + x + 1 =
                                 (x + 5)(x + 3) over Z_7.
                                 d) x^4 + x^3 + 1 is irreducible over Z_2.
                                 e) x^3 + 3x^2 - x + 1 is irreducible over Z_5.
                             . Degree 1: x; x + 1        Degree 2: x^2 + x + 1        Degree 3: x^3 + x^2 + 1; x^3 + x + 1        5. 7^5

                             . a) Yes, since the coefficients of the polynomials are from a field.
                               b) h(x) | f(x), g(x) ⇒ f(x) = h(x)u(x), g(x) = h(x)v(x), for some u(x), v(x) ∈ F[x].
                                  m(x) = s(x)f(x) + t(x)g(x) for some s(x), t(x) ∈ F[x], so
                                  m(x) = h(x)[s(x)u(x) + t(x)v(x)] and h(x) | m(x).
                               c) If m(x) ∤ f(x), then f(x) = q(x)m(x) + r(x), where 0 ≤ deg r(x) < deg m(x).
                                  m(x) = s(x)f(x) + t(x)g(x), so r(x) = f(x) - q(x)[s(x)f(x) + t(x)g(x)]
                                       = (1 - q(x)s(x))f(x) - q(x)t(x)g(x), so r(x) ∈ S.
                                  With deg r(x) < deg m(x) we contradict the choice of m(x). Hence r(x) = 0 and m(x) | f(x).
                             . a) The gcd is (x - 1) = (1/17)(x^5 - x^4 + x^3 + x^2 - x - 1)
                                                        - (1/17)(x^2 + x - 2)(x^3 - 2x^2 + 5x - 8).
                               b) The gcd is 1 = (x + 1)(x^4 + x^3 + 1) + (x^3 + x^2 + x)(x^2 + x + 1).
                               c) The gcd is x^2 + 2x + 1 = (x^4 + 2x^2 + 2x + 2) + (x + 2)(2x^3 + 2x^2 + x + 1).
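
Part (a) can be reproduced with the Euclidean algorithm over Q. The sketch below reads the two polynomials of part (a) as x^5 - x^4 + x^3 + x^2 - x - 1 and x^2 + x - 2, with x^3 - 2x^2 + 5x - 8 as the quotient; that reading is an assumption based on the identity shown, since the exercise statement is not reproduced here. Polynomials are coefficient lists, constant term first.

```python
from fractions import Fraction

def trim(p):
    """Remove leading zero coefficients in place (keep at least one entry)."""
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Quotient and remainder of f by g over Q."""
    g = trim([Fraction(c) for c in g])
    r = trim([Fraction(c) for c in f])
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g) and any(r):
        shift = len(r) - len(g)
        factor = r[-1] / g[-1]
        q[shift] = factor
        for i, c in enumerate(g):
            r[shift + i] -= factor * c  # cancel the leading term of r
        trim(r)
    return trim(q), r

def poly_gcd(f, g):
    """Monic greatest common divisor of f and g over Q."""
    a = trim([Fraction(c) for c in f])
    b = trim([Fraction(c) for c in g])
    while any(b):
        _, remainder = poly_divmod(a, b)
        a, b = b, remainder
    return [c / a[-1] for c in a]

F = [-1, -1, 1, 1, -1, 1]   # x^5 - x^4 + x^3 + x^2 - x - 1
G = [-2, 1, 1]              # x^2 + x - 2

quotient, remainder = poly_divmod(F, G)
print(quotient)    # coefficients of x^3 - 2x^2 + 5x - 8
print(remainder)   # coefficients of 17x - 17
print(poly_gcd(F, G))
```

The first remainder, 17x - 17, is where the factor 1/17 in the stated identity comes from.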
                          11. a=0,b=0;a=0,b=1
                          13. a) f(x) ≡ f_1(x) (mod s(x)) ⇒ f(x) = f_1(x) + h(x)s(x), for some h(x) ∈ F[x], and
                              g(x) ≡ g_1(x) (mod s(x)) ⇒ g(x) = g_1(x) + k(x)s(x), for some k(x) ∈ F[x]. Hence f(x) +
                              g(x) = f_1(x) + g_1(x) + (h(x) + k(x))s(x), so f(x) + g(x) ≡ f_1(x) + g_1(x) (mod s(x)), and
                              f(x)g(x) = f_1(x)g_1(x) + (f_1(x)k(x) + g_1(x)h(x) + h(x)k(x)s(x))s(x), so f(x)g(x) ≡
                              f_1(x)g_1(x) (mod s(x)).
                              b) These properties follow from the corresponding properties for F[x]. For example, for the
                              distributive law,

                                 [f(x)]([g(x)] + [h(x)]) = [f(x)][g(x) + h(x)] = [f(x)(g(x) + h(x))]
                                                         = [f(x)g(x) + f(x)h(x)] = [f(x)g(x)] + [f(x)h(x)]
                                                         = [f(x)][g(x)] + [f(x)][h(x)].

                              d) A nonzero element of F[x]/(s(x)) has the form [f(x)], where f(x) ≠ 0 and deg f(x) <
                              deg s(x). With f(x), s(x) relatively prime, there exist r(x), t(x) with 1 = f(x)r(x) + s(x)t(x),
                              so 1 ≡ f(x)r(x) (mod s(x)) or [1] = [f(x)][r(x)]. Hence [r(x)] = [f(x)]^-1.
                              e) q^n
                          15. a) [2x + 1]      b) [2x + 1]      c) [2x]     17. a) p^n      b) φ(p^n - 1)
                          19. a) 6     b) 12     c) 12     d) lcm(m, n)      e) 0
                          21. 101, 103, 107, 109, 113, 121, 125, 127, 128, 131, 137, 139, 149

                      23.   For s(x) = x^3 + x^2 + x + 2 ∈ Z_3[x] one finds that s(0) = 2, s(1) = 2, and s(2) = 1. It then
                            follows from part (b) of Theorem 17.7 and parts (b) and (c) of Theorem 17.11 that Z_3[x]/(s(x))
                            is a finite field with 3^3 = 27 elements.
                      25.   a) Since 0 = 0 + 0√2 ∈ Q[√2], the set Q[√2] is nonempty. For a + b√2, c + d√2 ∈ Q[√2],
                            we have

                                 (a + b√2) - (c + d√2) = (a - c) + (b - d)√2, with (a - c), (b - d) ∈ Q; and
                                 (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2, with ac + 2bd, ad + bc ∈ Q.

                            Consequently, it follows from part (a) of Theorem 14.10 that Q[√2] is a subring of R.
                            b) To show that Q[√2] is a subfield of R we need to find in Q[√2] a multiplicative inverse for
                            each nonzero element in Q[√2]. Let a + b√2 ∈ Q[√2] with a + b√2 ≠ 0. If b = 0, then a ≠ 0
                            and a^-1 ∈ Q, and a^-1 + 0·√2 ∈ Q[√2]. For b ≠ 0, we need to find c + d√2 ∈ Q[√2]
                            so that

                                 (a + b√2)(c + d√2) = 1.

                            Now (a + b√2)(c + d√2) = 1 ⇒ (ac + 2bd) + (ad + bc)√2 = 1 ⇒ ac + 2bd = 1 and
                            ad + bc = 0 ⇒ c = -ad/b and a(-ad/b) + 2bd = 1 ⇒ -a^2 d + 2b^2 d = b ⇒ d =
                            b/(2b^2 - a^2) and c = -a/(2b^2 - a^2). (Note: 2b^2 - a^2 ≠ 0 because √2 is irrational.)
                            Consequently, (a + b√2)^-1 = [-a/(2b^2 - a^2)] + [b/(2b^2 - a^2)]√2, with [-a/(2b^2 - a^2)],
                            [b/(2b^2 - a^2)] ∈ Q. So Q[√2] is a subfield of R.
                                   Since s(x) = x^2 - 2 is irreducible over Q, we know from part (b) of Theorem 17.11 that
                            Q[x]/(x^2 - 2) is a field. Define the correspondence

                                 f: Q[x]/(x^2 - 2) → Q[√2],     by      f([a + bx]) = a + b√2.

                            By an argument similar to the one given in Example 17.10 and part (a) of Exercise 24 it follows
                            that f is an isomorphism.
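
The inverse formula from part (b) can be exercised with exact rational arithmetic. In this sketch an element a + b√2 is stored as the pair (a, b); the representation and function names are illustrative.

```python
from fractions import Fraction

def multiply(x, y):
    """(a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2, on pairs (a, b)."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def inverse(x):
    """Inverse of a + b√2 using c = -a/(2b^2 - a^2), d = b/(2b^2 - a^2)."""
    a, b = Fraction(x[0]), Fraction(x[1])
    denominator = 2 * b * b - a * a  # nonzero because √2 is irrational
    return (-a / denominator, b / denominator)

x = (Fraction(3), Fraction(5))       # the element 3 + 5√2
print(inverse(x))
print(multiply(x, inverse(x)))       # expected to be (1, 0), that is, 1
```

The same formula also covers the case b = 0, where it reduces to (1/a, 0).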

Section 17.3—p. 819
                            a) 1 2 3 4         b) 1 2 3 4         c) 1 3 4 2
                               2 1 4 3            3 4 1 2            4 2 1 3
                               4 3 2 1            2 1 4 3            3 1 2 4
                               3 4 1 2            4 3 2 1            2 4 3 1
                            a =a
                               oa fift+hahht+hoeh=fhai=i
                            L_3:   4 5 1 2 3          L_4:   5 1 2 3 4
                                   2 3 4 5 1                 4 5 1 2 3
                                   5 1 2 3 4                 3 4 5 1 2
                                   3 4 5 1 2                 2 3 4 5 1
                                   1 2 3 4 5                 1 2 3 4 5
                            In standard form the Latin squares L_i, 1 ≤ i ≤ 4, become
                            L_1:   1 2 3 4 5          L_2:   1 2 3 4 5
                                   2 3 4 5 1                 3 4 5 1 2
                                   3 4 5 1 2                 5 1 2 3 4
                                   4 5 1 2 3                 2 3 4 5 1
                                   5 1 2 3 4                 4 5 1 2 3
                            L_3:   1 2 3 4 5          L_4:   1 2 3 4 5
                                   4 5 1 2 3                 5 1 2 3 4
                                   2 3 4 5 1                 4 5 1 2 3
                                   5 1 2 3 4                 3 4 5 1 2
                                   3 4 5 1 2                 2 3 4 5 1
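
The four standard-form squares fit the pattern "entry in row i, column j is (k·i + j) mod 5, plus 1" with rows and columns numbered from 0. That compact description is an observation about the listed squares, not a formula quoted from the text. The sketch below builds the squares that way and checks that they are Latin and pairwise orthogonal.

```python
from itertools import combinations

def latin_square(k, n=5):
    """Square whose (i, j) entry is (k*i + j) mod n, shifted to 1..n."""
    return [[(k * i + j) % n + 1 for j in range(n)] for i in range(n)]

def is_latin(square):
    """Every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols
                  for j in range(n))
    return rows_ok and cols_ok

def orthogonal(a, b):
    """Superimposing a and b yields every ordered pair exactly once."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

squares = {k: latin_square(k) for k in (1, 2, 3, 4)}
for row in squares[2]:
    print(*row)
print(all(is_latin(s) for s in squares.values()))
print(all(orthogonal(squares[a], squares[b])
          for a, b in combinations(squares, 2)))
```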
                       7.   Introduce a third factor, such as four types of transmission fluid or four types of tires.

Section 17.4—p. 824
                                         Field       Number of Points   Number of Lines   Number of Points on a Line   Number of Lines on a Point

                                         GF(5)              25                 30                      5                            6
                                         GF(3^2)            81                 90                      9                           10
                                         GF(7)              49                 56                      7                            8
                                         GF(2^4)           256                272                     16                           17
                                         GF(31)            961                992                     31                           32

3. There are nine points and twelve lines. These lines fall into four parallel classes.
                                   (i) Slope of 0: y = 0; y = 1; y = 2
                                   (ii) Infinite slope: x = 0; x = 1; x = 2
                                   (iii) Slope 1: y = x; y = x + 1; y = x + 2
                                   (iv) Slope 2 (as shown in the figure): (1) y = 2x (2) y = 2x + 1 (3) y = 2x + 2

                                   [Figure: the nine points of the plane, with (0, 0) and (1, 0) labeled, and the lines of slope 2.]

                                   The Latin square corresponding to the fourth parallel class is

                                        3 1 2
                                        2 3 1
                                        1 2 3

                             . a) y = 4x + 1      b) y = 3x + 10 or 2x + 3y + 3 = 0
                               c) y = 10x or 10y = 11x
                             . a) Vertical line: x = c. The line y = mx + b intersects this vertical line at the unique point
                               (c, mc + b). As b takes on the values of F, there are no two column entries (on the line x = c)
                               that are the same.
                                   Horizontal line: y = c. The line y = mx + b intersects this horizontal line at the unique
                               point (m^-1(c - b), c). As b takes on the values of F, no two row entries (on the line y = c) are
                               the same.

Section 17.5-p. 829

12            3   4           13     5    7               23    67
                                                                                                                  245        7         3   4   5   6

                             . a) No    b) No
                             . a) λ(v - 1) = r(k - 1) = 2r ⇒ λ(v - 1) is even.
                                  λv(v - 1) = vr(k - 1) = bk(k - 1) = b(3)(2) ⇒ 6 | λv(v - 1)
                               b) (λ = 1) 6 | λv(v - 1) ⇒ 6 | v(v - 1) ⇒ 3 | v(v - 1) ⇒ 3 | v or 3 | (v - 1)
                                  λ(v - 1) even ⇒ (v - 1) even ⇒ v odd
                                  3 | v ⇒ v = 3t, t odd ⇒ v = 3(2s + 1) = 6s + 3 and v ≡ 3 (mod 6)
                                  3 | (v - 1) ⇒ v - 1 = 3t, t even ⇒ v - 1 = 6x ⇒ v = 6x + 1 and v ≡ 1 (mod 6)
                             . v = 9, r = 4          11. a) b = 21     b) r = 7

                               13. There are λ blocks that contain both x and y. And since r is the replication number of the
                                   design, it follows that r - λ blocks contain x, but not y. Likewise there are r - λ blocks
                                   containing y but not x. Consequently, the number of blocks in the design that contain x or y is
                                   (r - λ) + (r - λ) + λ = 2r - λ.
                               15. a) 31      b) 8
                               17. a) v = b = 31; r = k = 6; λ = 1      b) v = b = 57; r = k = 8; λ = 1
                                   c) v = b = 73; r = k = 9; λ = 1

Supplementary
Exercises —p. 832                    ~n=9        3. a) 31.)    =b) 30sec) 29°     de) K = 1000
                                     . For all a ∈ Z_p, a^p = a [see part (a) of Exercise 13 at the end of Section 16.3], so a is a root of
                                       x^p - x, and x - a is a factor of x^p - x. Since (Z_p, +, ·) is a field, the polynomial x^p - x can
                                       have at most p roots. Therefore x^p - x = Π_{a ∈ Z_p} (x - a).
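
The factorization can be confirmed for small primes by expanding the product of the linear factors modulo p. This is a sketch; polynomials are coefficient lists, constant term first, and the function names are illustrative.

```python
def product_of_linear_factors(p):
    """Expand the product of (x - a) over all a in Z_p, coefficients mod p."""
    poly = [1]
    for a in range(p):
        expanded = [0] * (len(poly) + 1)
        for i, c in enumerate(poly):
            expanded[i] = (expanded[i] - a * c) % p   # contribution of -a
            expanded[i + 1] = (expanded[i + 1] + c) % p  # contribution of x
        poly = expanded
    return poly

def x_to_the_p_minus_x(p):
    """Coefficients of x^p - x in Z_p[x]."""
    coefficients = [0] * (p + 1)
    coefficients[1] = (p - 1) % p   # -1 written as a residue mod p
    coefficients[p] = 1
    return coefficients

for p in (2, 3, 5, 7, 11):
    print(p, product_of_linear_factors(p) == x_to_the_p_minus_x(p))
```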
                                     . {1, 2, 4}, {2, 3, 5}, {4, 5, 7}     9. a) 9    b) 91
                                     . b) A·J_b is a v × b matrix whose (i, j)th entry is r, since there are r 1's in each row of A and
                                       every entry in J_b is 1. Hence A·J_b = rJ_(v×b). Likewise, J_v·A is a v × b matrix whose (i, j)th
                                       entry is k, because there are k 1's in each column of A and every entry in J_v is 1. Hence
                                       J_v·A = kJ_(v×b).
                                       c) The (i, j)th entry in A·A^tr is obtained from the componentwise multiplication of rows i
                                       and j of A. If i = j, this results in the number of 1's in row i, which is r. For i ≠ j, the number
                                       of 1's is the number of times x_i and x_j appear in the same block, which is given by λ. Hence
                                       A·A^tr = (r - λ)I_v + λJ_v.
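    d)
    \[
    \det(A \cdot A^{\mathrm{tr}}) =
    \begin{vmatrix}
    r & \lambda & \lambda & \cdots & \lambda \\
    \lambda & r & \lambda & \cdots & \lambda \\
    \lambda & \lambda & r & \cdots & \lambda \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    \lambda & \lambda & \lambda & \cdots & r
    \end{vmatrix}
    \overset{(1)}{=}
    \begin{vmatrix}
    r & \lambda - r & \lambda - r & \cdots & \lambda - r \\
    \lambda & r - \lambda & 0 & \cdots & 0 \\
    \lambda & 0 & r - \lambda & \cdots & 0 \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    \lambda & 0 & 0 & \cdots & r - \lambda
    \end{vmatrix}
    \overset{(2)}{=}
    \begin{vmatrix}
    r + (v - 1)\lambda & 0 & 0 & \cdots & 0 \\
    \lambda & r - \lambda & 0 & \cdots & 0 \\
    \lambda & 0 & r - \lambda & \cdots & 0 \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    \lambda & 0 & 0 & \cdots & r - \lambda
    \end{vmatrix}
    \]
    \[
    = [r + (v - 1)\lambda](r - \lambda)^{v-1} = (r - \lambda)^{v-1}[r + r(k - 1)] = rk(r - \lambda)^{v-1}
    \]
    Key:  (1) Multiply column 1 by $-1$ and add it to the other $v - 1$ columns.
          (2) Add rows 2 through v to row 1.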
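The closing formula of part (d), that the determinant equals $rk(r - \lambda)^{v-1}$, can be checked numerically on one design. The sketch below again assumes the Fano plane (v = 7, r = k = 3, λ = 1) as a hypothetical test case and uses exact rational arithmetic; `det` is an illustrative helper, not a library routine.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((row for row in range(col, n) if m[row][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap changes the sign
        result *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n):
                m[row][c] -= factor * m[col][c]
    return result


# Assumed test case: incidence matrix of the Fano plane (rows = points, columns = blocks).
blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7},
          {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]
v, r, k, lam = 7, 3, 3, 1
A = [[1 if point in blk else 0 for blk in blocks] for point in range(1, v + 1)]
AAt = [[sum(A[i][t] * A[j][t] for t in range(len(blocks))) for j in range(v)]
       for i in range(v)]

print(det(AAt) == r * k * (r - lam) ** (v - 1))  # True
```

For this design the product matrix has 3 on the diagonal and 1 elsewhere, and both sides equal 9 · 2^6 = 576.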

Appendix 1
            Exponential and Logarithmic Functions
p. A-9
 1. a) $\sqrt{xy^3} = x^{1/2}y^{3/2}$     b) $\sqrt[4]{81x^{-5}y^3} = 3x^{-5/4}y^{3/4} = \dfrac{3y^{3/4}}{x^{5/4}}$
    c) $5\sqrt[3]{8x^9y^{-5}} = 5(8^{1/3}x^{9/3}y^{-5/3}) = 5(2x^3y^{-5/3}) = \dfrac{10x^3}{y^{5/3}}$

 3. a) 625   b) 1/343   c) 10
 5. a) $\log_2 128 = 7$   b) $\log_{125} 5 = 1/3$   c) $\log_{10} 1/10{,}000 = -4$   d) $\log_2 b = a$
 7. a) 3   c) 3
 9. a) Proof (By Mathematical Induction):
    For n = 1 the statement is $\log_b r^1 = 1 \cdot \log_b r$, so the result is true for this first case. Assuming
    the result for n = k ($\geq 1$) we have $\log_b r^k = k \log_b r$. Now for the case where n = k + 1 we
    find that $\log_b r^{k+1} = \log_b(r \cdot r^k) = \log_b r + \log_b r^k$ [by part (1) of Theorem A1.2]
    $= \log_b r + k \log_b r$ (by the induction hypothesis) $= (1 + k)\log_b r = (k + 1)\log_b r$. Therefore,
    the result follows for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical Induction.
    b) For all $n \in \mathbf{Z}^+$, $\log_b r^{-n} = \log_b (1/r^n) = \log_b 1 - \log_b r^n$ [by part (2) of Theorem A1.2] $=
    0 - n \log_b r$ [by part (a)] $= (-n)\log_b r$.
11. a) 1.5851   b) 0.4307   c) 1.4650
13. a) 5/3   b) 3/2   c) 4
15. Let $x = a^{\log_b c}$ and $y = c^{\log_b a}$. Then

        $x = a^{\log_b c} \Rightarrow \log_b x = \log_b [a^{\log_b c}] = (\log_b c)(\log_b a)$, and
        $y = c^{\log_b a} \Rightarrow \log_b y = \log_b [c^{\log_b a}] = (\log_b a)(\log_b c)$.

    Consequently, we find that $\log_b x = \log_b y$, from which it follows that x = y.
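The identity proved in Exercise 15 is easy to spot-check in floating point. The values of a, b, and c below are arbitrary choices for illustration, and the comparison uses a tolerance because the arithmetic is only approximate.

```python
import math


def both_sides(a, b, c):
    """Return (a ** log_b(c), c ** log_b(a)) for positive a, c and base b != 1."""
    x = a ** math.log(c, b)
    y = c ** math.log(a, b)
    return x, y


# Arbitrary sample values; any positive a, c and base b > 0, b != 1 should work.
for a, b, c in [(2.0, 10.0, 7.0), (3.5, 2.0, 0.25), (9.0, 3.0, 5.0)]:
    x, y = both_sides(a, b, c)
    assert math.isclose(x, y, rel_tol=1e-9)
```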

Appendix 2
          Matrices, Matrix Operations, and Determinants
p. A-21
 1. a) A + B   b) (A + B) + C   c) B + C   d) A + (B + C)   e) 2A   f) 2A + 3B
    g) 2C + 3C   h) 5C   i) 2B − 4C   j) A + 2B − 3C   k) 2(3B)   l) (2 · 3)B
    [matrix entries for parts (a) through (l) are not recoverable from the source]

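 3. a) [12], or 12   b), c) [matrices not recoverable from the source]

    d) $\begin{bmatrix} -5 & -7 & 8 \\ 29 & 21 & 2 \\ -23 & -35 & 6 \end{bmatrix}$
    e) $\begin{bmatrix} a & b & c \\ d & e & f \\ 3g & 3h & 3i \end{bmatrix}$
    f) $\begin{bmatrix} a & b & c \\ 3g & 3h & 3i \\ d & e & f \end{bmatrix}$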

 5. a) (−1/5)[matrix not recoverable]   b) [matrix not recoverable]   c) The inverse does not exist.   d) [matrix not recoverable]
 7. a) $A^{-1}$   b) $B^{-1}$   c) $AB$   d) $(AB)^{-1}$   e) $B^{-1}A^{-1}$
    [matrix entries not recoverable from the source]
 9. [solution not recoverable from the source]

11. $\det(2A) = 2^2(31) = 124$, $\det(5A) = 5^2(31) = 775$
13. a) 45   b) −40   c) 14
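Exercise 11 relies on the fact that multiplying every entry of an n × n matrix by c multiplies its determinant by c^n (here n = 2). The matrix A below is a hypothetical stand-in with determinant 31, chosen only for illustration; it is not the matrix from the exercise.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c


def scale(c, m):
    """Multiply every entry of the matrix m by the scalar c."""
    return [[c * x for x in row] for row in m]


# Hypothetical 2 x 2 matrix with det(A) = 5*5 - 2*(-3) = 31.
A = [[5, 2], [-3, 5]]

print(det2(A))            # 31
print(det2(scale(2, A)))  # 124 = 2**2 * 31
print(det2(scale(5, A)))  # 775 = 5**2 * 31
```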

15. a) (i) Expanding by cofactors gives $2(-2 - (-1)) - 3(-1) = 2(-1) + 3 = 1$.
       (ii) 5   (iii) 25
                               b) (i) 51            (ii) 306        ~—(iii).:«4S10

Appendix 3
          Countable and Uncountable Sets

p. A-32
                          1. a) True      b) False    c) True      d) True
                             e) False: Let $A = \mathbf{Z} \cup (0, 1]$ and $B = \mathbf{Z} \cup (1, 2]$. Then A, B are both uncountable, but
                             $A \cap B = \mathbf{Z}$ is countable.
                             f) True
                             g) False: Let A = Z* U (0, 1] and B = (0, 1]. Then A, B are both uncountable, but
                             A — B = {2, 3, 4, ...} is countable.
                          3. If B were countable, then by Theorem A3.3 it would follow that A is countable. This leads us to
                             a contradiction since we are given that A is uncountable.
                          5. Since S, T are countably infinite, we know from Theorem A3.2 that we can write
                             $S = \{s_1, s_2, s_3, \ldots\}$ and $T = \{t_1, t_2, t_3, \ldots\}$, two (infinite) sequences of distinct terms. Define
                             the function

                                 $f: S \times T \to \mathbf{Z}^+$

                             by $f(s_i, t_j) = 2^i 3^j$, for all $i, j \in \mathbf{Z}^+$. If $i, j, k, \ell \in \mathbf{Z}^+$ with $f(s_i, t_j) = f(s_k, t_\ell)$, then
                             $f(s_i, t_j) = f(s_k, t_\ell) \Rightarrow 2^i 3^j = 2^k 3^\ell \Rightarrow i = k, j = \ell$ (by the Fundamental Theorem of
                             Arithmetic) $\Rightarrow s_i = s_k$ and $t_j = t_\ell \Rightarrow (s_i, t_j) = (s_k, t_\ell)$. Therefore, f is a one-to-one function
                             and $S \times T \sim f(S \times T) \subseteq \mathbf{Z}^+$. So from Theorem A3.3 we know that $S \times T$ is countable.
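The one-to-one map in Exercise 5 can be illustrated on index pairs: the pair (i, j) is sent to 2^i 3^j, and unique factorization lets the pair be read back. This is a finite check over a sample range, not a proof, and `decode` is a helper made up for this example.

```python
def f(i, j):
    """Encode the index pair (i, j) as the positive integer 2**i * 3**j."""
    return 2 ** i * 3 ** j


def decode(n):
    """Recover (i, j) from n = 2**i * 3**j by dividing out the factors of 2 and 3."""
    i = j = 0
    while n % 2 == 0:
        n //= 2
        i += 1
    while n % 3 == 0:
        n //= 3
        j += 1
    return i, j


pairs = [(i, j) for i in range(1, 30) for j in range(1, 30)]
images = {f(i, j) for i, j in pairs}

assert len(images) == len(pairs)                          # no two pairs collide
assert all(decode(f(i, j)) == (i, j) for i, j in pairs)   # the encoding is reversible
```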
                          7. The function $f: (\mathbf{Z} - \{0\}) \times \mathbf{Z} \times \mathbf{Z} \to \mathbf{Q}$ given by $f(a, b, c) = 2^a 3^b 5^c$ is one-to-one (Verify
                             this!). So by Theorems A3.3 and A3.8, $(\mathbf{Z} - \{0\}) \times \mathbf{Z} \times \mathbf{Z}$ is countable. Now for all
                             $(a, b, c) \in (\mathbf{Z} - \{0\}) \times \mathbf{Z} \times \mathbf{Z}$ there are at most two (distinct) real solutions for the quadratic
                             equation $ax^2 + bx + c = 0$. From Theorem A3.9 it then follows that the set of all real solutions
                             of the quadratic equations $ax^2 + bx + c = 0$, where $a, b, c \in \mathbf{Z}$ and $a \neq 0$, is countable.
                             Index

A, 138                                     Adjacency list, 379                         encoding function, 763, 764, 767, 769,
|A|, 124                                   Adjacency list representation, 378, 379        771, 773
A⁰, Aⁿ, A⁺, A*, 315                        Adjacency matrix (for a graph), 352, 539,   equivalent codes, 778
A ~ B, A-23                                  600                                       error, 762
a ≡ b (mod n), 686                         Adjacency of a pair of vertices, 352        error correction, 767-769
a-z cut, 645                               Adjacent from, 349, 514                     error detection, 767-769
a is congruent to b modulo n, 686          Adjacent mark ordering algorithm, 453,      error pattern, 762, 763, 771, 779
Abel, Niels Henrik, 705, 745, 794, 830        506                                      five-times repetition code, 765, 769
Abelian group, 161, 745, 746, 799          Adjacent to, 349, 514                       generator matrix, 769, 771, 772, 774,
Absolute value, 219, 224                   Adjacent vertices, 349                         777
Absorption Laws,                           Adleman, Leonard, 759                       Gilbert bound, 773
   for a Boolean algebra, 735              Affine cipher, 691, 692, 759                Golay, Marcel, 761, 795, 796
   for Boolean functions, 713              Affine plane, 820-822, 826-828, 831         group code, 773, 774, 776, 777
   for Boolean variables, 713              Aggregate, 123                              Hamming, Richard, 761, 766, 795, 796
   for logic, 59                           Aho, Alfred V., 378, 506, 507, 574, 575,    Hamming bound, 773
   for set theory, 139                        623, 624, 642, 667, 668, 708             Hamming code, 778
Abstract algebra, 394, 624, 742            Ahuja, Ravendra K., 562, 575, 637, 643,     Hamming matrix, 778
Access function, 254                          654, 668                                 Hamming metric, 767
Achilles, 119                              Albert, A. Adrian, 831                      independent events, 762
Ackermann, Wilhelm, 259                    Aleph, 303                                  (m + 1, m) parity-check code, 764,
Ackermann’s function, 259                  ℵ₀ (aleph null), 303, A-30, A-31               765
Acronym, 155                               Algebra, 123, 242                           majority rule, 765
Aczel, Amir D., 706, 708                   Algebra of logic, 742                       message, 763, 769, 777, 778
Addition, 136, 137                         Algebra of propositions, 55, 57, 58; see    minimum distance between code
Addition of binary numbers, 720              also Laws of Logic                          words, 767-769, 771, 773, 774
Addition of equivalence classes            Algebra of switching circuits, 742          minimum weight of nonzero code
   of integers (in Z,,), 687               Algebra of switching functions, 711           words, 774
   of polynomials, 809                     Algebraic coding theory, 18, 761-779,       mixed strategy, 768
Addition of matrices, A-12                    795, 796                                 multiple errors, 763
Addition of polynomials, 800                  binary representations, 778, 779         nearest neighbor, 771
Additions, 636, 637                          binary symmetric channel, 762, 763        (n,m) block code, 764
Additive identity                            block code, 764                           noise, 761
   for matrices, A-13                        code word, 763, 769, 771, 772, 774,       parity-check code, 764, 765
   for real numbers, 103                        776-778                                parity-check equations, 770, 777
  for a ring, 674                            coding schemes, 763                       parity-check matrix, 772, 774,
Additive inverse                             coset leader, 775-777                        776-779
   for matrices, A-13                        d(x, y), 766                              probability, 761-765
  for integers, 278                          decoding, 763                             rate of a code, 764, 778
  for real numbers, 103                      decoding algorithm, 772                   received word, 762, 763, 777
  for a ring element, 674, 679, 680, 701     decoding by coset leaders, 776            retransmission, 765, 769
Additive Rule, 162, 168, 172                 decoding function, 764, 767               Shannon, Claude Elwood, 761, 795,
Address                                      decoding scheme, 769                         797
   class A address, 12                       decoding table, 774, 775                  sphere (S(x, k)), 767
   class B address, 12                       decoding table with syndromes, 776        S(x, k), 767
   class C address, 12                       distance, 766                             syndrome, 771, 775-777, 779
  in computer memory, 5, 694                 distance function, 766, 767               systematic form, 778
  in a universal address system, 589         dual code, 773                            transmission error, 762, 767
  internet address, 12                       efficiency of a coding scheme, 764        triangle inequality, 767
  local address, 12                          encoding, 763                             triple repetition code, 765, 768, 769

weight, 766                                 Alternating sequence, 650                    Associated homogeneous relation,
  wt (x), 766                                 Alternating triple, 135                         471-473, 479, 480
Algebraic expression, 590                     Alternative form of the Principle of         Associated minor, A-20
Algebraic formulae, 623                         Mathematical Induction, 206-208,           Associated undirected graph, 350, 353,
Algebraic structures, 745, 761                  217, 238, 298, 458, 503, 582, 583             517, 645, 650
Algebraic substitution, 449                   American Journal of Mathematics, 411         Associative binary operation, 268
Algorism, 242                                 American National Standards Institute,       Associative closed binary operation, 311
Algorithm, 41, 42, 233, 242-244, 289,            125                                       Associative law
   290, 294, 295, 297, 299-301, 349, 378,     Analysis, 444                                   of addition for integers, 113
   442, 599, 605, 613, 615, 619-621,          Analysis of algorithms, 3, 247, 259, 292,       of addition for real numbers, 97
   624, 632, 633, 636-642, 649, 653              294-300, 304, 305, 453, 473, 503,            of multiplication for integers, 221
Algorithms                                       A-1, A-6                                     of multiplication for matrices, A-16
   adjacent mark ordering, 458, 506           Analytic Theory of Probability, 150, 188        of multiplication for polynomials, 801
  articulation points, 619, 620               Analytical engine, 242                       Associative laws
  biconnected components, 619, 620            Analytische Zahlentheorie, 304                  for a Boolean algebra, 736
  binary search, 501-503                      Ancestor, 588, 616-619                          for Boolean functions, 713
  breadth-first search, 598, 599              And, 48, 50                                     for Boolean variables, 713
  bubble sort, 450                            AND gate, 149, 719, 720                         for logic, 58
  decoding, 772                               Annals of Mathematics, 706                      for a ring, 673, 746
  depth-first search, 597, 598, 617           ANSI FORTRAN, 125                               for set theory, 139
  Dijkstra’s shortest-path, 633, 634, 667,    Antichain, 381                               Associative property
     668                                      Antisymmetric property (of a relation),        for composition of relations, 345
  divide-and-conquer, 496-503                    340, 341, 347, 348, 353, 357, 358,          for function composition, 281, 282,
  Edmonds-Karp algorithm, 653-657,               376, 377                                       345, 750
     663                                      Anton, Howard, A-21                            in a group, 745, 794
  Euclidean algorithm for integers, 232,      AP(F), 822, 824, 826-828                     Associativity for Cartesian products, 248
     233                                      Apianus, Petrus, 188                         Atkins, Derek, 795
  Euclidean algorithm for polynomials,        Appel, Kenneth, 565, 573, 575                Atkins, Joel E., 623, 624
     808                                      Application specific integrated circuit,     Atom of a Boolean algebra, 738-740, 743
  exponentiation, 297-299                        149                                       AT&T Bell Laboratories, 188
  Fibonacci numbers, 477, 478                 Applied Boolean algebra, 742                 Augarten, Stan, 243, 244
  Ford-Fulkerson algorithm, 654-657,          Approximately equal (=), 7                   Auluck, F. C., 463, 507
     663                                      Approximation theory, 304                    Automata theory, 333
  generating permutations, 453, 506           Arbitrary, 110                               Automated reasoning, 119
  greatest common divisor, 232, 233           Arc, 321, 329, 349, 514                      Auxiliary variables, 461
  greatest common divisor (recursive),        Argue by the converse, 74, 82, 109, 547      Average-case complexity, 295, 296
     455                                      Argue by the inverse, 75, 82, 110            Axiomatic approach to probability, 188
  Huffman tree, 613                           Argument, 47, 53, 67, 72, 74, 75, 107,       Axioms of probability, 159, 161
   Kruskal’s algorithm, 639-641                   108, 112
  linear search, 296, 302                     Aristotle, 117, 118, 238                     b_n, the n-th Catalan number, 38, 490
  maximum value, 301                          Arithmetic expression, 460                   Baase, Sara, 305, 624, 625, 641, 642,
   merge sort, 496, 608                       Arithmetic of remainders, 234                   667, 668
   merging two sorted lists, 607              Arithmetica, 243                             Babbage, Charles, 242, 243
  minimization process for a finite state     Arithmetica Integra, 42                      Bachmann, Paul Gustav Heinrich, 304
     machine, 372-373                         Arithmeticorum Libri Duo, 244                Back edge (of a tree), 616-619, 621
  nonisomorphic trees on n labeled            Arrangement, 6-10, 15-18, 24, 26-28,         Backtrack(ing), 331, 593, 596-598, 600,
     vertices, 586, 587                          34, 36-39, 41, 149, 155, 160, 266,           616, 653, 656
  polynomial evaluation, 301                     310, 395, 402, 406, 411, 436, 437, 439,   Backward edge, 650, 651, 654
  Prim’s algorithm, 641-643, 668                 462, 463, 524, 525, 559; see also         Balanced complete binary tree, 605, 606
  Prüfer code for a labeled tree, 586, 587       Permutation                               Balanced incomplete block design, 825,
  searching an array, 295, 296                Arrangements with forbidden positions,          826
  topological sorting algorithm, 360, 361        406-410                                   Balanced (rooted) tree, 601, 602
  universal address system, 589               Arrangements (with repetition), 7, 26, 27    Ball, M. O., 562, 575, 576
Algorithmic manner, 631                       Array, 91, 450, 501-503                      Ballot Problem, 45
Alfa, 226                                     Ars Conjectandi, 41                          Bare roundhouse, 192
Al-jabr, 242                                  Articulation point, 615-621, 624             Barnette, David, 575
Alkane, 584                                   Articulation point algorithm, 619, 620       Barnier, William J., 333, 334
Al-Khowârizmî, Abu Ja'far Mohammed            Ascending order, 450, 606                    Barr, Thomas H., 693, 708, 795, 796
   ibn Mûsâ, 242                              Ascent (in a permutation), 220               Barwise, Jon, 119, 120
Allowable choices, 87                         Aschbacher, Michael, 795                     Base (for a number system), 225
α, 457, 458, 469                              ASIC, 149                                    Base (for a recursive definition), 211-213
Alpha testing, 185                            Assembly language, 226                       Base (for exponentiation), A-1
Alphabet, 18, 309-311, 313, 315, 316,         Assignment problem, 659, 668                 Base 2, 225, 227, 608
   337, 338, 609, 610                         Assmus, E. F., Jr., 796                      Base 8, 225
Alphabetical ordering, 589                    Associated directed graph, 350               Base 10, 225, 226

Base 16, 226, 227                           Binary string, 128, 129, 188                   DeMorgan’s laws, 713
Base-changing formula for logarithms,       Binary symmetric channel, 762, 763; see        disjunctive normal form, 715
   A-7                                         also Algebraic coding theory                distributive laws, 713
Base step, 316                              Binary tree, 488, 595, 600                     d.n.f., 715-718, 721-724
Basic connectives, 47-53, 56                Binet, Jacques Philippe Marie, 457             dominance laws, 713
   and (conjunction), 48, 50                Binet form, 457                                don’t care conditions, 731-733
   but, 50                                  Binomial coefficient, 22, 23, 42, 133, 217     equality, 712
   exclusive or, 48                         Binomial distribution, 179                     exclusive or, 719, 720
   if...then (implication), 48              Binomial expansion, 30; see also               F,,719
   if and only if (bicondition), 48            Binomial theorem                            fundamental conjunction, 715-718,
   iff, 48                                  Binomial random variable, 179, 180,               721, 732, 738
   inclusive or (disjunction), 48               182, 183, 430                              fundamental disjunction, 717, 718
   nand, 56                                 Binomial theorem, 21-23, 42, 106, 130,         idempotent laws, 713
   negation (not), 48                           180, 188, 390, 421, 422, 436, 443          identity laws, 713
   nor, 56                                  Binomial theorem (generalized), 422,           incompletely specified, 731, 732
   or (disjunction), 48                        443                                         inverse laws, 713
Basis, 447                                  Bipartite graph, 541, 542, 558, 659, 660,      Karnaugh map, 722-727
Basis step, 195-197, 199, 201, 202,            662-665                                     law of the double complement, 713
   204-208, 212-214, 218, 317               Birkhoff, Garrett, 377                         literal, 715, 716, 722-726
Bayes, Thomas, 188, 189                     Birkhoff-von Neumann theorem, 670              maxterm, 717, 718, 727
Bayes’ Theorem, 170, 173, 188               Bit(s), 5, 225, 324, 610, 720, 742             minimal product of sums
Bayes’ Theorem (Extended Version), 173      Blank (space), 311                                representation, 727
Beckenbach, Edwin F., 796                   Bletchley Park, 333                            minimal sum of products
Bell, Eric Temple, 508                      Blocher, Heidi, 708                                representation, 721, 722, 724, 725,
Bell numbers, 508                           Block                                             729-733
Bellman, R., 562, 575                          in a design, 825-827                        minterm, 716, 717, 732, 738
Bellmore, M., 562, 574, 575                    of a partition, 366                         product, 712
Berge, Claude, 573, 668                     Block code, 764; see also Algebraic            product of maxterms, 717, 718
Berger, Thomas R., 707, 708, 831, 832          coding theory                               Quine-McCluskey method, 727, 742
Bernays, Paul, 119                          Block designs, 825-829, 832                    row number, 716
Bernoulli, Jakob, 41, 42                    Bonaccio, 442                                   self-dual, 744
Bernoulli, Johann, 302                      Bond, James, 150                                sum, 712
Bernoulli trial, 161, 178, 179, 182, 430    Bondy, J. A., 573, 575, 668                    sum of minterms, 717
Bertrand, Joseph Louis François, 45         Bonferroni's Inequality, 191                    symmetric, 744
Best-case complexity, 295                   The Book of Creation (Sefer Yetzirah), 41    Boolean multiplication, 711
B (blank, space), 311                       Boole, George, 118, 119, 186, 188, 377,      Boolean ring, 709
β [= (1 − √5)/2], 457                          711, 742                                  Boolean sum, 737
β(G), 564, 666                              Boolean addition, 346, 711                   Boolean variable, 712, 713, 724, 729
Biconditional, 48, 51, 52, 56, 104, 105     Boolean algebra, 711, 714, 733-743,          Booth, Taylor L., 742, 743
Biconnected component, 615, 619-621,           799, 830                                  Borchardt, Carl Wilhelm, 622, 623
  624                                         atom, 738-740, 743                         Borůvka, Otakar, 667
Biconnected component algorithm, 619,         definition, 733                            Bose, Raj Chandra, 819, 831
  620                                         dual, 735                                  Bound, 292
Biconnected graph, 615                        Hasse diagram, 736-739                     Bound variable, 88, 98
Big-Oh notation, 290, 304                     isomorphism, 737, 739, 740                 Boundary condition(s), 448
Big-Omega notation, 293, 505                  linear combination of atoms, 738           Boundary of a region, 546-549
Big-Theta notation, 294, 505                  partial order, 737, 738                    Bounded above, 605, 608
Biggs, Norman L., 41, 42, 574, 575            principle of duality, 735                  Boyer, Carl Benjamin, 189
Bijective function, 279, 283                  properties, 735, 736                       Brahmagupta, 707
Binary compare, 727                           representation theorem, 738, 739, 743      Braille system, 24
Binary digits (bits), 5, 720                Boolean algebra of sets, 740, 743            Branch node, 588
Binary heap, 637                            Boolean complement, 711                      Branches (of a tree), 154, 249, 331, 488,
Binary label, 532, 716-718, 742             Boolean expression, 720                         614
Binary number system, 225, 226              Boolean function, 711-727, 729-733,          Bravo, 226
Binary numbers, 323, 770                      738, 742, 744, 796                         Breadth-first search, 598-600, 624, 653
Binary operation, 136, 193, 211, 267-269,     absorption laws, 713                       Breadth-first search algorithm, 598, 599
   460, 589, 591, 673, 674, 686, 745          associative laws, 713                      Breadth-first spanning tree, 599, 656
  associative, 268                            binary label, 716-718                      Bridge, 550
  commutative, 268                            Boolean function for a prescribed          Bridges of Königsberg, 513, 518,
Binary relation, 250, 337; see also               table, 714, 715                          533-535, 573
  Relation                                    c.n.f., 717, 718                           Brookshear, J. Glenn, 333, 334
Binary representation, 229, 693, 778, 779     commutative laws, 713                      Brualdi, Richard A., 506, 507
Binary rooted tree, 589, 590, 594, 595        complement, 712                            Bubble sort, 450-452, 455, 605, 606, 609
Binary search algorithm, 501-503              conjunctive normal form, 717               Buckley, Fred, 573, 575
Binary sequence, 461, 462, 610, 611           definition, 712                            Burnside’s Theorem, 783-785, 796

Busacker, Robert G., 668                   Chain (poset), 381                          Cocycle, 564
Bussey, W. H., 244, 831                    Chain (transport network), 650              Code, 129, 610
But, 50                                    Chain of subgroups, 830                     Code word, 763, 769, 771, 772, 774,
Butane, 584                                Change in state, 319                           776-778; see also Algebraic coding
Bye, 602                                   Change of base, 225-230                        theory
Byron, Augusta Ada, 242, 243               Char(R), 812                                Coding schemes, 128, 610, 763; see also
Byron, Lord, 242                           Characteristic equation, 456                  Algebraic coding theory
Byte, 5, 225                               Characteristic function, 307                Coding theory, 3, 41, 161, 324, 574, 575,
                                           Characteristic of a ring, 812                 581, 609, 745, 761, 831; see also
c (continuum), A-30, A-31                  Characteristic roots, 456, 468                Algebraic coding theory
c(e), 644                                  Characteristic sequence, 625                Coding Theory—prefix codes, 575-579
c(P, P̄), 646                               Charlie, 226                                Codomain, 175, 253, 279, 281, 287, 323,
C, C*, 134                                 Chartrand, Gary, 573-575                       702
C(n, r), 15, 41, 436                       Chebyshev, Pafnuty Lvovich, 188             Cohen, Daniel I. A., 42, 304, 305
C++, 4, 13, 369                            Chebyshev’s Inequality, 183, 184, 188       Collection, 123, 135, A-29
C++ compiler, 253, 369                     Chemical isomers, 622, 796; see also        Collinear, 820, 822, 827
Caesar, Gaius Julius, 690                     Isomers                                  Collision, 694, 708
Caesar cipher, 690, 696                    Chemistry, 574, 584                         Collison, Mary Joan, 244
Calculational techniques for generating    Chess, 404                                  Color-critical graph, 573, 622
   functions, 418-431                      Chessboard, 121, 208, 209, 404-409,         Coloring, 551
Calculus, 99, A-3, A-6                        458, 464, 470, 510                       Column major implementation, 259
The Calculus of Inference, Necessary       χ(G), 565, 621                              Column matrix, A-11
  and Probable, 118                        Child, 588, 590, 594, 598, 617-620          Column vector, A-11
Call, Gregory S., 304, 305                 Children, 589, 594, 595, 607, 613,          Comb graph, 577
Cambridge University, 705                     617-621                                  Combinational circuit, 309
Campbell, Douglas M., 507                  Chinese Remainder Theorem, 702-704,         Combinations, 14-17, 21, 26, 41, 42,
Cancel, 221                                   707, 708                                   411, 436, 453, 506
Cancellation law of multiplication, 678,   Choice and Chance, 411                      Combinations with repetition, 26-29, 41
   681                                     Chromatic number, 413, 565, 615, 621        Combinatorial analysis, 796
Cancellation laws                          Chromatic polynomial, 413, 564-571,         Combinatorial approach, 132
   for a Boolean algebra, 736                 574                                      Combinatorial argument, 385
   for a group, 747                        Chu Shi-kie, 188                            Combinatorial designs, 707, 799, 815,
  of addition (in a ring), 680             Chvátal, V., 573                               820-832
Cantor, Georg, 135, 186-188, 303, 304,     Cipher machine, 333                           affine plane, 820-822, 826-828, 831
   A-28                                    Cipher shift, 690, 759                        balanced incomplete block design,
Cantor’s diagonal method, 303, A-28        Ciphertext, 690-692, 760                         825, 826
Capacity, 644                              Circuit, 516, 528, 533, 534, 551              block designs, 825-829, 832
Capacity for a vertex, 657                 Circular arrangements, 10, 395, 784           finite geometry, 799, 820, 822, 825,
Capacity of a cut, 646, 665                Circular disks, 472, 473                           830, 831
Capacity of an edge, 631, 644, 645, 650,   Circular tables, 266                           Latin squares, 799, 815-820, 822-824,
   654, 657, 661, 663                      Clairaut, Alexis, 303                              831
Carbon atom, 583, 584, 792                 Clark, Dean S., 305                            projective plane, 827, 828
Cardinal number, A-31                      Class, 123, 780, 782                           (v, b, r, k, λ)-design, 825, 826, 831
Cardinality (of a set), 124, 186, A-23,    Class A address, 12                         Combinatorial identity, 30, 131, 188, 288
   A-27                                    Class B address, 12                         Combinatorial mathematics, 385, 405
Carroll, Lewis, 119                        Class C address, 12                         Combinatorial proof, 10, 33, 47, 128,
Carry, 323, 324, 720, 721                  Class representative, 687                      259, 264, 388, 390
Cartesian product, 152-154, 248, 249,      Classification schemes, 667                 Combinatorics, 123, 575, 761
   251                                     Clauses, 86                                 Common divisor, 231
Case-by-case verification, 105             Clique, 578                                 Common multiple, 236
Castle, 404                                Clique number, 578                          Common ratio, 447
Catalan, Eugène Charles, 38, 490, 494      Closed, 136-138                             Commutative binary operation, 268, 270,
Catalan numbers, 36-39, 361, 490-493,      Closed binary operation, 136, 267, 268,        311
   506, 507, 586, 695, 696                    270, 278, 311-313, 673, 674, 686,        Commutative group, 745
Caterpillar, 627, 628                         697, 705, 711, 733, 745, 746, 800, 801   Commutative k-ary operation, 306
Cauchy, Augustin-Louis, 795, 796           Closed interval, 134                        Commutative law of addition for
Cayley, Arthur, 411, 565, 581, 622, 623,   Closed path, 351                              integers, 113
   794, 795, A-11                          Closed switch, 64, 551, 553                 Commutative law of addition for real
Ceiling function, 254, 496, 602, 623       Closed under a binary operation, 136,         numbers, 97
Cell (memory), 5                              193, 248, 356                            Commutative law of addition for a ring,
Cell (of a partition), 366, 367, 369,      Closed walk, 515, 516, 546, 549               673
   372-375                                 Closure for a group, 745, 774, 783          Commutative law of matrix addition,
Center of a group, 751                     c.n.f., 717, 718                              A-12
Center of a ring, 709                      Coalescing of vertices, 567, 569            Commutative law of multiplication for
Central Limit Theorem, 188                 Cobweb Theorem, 506                           real numbers, 97

Commutative laws                             Computer implementation, 667, 727, 742      Converse of a quantified implication,
  for a Boolean algebra, 734                 Computer network, 638                          92-94
  for Boolean functions, 713                 Computer program, 260, 309, 349, 350,       Convex polygon, 494
  for Boolean variables, 713                   597                                       Convolution (of sequences), 430, 431,
  for logic, 58                              Computer programming, 51, 574                  440, 488
  for set theory, 139                        Computer recognition of relation            Cooke, K. L., 562, 575
Commutative ring, 675, 700, 801                properties, 348                           Corleone, Don Vito, 186, 692
Commutative ring with unity, 677, 678,       Computer science, 32, 41, 51, 91, 119,      Corleone, Michael, 692
   681, 687, 743, 802, 810                      225, 244, 247, 250, 252, 253, 259,       Cormen, Thomas H., 504, 507, 624, 625,
Comparison of coefficients, 426                323, 324, 350, 377, 378, 460, 490,           638, 643, 654, 667, 668
Comparisons, 450, 452, 473, 474, 500,          574, 575, 589, 673, A-1, A-6              Corners of a Karnaugh map, 725, 726
  502, 503, 605-608, 636, 637, 641           Computer security, 222                      Corollary, 106
Compiler, 253, 290, 302, 605                 Computer simulation, 689                    Correspondence, 21, 26, 27, 30, 37, 39,
Complement (logic gate), 719                 Computer’s main memory, 5                      131, 205, 279
Complement in a Boolean algebra, 739         Concatenation of languages, 313-315         Coset, 757, 774-776, 795
Complement in a cut, 646                     Concatenation of strings, 311, 312          Coset leader, 775-777; see also
Complement of a Boolean function, 712        Conclusion, 48, 51, 53, 67, 70, 107, 109,     Algebraic coding theory
Complement of a graph, 523, 543                 111, 112                                 Countable set, 164, 303, A-24-A-32
Complement of a set, 138, 287                Concurrent processing, 350                  Countably infinite sample space, 177,
Complement of a subgraph in a graph,         Condition, 166                                 183, 428
   586                                       Conditional probability, 166-173            Countably infinite set, 164, 428, A-25,
Complementary (v, b, r, k, λ)-design,        Congruence, 377                               A-30
   833                                       Congruence modulo n, 689, 690               Counterexample, 83, 84, 89, 91, 94, 114,
Complete binary tree, 589, 595, 596, 600,    Congruence modulo p, 830                       115
   605, 606, 610, 611, 613                   Congruence modulo s(x), 808, 810, 830       Countess of Lovelace, 242, 243
Complete binary tree for a set of weights,   Congruence of triangles, 55                 Counties on a map of England, 565
  612                                        Conjugate of a complex number, 466          Counting, 3, 10
Complete bipartite graph, 541                Conjunction, 48, 53, 57, 70, 75             Counting formulas, 148
Complete directed graph, 559                 Conjunction (logic gate), 719               Coupled switches, 65
Complete graph (K_n), 352, 354, 480,         Conjunctive normal form (c.n.f.), 717,      Covalency, 825
  523, 531, 558, 569                           742                                       Covering of a graph, 577
Complete inventory, 786, 789                 Connected components, 352, 517              Covering number (of a graph), 577
Complete m-ary tree, 600-602                 Connected graph, 351, 488, 517              Cross product, 152, 154, 248, 250, 270,
Complete matching, 660-664                   Connectives, see Basic connectives             314
Complete ternary tree, 603                   Conservation condition, 645, 651            Cryptanalysis, 333
Complex conjugates, 465                      Conservation of flow, 649                   Cryptography, 244
Complex numbers, 134, 356, 465               Constant (of a polynomial), 799             Cryptology, 693, 708, 745
Complex roots, 464-467                       Constant coefficients, 448                  Cryptosystem, 690, 693
Complexity function, 295                     Constant Boolean function, 713              Cube, 547, 548, 791
Component flag, 641                          Constant function, 261                      Cubic equation, 794
Component statement, 49                      Constant order, 293                         Cubic order, 293
Components of a graph, 352, 353, 517,        Constant polynomial, 800                    Cubic time complexity, 293
  534, 546, 549, 567, 581, 585, 615,         Constant term, 799                          Cut (in a transport network), 645-648,
  640, 646                                   Constant time complexity, 293                  652, 661, 662
Composite function, 280, 281                 Constanzia, 186                             Cut-set, 549-551, 553, 624, 645
Composite integer, 222, 230                  Construction of                             Cycle detection, 641
Composite primary key, 272                      finite fields, 799                       Cycle in a graph, 351, 488, 516, 527,
Composite relation, 344                         a Huffman tree, 613                         532, 551-553, 556, 558, 581, 639-641
Composition of functions, 280, 282, 344,        Latin squares, 817, 818                  Cycle index, 787, 789
  A-9                                        Constructive proof, 223, 660, 665           Cycle structure representation, 786, 787,
Composition of relations, 344                Contacts, 551, 552                             789
Compositions of integers, 30-32, 130,        Contiguous, 462, 495                        Cyclic group, 753-756, 809, 812
  131, 205, 423-426, 448, 460                Continuous random variable, 175, 183        Czekanowski, Jan, 667
Compound statement, 48, 49, 52, 53, 61,      Continuous sample space, 164
  63, 71, 80                                 Continuum, A-30                             d_n, 402, 403, 410
Computational complexity, 289-293,           Contradiction, 53, 58, 76, 77, 80, 115      d(a, b), 632
  503, 575                                   Contrapositive, 62, 63, 92-94, 99, 115,     d(x, y), 766; see also Algebraic coding
Computer, 290, 309, 377, 605, 623, 631,         116, 362                                    theory
  694                                        Contrapositive method of proof, 76, 114,    Dantzig, G. B., 668, 669
Computer addition of binary numbers,            115                                      Data structures, 129, 247, 348, 349, 378,
  720                                        Control circuits, 309                          487, 490, 581, 592, 598, 605, 623,
Computer algebra system, 477, 485            Convergence, 419, 429                          637, 641, 694
Computer algorithm, 242, 243, 574            Converse of a relation, 282                 Databases, 8
Computer architecture, 531                   Converse of an implication, 62, 63, 82,     Datagram, 13
Computer hardware, 326                         99                                        Date, C. J., 305

Dauben, Joseph Warren, 304, 305              Descent (in a permutation), 220               Disjunction (logic gate), 719
David, Florence Nightingale, 189             Design of experiments, 815, 825, 831          Disjunctive normal form (d.n.f.), 715,
De Arte Combinatoria, 118                    Determinant, 411, 466, 467, A-17—A-21            742
DeBruijn, Nicolaas Govert, 796               Dfi(v), 616, 619-621                          Dispersion, 180
Decimal (base 10) representation, 459        Diagonal, 781                                 Distance (in a graph), 518, 626, 631
Decision structure, 51                       Dick, Auguste, 707, 708                       Distance function, 766, 767; see also
Decision tree, 602, 603                      Dickson, Leonard Eugene, 243, 244               Algebraic coding theory
Declarative sentence, 47, 86                 Dictionary order, 589                         Distinct real roots (for a recurrence
Decoding, 763; see also Algebraic            Dierckman, Jeffrey S., 623, 624                  relation), 456-464
  coding theory                              Difference equations, 447; see also           Distinguishing string, 374
Decoding algorithm, 772; see also               Recurrence relations                       Distributions, 29, 150, 263, 264, 304,
  Algebraic coding theory                    Differential equations, 447                      370, 403, 416, 444, 493
Decoding function, 767; see also             Digital computer, 309, 320, 581, 719          Distributive Law
  Algebraic coding theory                    Digital devices, 329, 332                       of matrix multiplication over matrix
Decoding table, 774, 775; see also           Digraph, 349, 352, 514; see also Directed          addition, A-21
   Algebraic coding theory                      graph                                        of multiplication over addition for
Decoding table with syndromes, 776; see      Dijkstra, Edsger Wybe, 632, 667, 669                integers, 221
   also Algebraic coding theory              Dijkstra’s Shortest-Path Algorithm,              of multiplication over addition for real
Decoding with coset leaders, 776; see           631-638                                          numbers, 57
   also Algebraic coding theory              Dinitz, Jeffrey H., 831                          of scalar multiplication over matrix
Decomposition (of a permutation), 781        Diophantine equation, 235, 243                      addition, A-13
Decomposition theorem for chromatic          Diophantus (of Alexandria), 235, 243          Distributive Laws
   polynomials, 568                          Direct argument, 114                             for a Boolean algebra, 734
Decryption, 690-693                          Direct product of cyclic groups of prime         for Boolean functions, 713
Decryption function, 759                        power order, 795                              for Boolean variables, 713
Dedekind, Richard, 243, 303, 377, 706,       Direct product of groups, 751                    for logic, 58
   795                                       Direct proof, 114, 115                          for a ring, 799
Dedekind domain, 706                         Directed arrow, 514                             for set theory, 139
Deductive reasoning, 117                     Directed cycle, 351, 358, 516                 Divide-and-conquer algorithms,
Deficiency of a graph, 664                   Directed edge, 321, 349, 351, 514, 646,          496-503, 507, 606
Deficiency of a set of vertices, 664            650                                        Dividend, 223
Definition, 52, 87, 98, 103-105, 113         Directed Euler circuit, 535, 536              Divides (for integers), 221
Deg (R), 546                                 Directed Euler trail, 539                     Divides (for polynomials), 802
Deg(v), 530                                  Directed graph, 337, 344, 347, 349, 350,      Divides relation, 339, 737
Degree 0, 800                                   351, 353, 357, 377, 378, 488, 514,         Division algorithm
Degree of a polynomial, 799                     587, 631, 632, 644                            for integers, 221, 223, 225, 232, 236,
Degree of a region, 546                         arcs, 349, 514                                   254, 274, 276, 289, 686, 754, 756
Degree of a table, 271                          associated undirected graph, 350, 353,        for polynomials, 803-805, 808-810
Degree of a vertex, 530, 533                       517                                     Division method (for hashing), 694
Delays, 332, 722                                edges, 349, 514                            Divisor
Deletion, 490                                   loop, 349, 514                               for integers, 221, 223, 342, 361
Delong, Howard, 119, 120                        nodes, 349, 514                              for polynomials, 802
Delta, 226                                      strongly connected, 351, 539               Divisors of zero; see Proper divisors of
δ(G), 664, 665                                  vertices, 349, 514                             zero
DeMoivre, Abraham, 304, 411, 443, 505        Directed Hamilton path, 559                   d.n.f., 715-718, 721-724
DeMoivre’s Theorem, 208, 464, 465            Directed path, 353, 516, 588, 632, 633,       Doctrine of Chances, 411
DeMorgan, Augustus, 118, 186, 242,              646, 649, 650, 652, 653                    Dodecahedron, 548, 556, 573
   244, 565                                  Directed tree, 587                            Domain (of a function), 175, 253, 257,
DeMorgan’s Laws                              Directed walk, 516                                270, 281, 287
   for a Boolean algebra, 736                Dirichlet, Peter Gustave Lejeune, 303,        Domain (of a relational data base), 271
   for Boolean functions, 713                   705                                        Dombowski, Peter, 831
   for Boolean variables, 713                Dirichlet drawer principle, 303; see also     Dominance (for functions), 290, 291
  for logic, 57, 58, 60-62                      Pigeonhole principle                       Dominance Laws
   for set theory, 139-141, 148, 149, 163,   Disconnected graph, 352, 517                    for a Boolean algebra, 735
      214                                    Discrete function, 448, 452, 486                for Boolean functions, 713
Denumerable set, 303, A-24                   Discrete probability, 189                       for Boolean variables, 713
Deo, Narsingh, 506, 508, 574, 576            Discrete random variable, 428, 430            Dominates (for functions), 290, 291
Depth-first index, 616, 619                  Discrete sample space, 164, 175               Dominates (on a set), 498
Depth-first search, 597, 598, 600, 617,      Disjoint collection of sets, A-29, A-30       Dominating set, 577, 730
   624                                       Disjoint cycles, 786                          Domination Laws
Depth-first search algorithm, 597, 598,      Disjoint events, 159, 169, 170, 172             for logic, 59
   617                                        Disjoint sets, 137, 148; see also Mutually     for set theory, 139
Depth-first spanning tree, 615-620               disjoint                                  Domination number of a graph, 577
Derangement, 402, 403, 410, 412               Disjoint subboards, 404, 405, 408, 409       Domino, 121, 195, 196, 470
Descendant, 588, 616-619                      Disjunction, 48, 56, 57                      Don’t care conditions, 731-733

Dornhoff, Larry L., 333, 334, 778, 796     England, 565                                Even, Shimon, 490, 507
Dorwart, Harold L., 831, 832               Enigma, 333                                 Even integer, 104, 105, 113
Double induction, 306                      Enumeration, 3, 9, 41, 186, 188, 385,       Even parity string, 332
Double negation, 58                           391, 394, 411, 415, 439, 622, 623, 673   Event, 151, 158, 159, 168, 171, 262
Doubly linked lists, 378                   Enumeration of nonisomorphic labeled           Bernoulli trial, 161, 178, 179, 182, 430
Doubly stochastic matrix, 670                trees, 586, 587                              elementary event, 158
Dual code, 773; see also Algebraic         Epp, Susanna S., 119, 120                   Evert, Christine Marie, 54
   coding theory                           Equal likelihood, 150, 151                  Eves, Howard, 119, 120, 304, 305
Dual graph, 549, 551                       Equality                                    Excel, 117
Dual network, 551-553                        of Boolean functions, 712                 Exclusive or, 48, 56, 416, 789
Dual of a statement, 59, 62, 140, 141,       of equivalence classes, 368               Exclusive or (⊕) for Boolean functions,
   713, 735                                  of functions, 279                            719, 720
Duality                                      of matrices, A-12                         EXCLUSIVE-OR gate, 728
   in a Boolean algebra, 713, 735            of polynomials, 799                       Execution speed, 290
  in logic, 59                                of real numbers, 55                      Exhaustion (Method of), 106
  in set theory, 140, 141                     of sets, 125, 143, 367                   Exhaustive, 457, 474
Dyck, Walther Franz Anton von, 794            of strings, 311                          Existence of an identity for a group, 745
                                           Equality relation, 342, 366, 377            Existence of an identity for a ring, 673
E,, Ex, E, 371                             Equilateral triangle, 475                   Existence of inverses in a group, 745
Em, 374                                    Equivalence class, 367, 368, 371, 377       Existence of inverses under + for a ring,
E(X), 177, 182, 183                        Equivalence problem, 378                       673
East Prussia, 533                          Equivalence relation, 337, 342, 343, 353,   Existential generalization, 117
Echo, 226                                     366-378, 686, 695, 735, 780, 782,        Existential quantifier (∃), 87, 88, 94, 96,
Economics, 506                               783, 808, 830                                98
Edge, 349, 514                               block, 366                                Existential specification, 117
Edge of minimal weight, 640                   cell, 366, 367, 369, 372-375             Expansion by minors, A-20
Edge set, 349, 514                            definition, 342                          Expectation, 177
Edge-disjoint paths, 658                      equivalence class, 367, 368, 371, 377    Expected value, 177, 179, 180
Edmonds, J., 653, 654, 669                    partition, 366-375, 377, 378             Experiment, 150-154, 157, 159, 162,
Edmonds-Karp algorithm, 653-657               Stirling numbers of the second kind,        163, 166, 167, 175, 178, 180, 183
Efficiency of a coding scheme, 764; see           370                                  Explicit formula, 210, 211
   also Algebraic coding theory            Equivalent codes, 778; see also             Explicit quantifier, 89, 90
Efficient procedure, 200                     Algebraic coding theory                   Exponent, A-1, A-2
Efficient tree, 611                        Equivalent finite state machines, 327       Exponential function, 402, A-1, A-5
Einstein, Albert, 707                      Equivalent open statements, 92              Exponential generating function,
Electric power network, 667                Equivalent states (s₁ E s₂), 338, 371          436-439, 443, 444, 474
Electric switch, 711                       Eratosthenes, 243                           Exponential order, 293
Electrical engineering, 324                Erdős, Paul, 276, 573, 574                  Exponential time complexity, 293
Electrical network, 551, 573, 574, 581,    Erlanger Programm, 795                      Exponentiation algorithm, 297-299
  622                                      Error correction (in a code), 767-769;      Extension of a function, 257
Electronic realizations of Boolean            see also Algebraic coding theory
   functions, 796                          Error detection (in a code), 767-769; see   f: A → B, 252
Element, 123, 124, 129, 135                  also Algebraic coding theory              f̄, 712
Element argument, 126, 137, 140, 144       Error in reasoning, 74                      f⁻¹, 283
Elementary event, 158                      Error pattern, 762, 763, 771, 779; see      f(A), 253
Elementary subdivision, 542, 543              also Algebraic coding theory             f⁻¹(B₁), 285
Elements, 222, 237, 238, 242               Euclid, 42, 222, 232, 237, 238, 242, 243    f ∈ O(g), 290, 291
Elements of a set, 123                     Euclidean algorithm,                        f ∈ O(g) on S, 498
Elsayed, E. A., 562, 575, 576                 for integers, 231-235, 289, 454, 458,    f ∈ Θ(g), 294
Else, 51                                         459, 505, 688, 760                    f ∈ Ω(g), 293
Embedded microcontroller, 5                   for polynomials, 808                     f is dominated by g, 290, 291, 341
Embedding, 540, 545                        Euclidean geometry, 820                     f is dominated by g on S, 498
Empty language, 313                        Euler, Leonhard, 303, 378, 443, 494, 513,   f(x) ≡ g(x) (mod s(x)), 808
Empty set (∅), 127, 128, 159                  533, 544, 573, 705, 794, 819, 831        f(x) is congruent to g(x) modulo s(x),
Empty string (λ), 310, 323                 Euler circuit, 534, 535, 556                    808
Encoding, 763; see also Algebraic coding   Euler number, 495                           f-augmenting path, 650-654, 656, 663
  theory                                   Euler trail, 534, 535, 556                  F_n, 719, 734
Encoding function, 763, 764, 767, 769,     Euler’s conjecture (Latin squares), 819     F₀ (contradiction), 53
  771; see also Algebraic coding theory    Euler’s phi function, 394, 395, 689, 747    F[x], 802
Encoding scheme, 610, 611                  Euler’s Theorem on congruence, 759,         F[x]/(s(x)), 810
Encryption, 690-693                           760                                      Factor of a polynomial, 802, 804, 805
Encryption function, 759                   Euler’s Theorem on connected planar         Factor Theorem, 804, 805
Enderton, Herbert B., 189, A-32               graphs, 546-548, 573                     Factorial, 6, 7, 215
Endpoint, 660                              Eulerian numbers, 193, 217, 218, 304,       Factorial order, 293
Energy levels, 486                            420                                      Factorial time complexity, 293

Factorization of a polynomial, 805               internal states, 320, 321, 327, 371        Ford, Lester Randolph, Jr., 649, 653, 654,
Failure, 161, 178                                k-equivalent states, 338, 371                668, 669
Fallacy, 74, 75, 110                             k-unit delay machine, 329                  Ford-Fulkerson algorithm, 654-657
False assumption, 115                            Mealy machine, 333                         Foreign Office at Bletchley Park, 333
Fan, 628                                         minimization process, 371-376, 378         Forest, 581, 639, 641, 642
Fano, Gino, 820, 831                             next state, 320                            Formal Logic; or, the Calculus of
Feit, Walter, 795                                next state function, 320                     Inference, Necessary and Probable,
Feller, William, 444, 506, 507                   1-equivalent states, 371                     118
Fence, 508                                       one-unit delay machine, 329                Formulario Mathematico, 243
Fendel, Daniel, 119, 120                         output, 320-322, 324, 328, 329             Forward edge, 650, 651, 654, 655
de Fermat, Pierre, 243, 244, 705                 output alphabet, 320, 321                  Foulds, L. R., 562, 575, 576
Fermat’s Last Theorem, 705, 706                  output alphabet, 320, 321                  Foundations of mathematics, 333
Fermat’s theorem on congruence, 759              pigeonhole principle, 327                  Foundations of the Theory of
Ferrers, Norman Macleod, 443                     reachability, 338                             Probability, 188
Ferrers graph, 435, 443                          reachable state, 330                       Founder of information theory, 795
Fibonacci, Leonardo, 506                         redundant state, 371, 373                  Four-color conjecture, 573
Fibonacci generator, 697                         reset, 321                                 Four-color problem, 565, 575
Fibonacci numbers, 193, 215-217, 219,            second level of reachability, 338          Fourier, Joseph Baptiste Joseph, 303
   246, 442, 447, 457, 458, 463, 468,            sequence recognizer, 326, 327, 332         Foxtrot, 226
   470, 477, 506, 628                            serial binary adder, 323, 324              Fractals, 506
Fibonacci relation, 442, 457, 505                sink (state), 331                          Free variable, 88
Fibonacci sequence, 505                          starting state, 320, 329                   Frege, Gottlob, 119
Fibonacci trees, 626
                                                 state diagram, 321, 324, 327               Frequency of occurrence, 611, 692
Field, 677, 678, 681, 682, 688, 707, 746,        state table, 321, 322, 324, 331            Frey, Gerhard, 706
   794, 802, 830, 831; see also Finite field
                                                  strongly connected machine, 331           Frobenius, Georg, 796
Field theory, 831                                                                           Front (of a list), 598, 599
                                                  submachine, 331
Fields (in a record), 694
                                                 transfer sequence, 331                     Fulkerson, Delbert Ray, 649, 653, 654,
FIFO structure, 598                                                                           668, 669
                                                  transient state, 330
Filius Bonaccii, 442                                                                        Full-adder, 721
                                                  transition sequence, 331
Finite affine plane, 820                                                                    Full binary tree, 611
                                                  transition table, 321
Finite Boolean algebra, 740, 743, 799,                                                      Full house, 152
                                                  two-unit delay machine, 329
   830                                                                                      Full m-ary tree, 614
                                               Finite strings, 310
Finite field, 799, 803, 806, 811, 812, 817,                                                 Function, 99, 175, 186, 211, 247,
                                               Finite three-dimensional geometry, 831
   820, 822, 826, 830                                                                         252-257, 259-263, 267-271,
                                               Finizio, Norman, 506, 507
Finite function, 247, 284, 302, 332                                                           278-293, 295, 302, 303, 309, 311, 318,
                                               First-degree factor, 805, 806
Finite geometry, 799, 820, 822, 825, 830,                                                     320, 376, 394, 395, 403, 409, 410,
                                               First-in first-out structure, 598
   831; see also Affine plane                                                                 602, 644, 660, 673, 697-704, 712, 739
Finite group, 795                              First level of infinity, 303, A-30             access function, 254
Finite group theory, 831                       First level of reachability, 338
                                                                                              Ackermann’s function, 259
Finite integral domain, 682                    First-order linear recurrence relations,       associative binary operation, 268
Finite language, 314                              448, 450                                    Big-Oh notation, 290
Finite poset, 377                              Fisher, R. A., 831
                                                                                              bijective function, 279, 283
Finite projective geometry, 831                Fissionable material, 486                      binary operation, 267-269
Finite projective plane, 831                   Five-times repetition code, 765, 769; see      Boolean function, 712
Finite sample space, 164                          also Algebraic coding theory                ceiling function, 254
Finite sequence of n terms, A-25               Fixed (invariant), 781, 783, 789               characteristic function, 307
Finite sequence of undirected edges, 351       Fixed order, 597                               closed binary operation, 267, 268, 270
Finite set, 124, 125, 186, 280, 287, 344,      Fixed point (of a function), 403               codomain, 253, 279, 281, 287
   A-23, A-24                                  Flach, Matthias, 706                           commutative binary operation, 268,
Finite slope, 821                              Floor function (⌊x⌋), 253, 254, 297, 496,         602
Finite state machine, 309, 319-324,               602                                         composite function, 280, 281
   326-333, 337, 338, 371-376, 378,            Flow in a transport network, 644-654,          composition of functions, 278, 280,
   682, 720                                       656, 662, 663                                  282
   arc, 321, 329                               Flow of current, 536                           constant function, 261
  definition, 320                              Flowchart, 203, 204, 349                       decoding, 767
  directed edge, 321                           Folding method (for hashing), 694              definition, 252
  distinguishing string, 374                   Fontane, Johnny, 186                           distance function, 766, 767
  E, 371                                       ∀x, 88, 124                                    domain, 175, 253, 257, 270, 281, 287
  E_1, 371                                     For all x, 88                                  dominance, 292-294
  E_k, 371, 374                                For any x, 88                                  encoding, 763, 764, 767, 769, 771, 773
  equivalent machines, 327                     For at least one x, 88                         equality, 279
  equivalent states, 338                       For each x, 88                                 Euler’s phi function, 394, 395, 689
  first level of reachability, 338             For every x, 88                                exponential, 402, A-1, A-5
  input, 320, 322, 324, 329                    For some x, 87, 88                             extension, 257
  input alphabet, 320, 321                     Forbidden positions, 406, 408                  f^{-1}, 283

  finite function, 247, 284, 302           Fundamental Theorem of Arithmetic,             convolution of sequences, 430, 431,
  finite sequence of n terms, A-25              193, 237-240, 244, 254, 265, 275,              440
  fixed point, 403                             314, 342, 394, 703, 704, A-29                definition, 418
  floor function, 253, 254, 297                                                             distributions, 415-417
  function complexity, 247                   g ∘ f, 280                                     exponential generating functions,
  function dominance, 290-292, 294,          g dominates f, 290                                436-439, 443
     498                                     g dominates f on S, 498                        geometric series, 419
  greatest integer function, 253, 297        G, 523                                         in solving recurrence relations,
  hashing function, 673, 694, 695, 708       G^d, 549                                          482-487
  identity function, 279                     G^2, 626                                       moment generating function, 443, 444
  image of an element, 253                   G − e (e an edge), 522                         nonlinear recurrence relation, 487-490
  image of a set, 256, 257                   G − v (v a vertex), 522                        ordinary generating function, 436
  incompletely specified Boolean             |G|, 746                                       partitions of integers, 432-435
     function, 732                           Galileo, 303                                   power series, 417
  infinite sequence, A-25                    Gallian, Joseph A., 707, 708, 795, 796         rook polynomial, 416
  injective function, 255                    Gallier, Jean H., 119, 120                     summation operator, 440-442
  inverse function, 278, 283, 285, A-9       Galois, Evariste, 707, 794, 795, 813, 830,     table of identities, 424
  invertible function, 282-285, 287             831                                       Generator matrix, 769, 771, 772, 774,
  logarithmic, A-1, A-5                      Galois field, 813, 818                          777; see also Algebraic coding theory
  mapping, 252                               Galois theory, 707, 795, 831                 Generator of a cyclic group, 755
  monary operation, 267                      Gambler’s ruin, 510                          Generic, 110
  monotone increasing function, 494,         Games of chance, 188                         Genesereth, Michael R., 119, 120
     495, 500, 501, 503, 608, 609            γ(G), 577                                    Geometric random variable, 430, 446
  next state function, 320, 682              Gardiner, Anthony, 795, 796                  Geometric random variable, 430, 446
  notation, 253                              Gardner, Martin, 39, 42, 507, 795, 796       Geometric series, 419, 423, 428, 476
  1_A, 279                                   Garland, Trudi Hammel, 506, 507              Geometrie der Lage, 622
  one-to-one correspondence, 279, 303        Garrett, Paul, 693, 708, 795, 796            Geometry, 123, 222, 242, 506, 794, 795
  one-to-one function, 255-257, 409,         Gate, 720                                    Gerasa, 707
     410                                     Gating network, 309, 719-722, 731            Germain, Sophie, 705
  onto function, 260-263, 265                Gauss, Carl Friedrich, 377, 705, 707         Gersting, Judith L., 333, 334
  order (of a function), 290, 292, 293       gcd (greatest common divisor)                GF, 813
  order-preserving function, 366, 509           for integers, 231-236, 240, 394, 453,     GF(n), 821, 824, 827, 828
  output function, 320, 682                        454, 688, 734, 737                     GF(p^n), 830
  partial function, 260                        for polynomials, 807, 808                  GF(p^t), 813, 818
  phi function, 394, 395                     General solution of a homogeneous            Gilbert bound, 773; see also Algebraic
  powers of a function, 282                     recurrence relation, 468                     coding theory
  pred (predecessor), 307                    General solution of a nonhomogeneous         Gill, Arthur, 333, 334
  preimage of an element, 253                   recurrence relation, 471                  Giornale di Matematiche, 820
  preimage of a set, 285-287                 General solution of a second-order linear    Global result, 632, 639
  projection, 270, 271                          homogeneous recurrence relation with      glb (greatest lower bound), 363, 709
  range, 253                                    constant coefficients, 456                Gödel, Kurt, 187
  recursive function, 453                    Generalizations of the principle of          Gödel’s proof, 188
  restriction, 257                              inclusion and exclusion, 397-401          The Godfather, 186, 692
  scattering function, 694, 708              Generalized associative law for A, 212       Golay, Marcel J. E., 761, 795, 796
  self-dual Boolean function, 744            Generalized associative law for U, 213       Goldberg, Samuel, 506, 507
  sequence, 255                              Generalized associative law for a group,     Golden ratio, 457, 469, 506
  space complexity function, 290                746                                       Golomb, Solomon W., 796
  succ (successor), 307                      Generalized associative law of addition      Gone with the Wind, 47, 48, 52
  surjective function, 260                     of real numbers, 214-216                   Gopolan, K. Gopal, 743
  switching function, 712                    Generalized associative law of               Gorenstein, Daniel, 795, 796
  symmetric Boolean function, 744               multiplication of real numbers, 214,      Graceful (labeling of a tree), 627, 628
  time complexity function, 290,                215                                       Graff, Michael, 795
     297-299                                 Generalized associative laws in a ring,      Graham, Ronald Lewis, 304, 305, 506,
  trunc(ation), 254                             674                                          507, 642, 667-669
  unary operation, 267, 268                  Generalized Binomial Theorem, 422            Grandparent, 593
Function complexity, 247                     Generalized DeMorgan’s laws, 146             Graph coloring, 564-573, 575
Function composition; see Composite          Generalized distributive laws in a ring,     Graph isomorphism, 523, 526-528, 699
   function                                     674                                       Graph planarity, 352, 615
Function dominance, 292, 294, 341, 498       Generalized intersection of sets, 146        Graph theory, 324, 349-354, 378, 379,
Function inverse; see Inverse of a           Generalized union of sets, 146                 395, 396, 411, 513-579, 615-621,
   function                                  Generated recursively, A-26                    624, 631, 632, 657, 659-665, 667,
Fundamental conjunction, 715-718, 721,       Generates, 753                                 730; see also Matching theory,
   723, 724, 732, 738                        Generating function, 303, 415-445, 452,        Transport networks, Trees
Fundamental disjunction, 717, 718               482-487, 489, 505, 783, 790, 791            adjacency list, 379
Fundamental Theorem of Algebra, 356             calculational techniques, 418-431            adjacency list representation, 378, 379

  adjacency matrix, 352, 539, 600           distance, 518, 626                      ladder graph, 572, 577, 626, 627
  adjacent from, 349, 514                   dominating set, 577, 730                length of a cycle, 351
  adjacent to, 349, 514                     domination number, 577                  length of a path, 632
  adjacent vertices, 349                    dual graph, 549, 551                    length of a walk, 515
  algorithm for articulation points, 619,   edge-disjoint paths, 658                line graph, 578, 670
      620                                   edge set, 349, 514                      loop, 349, 351, 353, 354, 514, 551
  arc, 349, 514                             edges, 349, 514                         loop-free graph, 351, 515
  articulation point, 615-621, 624          electrical networks, 551, 573, 574      mapmaker’s problem, 551
  associated undirected graph, 350, 353,    elementary subdivision, 542, 543        maximal independent set, 564, 627
     517                                    embedding, 540, 545                     mesh graph, 532
  β(G), the independence number of G,       Euler, Leonhard, 378                    minimal covering of a graph, 577
     564, 666                               Euler circuit, 534, 556                 minimal dominating set, 577, 730
  biconnected component, 615,               Euler’s Theorem for Connected Planar    multigraph, 516, 518, 533
     619-621, 624                              Graphs, 546-548, 573                 multiplicity (of an edge), 518
  biconnected graph, 615                    Euler trail, 534, 556                   n-cube, 532, 541, 542
  binary tree, 488, 595, 600                fan, 628                                nodes, 349, 514
  bipartite graph, 541, 542, 558, 659,      Four-color problem, 565, 575            nonplanar graph, 540, 541, 543, 547
     660, 662-665, 668                      G, 523                                  null graph, 523
  bridge, 550                               G^d, 549                                od(v), 535
  χ(G), the chromatic number of G,          G^2, 626                                ω(G), the clique number of G, 578
     565, 621                               G − e (e an edge), 522                  one-factor, 666
  chromatic number, 413, 565, 615, 621      G − v (v a vertex), 522                 one-terminal-pair-graph, 552
  chromatic polynomial, 413, 564-571,       γ(G), the domination number of G,       open walk, 515
     574                                      577                                   origin (of an edge), 349, 514
  circuit, 516, 534, 551                    graceful labeling of a tree, 627, 628   out degree (of a vertex), 535
  clique, 578                               graph coloring, 564-573, 575            outgoing degree (of a vertex), 535
  clique number, 578                        graph isomorphism, 523, 526-528,        P(G, λ), 566-568, 570
  closed path, 351, 516                        699                                  path, 351, 516, 567
  closed walk, 515, 516, 546, 549           grid graph, 532                         pendant vertex, 530, 549, 583, 584
  cocycle, 564                              Hamilton cycle, 556-562, 573, 574       perfect matching, 666
  color-critical graph, 573, 622            Hamilton path, 556-561, 573             Petersen graph, 543, 566, 574
  comb graph, 577                           Hasse diagram, 358-361                  planar graph, 540-553
  complement of a graph, 523                Herschel graph, 564, 566                planar-one-terminal-pair-graph, 552
  complement of a subgraph in a graph,      historical development, 574             planarity of graphs, 352, 615
     586                                    homeomorphic graphs, 542-544            Platonic solids, 547-549, 556
  complete bipartite graph, 541             hypercube, 531-533, 541, 542, 557       Polya’s theory of enumeration, 574
  complete directed graph, 559              id(v), 535                              precedence graph, 350
  complete graph, 352, 354, 523             in degree (of a vertex), 535            proper coloring of a graph, 565-568,
  components, 352, 517, 567                 incidence matrix, 539                      570
  connected graph, 351, 517                 incident, 514                           Q_n, 532, 542, 667
  covering of a graph, 577                  incoming degree (of a vertex), 535      regions (in a planar graph), 544
  covering number, 577                      independence number, 564, 666           regular graph, 531
  cut-set, 549, 551                         independent set of vertices, 564, 627   rooted binary tree, 488
  cycle, 351, 516, 551, 552, 624            index list, 379                         rooted ordered binary tree, 488, 489
  d(a, b), 626, 632                         induced subgraph, 522, 619              round-robin tournament, 559
  Decomposition Theorem for                 infinite region, 545                    self-complementary graph, 529, 576
     Chromatic Polynomials, 568             Instant Insanity, 524                   Seven Bridges of Königsberg, 513,
  deficiency, 664                           intersection of graphs, 570                519, 533, 535
  deficiency of a graph, 664                isolated vertex, 349, 352, 514, 613     source (of an edge), 349, 514
  deg(R), 546                               isomorphic graphs, 526                  spanning subgraph, 521, 582, 640
  deg(v), 530                               K_{m,n}, 541                            spokes, 519, 520
  degree of a region, 546                   K_n, 352, 523                           square of a graph, 626
  degree of a vertex, 530                   K_n^*, 559                              strongly connected graph, 351, 539
  δ(G), 664, 665                            K_5, 540-543, 547                       subgraph, 521
  digraph, 349, 350, 514                    K_{3,3}, 542, 543, 547                  terminals, 552
  Dijkstra’s Shortest-Path Algorithm,       κ(G), the number of components of       terminating vertex, 349, 514
     631-638                                    G, 517, 549, 615                    terminus (of an edge), 349, 514
  directed cycle, 351, 516                  k-regular graph, 531                    tournament, 559
  directed edge, 321, 349, 351, 514, 646,   king, 563                               trail, 516
     650                                    kite, 628                               Traveling Salesman Problem, 562, 574
  directed Euler circuit, 535, 536          Königsberg, 513, 519, 533, 535          tree, 573
  directed graph, 337, 344, 349, 514        Kuratowski’s Theorem, 543, 544, 574     trivial walk, 515
  directed path, 353, 516                   L(G), 578, 670                          2-isomorphic graphs, 555
  directed walk, 516                        labeled directed graph, 324             undirected edge, 349, 514
  disconnected graph, 352, 517              labeled multigraph, 524                 undirected graph, 350, 351, 514

  union of graphs, 570                        group of units, 747                         Hamilton path, 556-561, 573
  unit-interval graph, 520                      homomorphism, 752                           Hamming, Richard Wesley, 761, 766,
  unity graph, 542                              infinite order, 746                           795, 796
  vertex, 349                                   invariant element under a permutation,      Hamming bound, 773; see also Algebraic
  vertex degree, 530                               781, 783                                   coding theory
  vertex set, 349, 514                          isomorphism, 753                            Hamming code, 778; see also Algebraic
  vertices, 349, 514                            kernel of a homomorphism, 797                 coding theory
   W_n, 520, 572                                Klein Four group, 755                       Hamming matrix, 778; see also
   walk, 515, 516                               Lagrange’s Theorem, 758                       Algebraic coding theory
   weight of an edge, 631                       left-cancellation property, 747             Hamming metric, 767; see also Algebraic
   weighted graph, 631                          left coset, 757                               coding theory
   wheel graph, 519, 520, 572                   length of a cycle, 780                      Handshakes, 480
Gray, Frank, 188                                multiples of group elements, 748            Hanson, Denis, 412
Gray code, 128, 129, 188, 533, 557, 564         nonabelian group, 749                       Harary, Frank, 573-576, 623, 625
Greatest common divisor                         nontrivial subgroup, 748                    Hardware considerations, 333, 378
   for integers, 231-236, 240, 394, 453,        normal subgroup, 795, 831                   Hardy, Godfrey Harold, 244, 412
      454, 688, 734, 737                        order of a group, 746                       Harmonic numbers, 193, 202, 209, 215,
   for polynomials, 807, 808                    Polya’s method of enumeration,                 246
Greatest element (in a poset), 363              Polya’s method of enumeration,              Hartsfield, Nora, 573, 576
Greatest integer function (⌊x⌋), 253, 297,         779-793                                  Harvard University Computation
   391, 496, 602                                powers of group elements, 747                 Laboratory, 742, 743
Greatest lower bound (glb), 363                 product of disjoint cycles, 780, 781,       Hashing function, 673, 694, 695, 708
Greedy algorithm, 632, 638-641, 667                786                                      Hasse, Helmut,   377
Gregory, Duncan, 186                            proper subgroup, 748                        Hasse, Helmut, 377
Grid, 45                                        quotient group, 831                            696, 736-739
Grid graph, 532                                 right-cancellation property, 747            Heap, 637, 638
Griess, Robert, Jr., 795                        right coset, 757                            Heap implementation, 642, 643
Group, 745                                      rigid motions of a cube, 791                Heath, Thomas Little, 41, 42
Group acting on a set, 782, 785, 792            rigid motions of an equilateral triangle,   Heawood, Percy John, 565
Group action, 783                                  749, 750                                 Height of a rooted tree, 601
Group code, 773, 774, 776, 777; see also        rigid motions of a regular hexagon,         Hell, Pavol, 642, 667-669
  Algebraic coding theory                          788                                      Henle, James M., 189, A-32
Group homomorphism, 752, 753, 774               rigid motions of a regular tetrahedron,     Herschel graph, 564, 566
Group isomorphism, 753, 755                        792, 793                                 Herstein, Israel Nathan, 795, 797
Group of permutations, 749, 750, 781,           rigid motions of a square, 750, 780         Hexadecimal notation, 226
  782, 830                                      RSA Cryptosystem, 759-761                   Hexagon, 135, 788
Group of rigid motions                          simple group, 795                           Hierarchy of operations, 460, 590
  of a cube, 791                                S_n, 750, 794                               High-energy neutrons, 486
  of an equilateral triangle, 749, 750           solvable group, 830                        Hilbert, David, 119, 188, 259, 333, 706
  of a regular hexagon, 788                      stabilizer, 785                            Hilbert decision problem, 333
  of a regular tetrahedron, 792, 793             subgroup, 748                              Hill, Frederick J., 742, 743
  of a square, 750, 780                          symmetric group, 750                       Hindu mathematicians, 243
Group of transformations, 794, 795              trivial subgroup, 748                       Hindu-Arabic notation, 442
Group of units, 747                           Grundbegriffe der                             History of enumeration, 41
Group theory, 745-798                            Wahrscheinlichkeitsrechnung                History of graph theory, 574
  abelian group, 745, 746                       (Foundations of the Theory of               Hodges, Andrew, 333, 334
  algebraic coding theory, 773-777              Probability), 188                           Hoggatt, Verner E., Jr., 506, 507
  center, 751                                 Grundlagen der Mathematik, 119                Höhere Algebra, 377
  chain of subgroups, 830                     Gruppentheoretischen Studien II, 794          Hohn, Franz E., 333, 334, 778, 796
  commutative group, 745                      Guthrie, Francis, 565, 573                    Homeomorphic graphs, 542-544
  coset, 757                                  Guthrie, Frederick, 565                       Homogeneous recurrence relations, 450,
  cycle, 780, 781                             Guy, Richard K., 506, 507                       456, 482
  cyclic group, 753-756, 758                                                                Homomorphic image (rings), 698
  decomposition of a permutation, 781         H_n (the nth harmonic number), 202            Homomorphism of groups, 752
  definition of a group, 745                  Haken, Wolfgang, 565, 573, 575                Homomorphism of rings, 698
  direct product of groups, 751               Half-adder, 720, 721                          Honsberger, Ross, 506, 507
  Euler’s Theorem on Congruence, 759,         Half-open interval, 134                       Hopcroft, John E., 333, 334, 378, 506,
     760                                      Hall, Marshall, Jr., 412, 831, 832               507, 574, 575, 623, 624, 642, 667,
  Fermat’s Theorem on Congruence, 759         Hall, Philip, 660, 663, 668                      668, 708
  fixed (invariant), 781, 783                 Hall’s Marriage Condition, 664                Hopper, Grace, 623
  generator of a (sub)group, 754              Halmos, Paul R., 189, A-32                    Horizontal class, 822
  group acting on a set, 782, 785, 792        Hamilton, Sir William Rowan, 186, 556,        Horner’s method, 301
  group of permutations, 749, 750, 781,          565, 573, 574                              Horowitz, Ellis, 641, 642, 668, 669
     782, 830                                 Hamilton cycle, 556-559, 561, 562, 573,       Huffman, David Albert, 333, 334, 378,
  group of transformations, 794, 795            574                                            611, 624, 625
I-12             Index

Huffman tree, 613, 614                       Incident, 514                                  Integer-valued function, 254
Huffman’s construction for optimal trees,    Inclusive or, 48                               Integers, 113, 114, 133, 193, 242
  612-614                                    Incoming degree of a vertex, 535               Integers modulo n, 686-696
Hungarian method, 668                        Incompletely specified Boolean function,       Integral domain, 677, 678, 681, 682, 801,
Huygens, Christiaan, 42, 188                    731, 732                                       802
Hydrocarbon,     581, 584                    Increment, 689                                 Intel Corporation, 5
Hydrogen, 584                                Independence for three events,    171,   172   Internal states, 320, 321, 327, 337, 371
Hypercube, 531-533, 541-544, 557, 667        Independence number of a graph, 564,           Internal vertices, 588
Hypothesis, 48, 51, 53, 67, 70                 666                                          Internet, 12, 13, 575
Hypothesis Testing, 188                      Independent, 786                               Internet address, 12
                                             Independent    events, 154, 155, 158, 161,     Internet security, 222
i (= √-1), 811                                  166, 170, 174, 179, 182, 428, 430, 762      Internet standard regarding reserved
I Principii di Geometrica, 377               Independent in pairs, 172                         network numbers (STD2), 12
I, 348, A-16                                 Independent    set of vertices, 564, 627       Intersection of graphs, 570
(i, j)-entry of a matrix, A-11               Independent solutions; see Linearly            Intersection of sets, 136, 138, 214
Icosahedron, 548                                independent solutions                       Introductio in Analysin Infinitorum, 443
id(a), 644, 646                              Independent switches, 64, 65                   Invalid argument, 74, 75, 82, 83, 109
id(v), 535                                   Indeterminate, 799                             Invariant (element under a permutation),
Ideal, 684, 700, 706                         Indeterminate form, A-1                           781, 783, 784, 786, 787, 789
Idempotent element (in a ring), 697          Index, 145                                     Inventory, 786
Idempotent Law of Addition, 718, 724,        Index list, 379                                Inverse (under addition), 278
   726, 732                                  Index of a product, 239                        Inverse (under multiplication), 278
Idempotent Law of Multiplication, 717        Index of a summation, 17                       Inverse function, 278, 283, 285, A-9
Idempotent Laws                              Index set, 145, 366, 367                       Inverse of an implication, 62, 63, 82,
   for a Boolean algebra, 735                Indirect method of proof, 82                      92-94, 99
   for Boolean functions, 713                Indirect proof,   115                          Inverse laws
   for Boolean variables, 713                Induced subgraph, 522, 619                       for a Boolean algebra, 734
   for logic, 58                             Induction, 534, 545                              for Boolean functions, 713
   for set theory, 139, 147                  Induction hypothesis, 196, 198, 199, 201,         for Boolean variables, 713
Identical containers, 493                       203-205, 207, 208, 214, 216, 238,              for logic, 58
Identity element for a binary operation,        298, 315, 317, 805                             for set theory, 139
   269, 270                                  Inductive proof, 213                           Inverses in a group, 745, 794
Identity element for concatenation, 311      Inductive step, 195-199, 201-204, 206,         Inverses under + in a ring, 673
Identity element for + in a ring, 673           207, 212-215, 218                           Inverter, 719, 720, 722
Identity element of a group, 745, 794        Infeld, Leopold, 831, 832                      Invertible function, 282-285, 287, A-23
Identity for the addition of real numbers,   Infinite area, 545                             An Investigation in the Laws of Thought,
   103                                       Infinite cardinal numbers, A-31                   on Which Are Founded the
Identity function, 279, A-24                 Infinite countable set, A-26                      Mathematical Theories of Logic and
Identity laws                                Infinite order (for a group), 746                 Probability, 119
   for a Boolean algebra, 734                Infinite region, 545                           An Investigation of the Laws of Thought,
  for Boolean functions, 713                 Infinite sample space, 164                        186, 711, 742
   for Boolean variables, 713                Infinite sequence, A-25, A-26                  Irreducible polynomial, 807, 810, 811,
   for logic, 58                             Infinite set, 124, 186, 189, 280, 304,            830
   for set theory, 139                          A-23-A-26, A-28, A-30                       Irreflexive relation, 344
Identity transformation, 791, 792            Infinite slope, 821, 822                       Irrational numbers, 356, A-2
If and only if, 48                           Infix notation, 251, 591                       Irrational power, A-3
If p, then q, 51                             Information retrieval, 694                     Is approximately equal to (≈), 7
If p, then q, else r, 51                     Information theory, 795                        Isobutane, 584
If-then decision structure, 51               Initial condition(s), 448, 456                 Isolated fundamental conjunction, 724
If-then statement, 62                        Initial flow, 652, 654                         Isolated product term, 724
If-then-else decision structure, 51          Initialization, 636, 639, 642                  Isolated vertex, 349, 352, 359, 514
Iff, 48                                      Injective function, 255                        Isomers, 573, 796
Ignition system, 5                           Inorder, 594                                   Isomorphic Boolean algebras, 739, 740
Image of an element, 253, 255                Inorder traversal, 594                         Isomorphic copy, 809
Image of a set, 256, 257                     Input alphabet, 320, 321                       Isomorphic finite fields, 813
Implication, 48, 51-53, 56, 61-63, 67,       Input (for a finite state machine), 309,       Isomorphic graphs, 526, 527, 542, 543,
   69, 70, 76, 83, 89, 104, 105, 124            319, 320, 322, 324, 329                        549
Implicit quantification, 104                 Input (for a gate), 719, 720                   Isomorphic groups, 753, 755
Implicit quantifiers, 90                     Input (for an algorithm), 253, 289             Isomorphic rings, 698, 699, 704
Implicit restriction, 218, 317               Input (function), 253                          Isomorphic trees, 582, 583
Implies, 48                                  Input string, 321, 322, 327, 330, 331          Isomorphism of
In degree of a vertex, 535                   Instant Insanity, 524, 525                       Boolean algebras, 737, 740
Incidence, 123                               Integer division, 222                            fields, 810-811
Incidence matrix for a design, 832           Integer solutions, 235, 392, 415-417,            finite fields, 813
Incidence matrix for a graph, 539              427, 433                                       graphs, 523, 526-528

groups, 753                             Kolmogorov, Andrei Nikolayevich,      159,   Laws of logic, 58-65, 74, 77, 83, 113,
   rings, 698                                 188, 189                                     139, 140, 211, 713, 735
   trees, 596                              König, Dénes, 573                            Laws of set theory, 139, 144, 163, 168,
Itanium processor, 5                       Königsberg, 378, 513, 518, 533-535, 573         169, 713, 735
Iteration, 634-637, 639-642, 652, 653,     Koshy, Thomas, 506, 507                      Lay, David C., A-21
   656                                     Kronecker, Leopold, 242, 705, 795            lcm (least common multiple), 236, 240,
Iterative algorithm, 477, 478              Kruskal, Joseph Bernard, 638, 667, 669          391, 734, 737, 739
Iverson, Kenneth, 623                      Kruskal’s algorithm, 639-641                 Le Problème des rencontres, 411
Iwasawa theory, 706                        Kummer, Ernst, 706                           Leading coefficient, 799, 806
                                           Kuratowski, Kasimir, 543, 573                Leaf, 588, 591, 593, 596, 597, 600, 601,
Java, 4, 13, 345                           Kuratowski’s Theorem,    543, 544, 574          611, 612
Jean, Roger V., 506, 507                                                                Least common multiple, 236, 240, 391,
Jefferson, Thomas, 54                      ℓ∞, 827, 828                                    734, 737, 739
Jiushao, Qin, 707                          L(G), 578, 670                               Least element (in a poset), 363
Johnson, D. B., 642, 668, 669              L_n (the nth Lucas number), 216              Least element (well-ordered set), 194
Johnson, Lyle, 623                         Label, 633-636                               Least significant, 731
Johnson, Selmer Martin, 506, 507           Labeled complete binary tree, 610            Least significant bit, 323, 324
Jordan, Marie Ennemond, 622                Labeled directed graph, 324                  Least upper bound (lub), 363
Jünger, M., 562, 576                       Labeled graph, 562, 634, 636                 Leaves of a plant, 505
Juxtaposition, 301, 311                    Labeled multigraph, 524, 525                 Left branch, 488
                                           Labeled tree, 586, 611                       Left-cancellation property (in a group),
Km,n, 541                                  Labeled trees on n vertices, 623                747, 757
Kn, 352, 523                               Labyrinth, 623                               Left child, 590, 594, 610, 611
K7, 559                                    Ladas, Garasimos, 506, 507                   Left children, 594, 595
K5, 540, 543                               Ladder graph, 572, 577, 626, 627             Left coset, 757
K3,3, 542, 543                             Lagrange, Joseph-Louis, 510, 752, 794        Left subtree, 590, 592, 594, 596, 614
k-ary operation, 306                       Lagrange’s Theorem, 758                      Legendre, Adrien-Marie, 705
k-equivalence, 371, 373                    λ (the empty string), 310, 323               Lehman, John, 623
k-equivalent states, 338, 371, 372         λ (for a design), 826                        Lehmer, Derrick H., 689
k-regular graph, 531                       λ^(n), 567                                   Leibniz, Gottfried Wilhelm, 118, 302
k-unit delay machine, 329                  λ_xy, 825                                    Leiserson, Charles E., 504, 507, 624,
κ(G), 517, 549, 615                        Lamé, Gabriel, 458, 505, 705                    625, 638, 643, 654, 667
Karnaugh, Maurice, 722, 742, 743           Lamé’s Theorem, 459                          Lemma,   222
Karnaugh map, 722-726, 729, 731, 732       Landau, Edmund, 304                          Length of a
   don’t care conditions, 731-733          Landau symbol, 304                              chain, 381
Karp, Richard M., 653, 654, 669            Language, 211, 309, 312-317,     328, 333,     cycle (in a graph), 351
Katz, Nick, 706                               338                                         cycle (in group theory), 780
Katz, Victor J., 189                       de Laplace, Pierre Simon, 150, 188, 443        path, 632
Kempe, Sir Alfred, 565                     Largest possible block of adjacent 1’s,        string, 18, 310-312
Kepler, Johannes, 505                         726                                          walk, 515
Kernel of a group homomorphism, 797        Larney, Violet Hachmeister, 244, 707,        Lenstra, Arjan, 795
Kernel of a ring homomorphism,       704      708, 795, 796, 831, 832                   Lenstra, J. K., 562, 575, 576
Kershenbaum, A., 642, 668                  Larson, Harold J., 444                       Leonardo of Pisa, 442, 505
Key, 295, 302, 501-503, 691-695, 759,      Last nonzero remainder, 232, 235, 808        Lesniak, Linda, 573, 576
   760                                     Last-in-first-out structure, 490             Less than [for (0, 1)-matrices], 347
Key, J. D., 796                            Latin square (in standard form), 816, 817    Less than or equal to, 364, 377
Khan, Genghis, 707                         Latin squares, 799, 815-820, 822-824,        Level, 588, 589, 593, 597, 607, 611
Khowârizm, 242                                831                                       Level number, 588, 601, 602, 612
Kimberling, Clark, 707, 708                Lattice, 364, 377                            Levels of gating, 722
King (of a tournament), 563                Lattice point, 277                           Levels of infinity, 303
Kings (on a chessboard), 510               Law of the Double Complement                 LeVeque, William Judson, 244
Kinney, John J., 175, 189                     for a Boolean algebra, 736                Lewis, Harry R., 333, 334
Kirchhoff, Gustav, 573, 581, 622             for Boolean functions, 713                 Lewis, James T., 305
Kirkman, Thomas P., 562                      for Boolean variables, 713                 Lexicographic order, 589, 593
Kitab al-jabr w’al muquabala, 242            for set theory, 139                        Leyland, Paul, 795
Kite, 628                                  Law of Double Negation, 58, 59, 61, 62       L’Hospital’s Rule, A-1
Kleene, Stephen Cole,    119,   120, 315   Law of the syllogism, 72, 73, 78,   108,     Liber Abaci, 442, 505
Kleene closure (of a language), 315, 322      127                                       LIFO structure, 490
Klein, Felix, 795                          Law of Total Probability, 169, 170, 173      lim_{x→a} f(x) = L, 99, 100
Klein Four group, 755                      Law of Total Probability (Extended           lim_{n→∞} f_n = L, 103
Kneiphof, 533                                Version),   173                            Limit of a real-valued function, 99, 100
Knuth, Donald Ervin, 304, 305, 378, 506,   Lawler, Eugene L., 562, 575, 576, 667,       Limit of a sequence of real numbers, 103,
   624, 625, 704, 708                         669                                          A-3
Koch snowflake curve, 475                  Laws for Boolean functions, 713, 735         Line, 123
Kohavi, Zvi, 333, 334, 378                 Laws for Boolean variables, 713              Line at infinity, 828

Line graph, 578, 670                          Loop, 349, 351, 353, 354, 358, 488,           Mathematical definition; see Definition
Line in AP(F), 821, 826-828                      514-516, 525, 549, 551, 582, 640           Mathematical induction, 84, 193, 194,
Line in a finite projective plane, 827, 828   Loop-free graph, 351, 352, 396, 515,            200, 203, 206, 214, 215, 243, 244,
Linear algebra, 466, 624                        533, 581, 582, 584, 585, 615-619,             317, 674, 746, 805: see also Principle
Linear arrangement, 6, 7, 9, 10                 624, 631, 639-642, 644, 667                   of Mathematical Induction, Alternative
Linear combination (atoms), 739               Lord Byron, 242                                 form of mathematical induction
Linear combination (integers), 221,           Lovasz, Laszlo, 573                           Mathematical logic, 118, 119, 711
  232-234                                     Lovelace, Augusta Ada Byron, 243              Mathematical theorems, 104
Linear combination (polynomials), 808         Lovelace, Countess of, 242                    Mathematical Treatise in Nine Sections,
Linear complexity, 299                        Low (x), 619-621                                707
Linear congruence, 688                        Low-energy neutrons, 486                      Mathematics of finance, 473
Linear congruential generator, 689, 690       Lower bound, 363                              Matrix, 254, A-11-A-21
Linear factor of a polynomial, 805            Lower bound (for probability), 183, 184         addition of matrices, A-12
Linear linked lists, 694                      Lower limit in product notation, 239            additive identity, A-13
Linear order, 293, 359                        Lower limit in sum notation, 17                 additive inverse, A-13
Linear recurrence relation, 449, 506          Lozansky, Edward, 304, 305                      associative law of multiplication, A-16
Linear search, 296, 297, 302                  lub (least upper bound), 363, 709               column matrix, A-11
Linear time complexity, 293, 359              Lucas, Francois Edouard Anatole, 468,           column vector, A-11
Linearly independent solutions, 456, 464        505                                           commutative law of addition, A-12
Linearly ordered poset, 359                   Lucas numbers, 193, 216, 217, 220, 246,         definition, A-11
Linked lists, 378                               447, 506                                      determinant, A-17-A-21
List (in a relational data base), 271         Lukasiewicz, Jan, 591                           distributive laws of matrix
Literal, 715, 716, 722-726                                                                       multiplication over matrix addition,
                                              M2(C), M2(Q), M2(R), M2(Z), 674                    A-21
Liu, C. L., 42, 412, 444, 506, 507, 535,
   543, 551, 573, 574, 576, 624, 625,         m × n matrix, A-11                              distributive law of scalar
   667-669, 783, 792, 796, 797, 831, 832      (m + 1, m) parity-check code, 764, 765             multiplication over matrix addition,
                                              Machine language, 226                              A-13
Lloyd, E. K., 574, 575
                                              Machine language instructions, 302              equality, A-12
Local address, 12
                                              m-ary tree, 600                                 expansion by minors, A-20
Local result, 632, 638
                                              Maclaurin, Colin, 304                           (i, j)-entry, A-11
Lockett, J. A., 562, 575
                                              Maclaurin series, 304, 402, 422, 436,           matrix product, A-14
Logarithmic function, A-1, A-5
                                                 437, 496                                     matrix sum, A-12
Logarithmic order, 293
                                              Maclaurin series for e^x, 402                   minor, A-19
Logarithmic time complexity, 293
                                              MacWilliams, F. Jessie, 795, 797                multiplicative identity, A-16
Logic, 47-121
                                              Magnanti, Thomas L., 562, 575, 576,             multiplicative inverse, A-16
  basic (logical) connectives, 47-53, 56
                                                 637, 643, 654, 668                           product, A-14
  Laws of Logic, 58-65
                                              Main memory, 5                                  row matrix, A-11
  logical equivalence, 56-61, 83, 95
                                              Majority rule, 765; see also Algebraic          row vector, A-11
  logical implication, 69-73, 75, 89, 91,        coding theory                                scalar product, A-12-A-14
     95                                       Manohar, R., 704, 708                           square matrix, A-11
  logically equivalent statements, 56, 58,    Maple code, 420, 477                            sum, A-12
     61-64, 74, 91, 97, 98                    Mapmaker’s problem, 551                         system of linear equations, A-18
  negation of quantified statements, 96,      Mapping, 252; see also Function                 zero element, A-13
     97, 99, 100                              Marriage condition, 664                       Matrix multiplication algorithm, 507
  Principle of Duality, 59                    Massasauga, 9                                 Matrix product, A-14
  proof, 105-116                              Master theorem, 504, 505, 507                 Matrix rings, 674, 705
  quantifiers, 86-100, 103-116                Matches, 411                                  Matrix sum, A-12
  Rules of Inference, 67-84, 86               Matching, 659, 660, 667, 668                  Maurocylus, Francesco, 244
  statements (and connectives), 47-49         Matching theory, 659-668                      Max, 240
  Substitution Rules, 60, 61                    assignment problem, 659                     Max-Flow Min-Cut problem, 649
  Table of Rules of Inference, 78               complete matching, 660-664                  Max-Flow Min-Cut Theorem, 649, 652
  truth tables, 49, 52, 53, 56-58, 60-62        deficiency of a graph, 664                  Maximal biconnected subgraph, 615
Logic gate, 719                                 deficiency of a set of vertices, 664        Maximal chain, 381
Logic network, 719, 720                         δ(G), 664, 665                              Maximal element (of a poset), 362
Logical connectives; see Basic                  Hall’s marriage condition, 664              Maximal flow, 645, 647
   connectives                                  matching, 659, 660, 667, 668                Maximal independent set of vertices,
Logical equivalence, 56-61, 83, 95               maximal matching, 664, 665, 668               564, 627
Logical implication, 69-73, 75, 89, 91,          one-factor, 666                            Maximal matching, 664, 665, 668
  95                                             perfect matching, 666                      Maxterm, 717, 718, 727
Logically equivalent open statements, 92         system of distinct representatives, 663,   Maybee, John S., 575
Logically equivalent statements, 56, 58,            668                                     Mazur, Barry, 706
   61-64, 74, 91, 97, 98                      The Mathematical Analysis of Logic,           McAllister, David F., 333, 334, 378, 507,
Logically implies, 69, 92                        Being an Essay towards a Calculus of          508
London Mathematical Society, 565                 Deductive Reasoning, 118                   McCluskey, Jr., Edward J., 742, 743
Long division of polynomials, 803             Mathematical axioms, 113                      McCoy, Neal H., 707, 708, 831, 832

Mealy, George H., 333, 334                Minimization process algorithm, 372,         Multiplicity of a root, 805
Mealy machine, 333                          373                                        Multiplier, 689
Mean, 177, 180, 183                       Minimum capacity, 647, 648, 652              Multiset, 518
Mean value, 177                           Minimum cut, 648, 649, 652-654, 656          Murty, U. S. R., 573, 575, 668
Measure of central tendency, 177          Minimum distance between code words,         Mutually disjoint events, 159
Measure of dispersion, 177                  767-769, 771, 773, 774                     Mutually disjoint sets, 137, 148
Member (of a set), 123                    Minimum weight, 777                          Mutually independent in pairs, 172
Membership, 123                           Minimum weight of nonzero code words,
Membership tables, 143, 144, 146            774                                        N, 133
Mémoire sur les conditions de             Minor, A-19                                  \binom{n}{r}, 15, 21, 41, 42, 436
  résolubilité des équations par          Minsky, Marvin, 333, 334                     \binom{-n}{r}, n > 0, 422
  radicaux, 794                           Minterm, 716, 717, 726, 732, 738
Memory, 5                                 Mirsky, Leon, 668, 669                       \binom{n}{n_1, n_2, \ldots, n_t}, 23

Mitchell, Margaret, 47, 48, 52               n!, 6, 215
Memory cell, 5, 225
                                          Mixed strategy, 768                          n!, Stirling’s approximation formula, 304
Memory location, 369, 378, 694
                                          Mobius inversion formula, 412                n choose r, 15
Mendelson, Elliott, 119, 120
                                          Mod, 234, 454, 689-695, 701, 702, 759,       n factorial, 6
Merge algorithm, 608
                                            760                                        n-butane, 584
Merge sort, 605-609,    641
                                          Mod n, 702                                   n-cube, 532
Merge sort algorithm, 496, 608
                                          Modular congruence, 694, 707                 n-dimensional hypercube, 532
Merging process, 607
                                          Modular exponentiation, 693, 694, 760        n-fold product, 248
Mesh graph, 532
                                          Modulo    relation, 337                      (n, m) block code, 764; see also
Messages, 763, 769, 777, 778; see also                                                    Algebraic coding theory
  Algebraic coding theory                 Modus ponens, 70, 73-75, 78, 108, 109
                                          Modus tollens, 73, 74, 76, 78, 108, 109      n-tuple, 248
Methane, 792
                                          Moment generating function, 443, 444         ν, 320
Method of affirming, 70, 71
                                          Monary operation, 138, 267, 733              Nand (connective), 66
Method of contradiction,      114, 115
                                          Monic polynomial, 807                        NAND gate, 727, 728
Method of contraposition, 115
                                          Monma, C. L., 562, 575, 576                  Napier, John, A-6
Method of denying, 73
                                          Monotone increasing function, 259, 494,      Natural logarithm, 284, A-6
Method of exhaustion, 106
                                            495, 500, 501, 503, 608, 609               Natural numbers, 133
Method of generating functions, 482-487                                                Natural position, 402
                                          Montgomery, Hugh L., 243, 244, 444,
Method of infinite descent, 244
                                            445, 708                                   Nazi cipher, 333
Method of proof, 193                                                                   Neal, David, 444
                                          de Montmort, Pierre Remond, 411
Method of proof by contradiction, 137                                                  Nearest neighbor, 771
                                          Moon, John Wesley, 623, 625
Method of recursion, 211                                                               Necessary and sufficient, 48
                                          Moore, Edward Forrest, 333, 334, 378
Method   of undetermined coefficients,                                                 Necessary condition, 48
                                          Morash, Ronald P., 119, 120
  471                                                                                  Negation, 48
                                          Moser, L., 493, 507
Methods of proof, 125                     Mountain ranges, 494                         Negation of quantified statements, 92,
Methodus Differentialis, 303                                                              96, 97
                                          μ_X, 177
Metric, 767                               Multigraph, 349, 516, 518, 533, 631          Negation (logic gate), 719
Metric space, 767                                                                      Negative, 138
                                          Multinomial coefficient, 23
Microcontroller, 5                                                                     Negative integers, 227
                                          Multinomial Theorem, 23, 106
Microsoft, Inc., 117, 156, 278            Multiple                                     Nemhauser, G. L., 562, 575, 576
Miksa, F. L., 493                            of an integer, 221                        Nested multiplication method, 301
Millbanke, Annabella, 242                    of a polynomial, 802                      Network
Miller, George Abraham, 795               Multiple errors, 763                            dual, 551-553
Min, 240                                  Multiple output network, 720                    electric power, 666
Minimal covering of a graph, 577          Multiples of group elements, 748                electrical, 551, 552, 573, 574, 581, 622
Minimal disconnecting set of edges, 550   Multiplication of equivalence classes           gating, 309, 719-722, 731
Minimal distinguishing string, 375           of integers (in Z_n), 687                    logic, 719, 720
Minimal dominating set, 577, 730             of polynomials, 809                          multiple output, 720, 721
Minimal element (of a poset), 362         Multiplication of matrices, A-14                parallel, 64
Minimal machine, 373                      Multiplication of polynomials, 800              PERT, 357, 377
Minimal number of states, 327             Multiplicative cancellation in Z, 221           Program Evaluation and Review
Minimal product of sums, 727, 742         Multiplicative identity for matrices, A-16         Technique, 357, 377
Minimal realization of a finite state     Multiplicative identity for real numbers,       series, 65
  machine, 371, 372                          103                                          switching, 64-66
Minimal spanning tree, 639, 667, 668      Multiplicative identity in a ring, 675          transport, 644-658
Minimal spanning tree algorithms,         Multiplicative inverse (of a nonzero real    Network interface, 12
  639-643, 668                               number), 103, 278                         Network number, 12
Minimal sum of products, 721-725,         Multiplicative inverse in a ring, 677, 681   Neumann, Peter M., 796, 797
   729-733, 742                           Multiplicative inverse for a matrix, A-16    Neutrons, 486
Minimal weight edge, 640                  Multiplicative rule, 168, 172                New York Times, 707
Minimization process, 337, 371-376,       Multiplicity of a characteristic root, 468   Newsom, Carroll V., 119, 120, 304, 305
  378, 742                                Multiplicity of an edge, 518                 Newton, Sir Isaac, 303

Next state, 320                             Officers, 819                                Ore, Oystein, 561, 668, 669
Next state function, 320, 682               Ohm’s Law, 573                               Organic compounds, 791-793
Nicomachus of Gerasa, 707                   Ohm’s Law for electrical flow, 573           Origin (of an edge), 349, 514
Nievergelt, Jurg, 506, 508                  ω (output function), 320                     Orlin, James B., 562, 575, 638, 643, 654,
Nilsson, Nils J., 119, 120                  ω(G), 578                                       668
Nine-times repetition code, 773; see also   On the Theory of Groups, as Depending        Orthogonal Latin squares, 816-818, 831
   Algebraic coding theory                    on the Symbolic Equation θ^n = 1, 794      Out degree of a vertex, 535, 588, 644
Niven, Ivan, 243, 244, 444, 445, 708        One element of a Boolean algebra, 733        Outcome, 150, 151, 154, 155, 158, 175,
No degree, 800                              1_A, 279                                        177, 178
Nobel Prize, 187                            One-dimensional array, 254                   Outgoing degree of a vertex, 535
Node, 349, 514; see also Vertex, Vertices   1-equivalence, 338                           Output (from an algorithm), 253, 289
Noether, Emmy, 706, 707                     1-equivalent states (s1 E1 s2), 371          Output (for a finite state machine), 309,
Noise (in a binary symmetric channel),      One factor, 666                                 319
  761                                       One-terminal-pair-graph, 552                 Output (from a gate), 720
Nonabelian group, 749                       One-to-one correspondence, 279, 303,         Output alphabet, 320, 321
Nonadjacent vertices, 561                      370, 427, 428, 435, 526, 551, 660,        Output function, 320, 682
Noncommutative operation, 590                 A-23-A-27                                  Output string, 321, 322
Noncommutative ring, 675, 705               One-to-one function, 255-257, 279, 280,      Overcounting, 19, 20, 411
Nonempty universe, 89                          409, 410                                  Overflow error, 229
Nonequivalent configurations, 783, 784,     One-unit delay machine, 329
  790, 791                                  One’s complement, 227-229                    P(G, λ), 566-568, 570
Nonequivalent seating arrangements, 784     Onto function, 260-265, 287, 288, 392,       p(m,n), the number of partitions of m
Nonequivalent states, 374                      411, 439, 682, 699, 739                      into exactly n positive summands, 444
Non-Euclidean geometry, 820                 Open contact, 551, 553                       p(n), the number of partitions of n, 432,
Nonexecutable specification statement,      Open interval, 99, 100, 134, 164                443
   369                                      Open statement, 86, 87, 89-92, 105, 106,     P(n,r), 7, 15, 41, 436
Nonhomogeneous recurrence relation,            109, 123, 126, 194, 195                   p is sufficient for q, 48
  450, 451, 456, 470-481                    Open switch, 64, 551, 553                    p logically implies q, 69
Nonlinear recurrence relations, 449         Open trail, 534                              Pair of orthogonal Latin squares,
Nonnegative integers, 133                   Open walk, 515, 516                             816-819, 823
Nonplanar graph, 540, 541, 543              Operand, 136                                 Pairs of rabbits, 505
Nontaking kings, 510                        Operation, 136                               Pairwise disjoint subboards, 405, 408
Nontaking rooks, 404, 407                   Operations research, 574, 631, 667           Pairwise incidence matrix, 826
Nontrivial subgroup, 748                    Optimal prefix code, 613                     Palindrome, 13, 174, 197, 319, 425, 426,
Nonzero complex numbers, 134                Optimal spanning tree, 638, 639, 642            431, 432, 460, 461, 469
Nonzero division, 221, 356                  Optimal tree, 612, 613, 640-642              Palmer, Edgar M., 574, 576
Nonzero rational numbers, 133               Optimization, 41, 324, 562, 581, 631         Pan balance, 602, 603
Nonzero real numbers, 134                   Or (connective), 48                          Papadimitriou, Christos H., 333, 334
Nor (connective), 66                        Or (exclusive), 48, 56                       Parallel algorithm, 531
NOR gate, 727, 728                          OR gate, 719-721                             Parallel classes, 822-824, 828
Normal subgroup, 795, 831                   Order, 6, 14, 15, 30, 125, 130               Parallel computer, 531
Not p, 48                                   Order for the vertices of a tree, 592, 593   Parallel lines, 822, 827, 828
Not ... and (connective), 66                Order g (or, Order of g), O(g), 290-292      Parallel network, 64
Not ... or (connective), 66                 Order at least, 293                          Parent, 588, 593, 597, 613
Null child, 594, 595                        Order for a Boolean algebra, 736             Parenthesize an expression, 38, 39, 490,
Null graph, 523                             Order for functions, 290, 292, 293              494
Null set (∅), 127                           Order in a tree, 588, 589                    Parity checks, 778
Number of divisions, 458, 459               Order of a finite field, 812, 813            Parity-check code, 764, 765; see also
Number of positive divisors, 239            Order of a group, 746                           Algebraic coding theory
Number theory, 29, 188, 222, 242-244,       Order of a group element, 754                Parity-check equations, 770, 777, 778;
  303, 304, 394, 411, 412, 432, 442, 673,   Order of a linear recurrence relation, 456     see also Algebraic coding theory
  705, 706                                  Order of quantifiers, 98                     Parity-check matrix, 772, 774, 776-779;
Numerical analysis, 304                     Order-preserving function, 366, 509             see also Algebraic coding theory
                                            Ordered array, 501-503                       Parker, Ernest Tilden, 819, 831
O(g) (order of g), 290                      Ordered binary tree, 488                     Partial breadth-first spanning tree, 656
O(g) on S, 498                              Ordered pair, 152, 176, 248, 252, 253,       Partial fraction decomposition, 426, 483,
Ω(g), 293                                      282, 284                                     485
Object program, 253, 302                    Ordered rooted tree, 588                     Partial function, 260
O’Bryant, Kevin, 623, 624                   Ordered set, 129, A-25                       Partial order, 337, 341-343, 356-364,
Octahedron, 548                             Ordered sum, 205                                376, 377, 476, 533, 737, 738; see also
Octal system (base 8), 225                  Ordered tree, 594                               Poset
od(v), 535                                  Ordered triples, 827                         Partial order for a Boolean algebra,
od(z), 644                                  Orderly permutation, 455                        736-738
Odd integer, 113, 218                       Ordinary generating function, 436, 440,      Partial ordering relation, 357; see also
Odd-degree vertices, 531                       443, 444; see also Generating function       Partial order, Poset

Partial semipath, 652                        Polynomial in the indeterminate x, 799      Primary key, 272
Partially ordered set, 357, 377; see also    Polynomial order, 293                       Prime characteristics, 812
   Poset                                     Polynomial ring, 801, 830                   Prime integer (or number), 116, 193, 221,
Particular solution, 471, 475, 479, 482      Polynomial time complexity, 293                222, 230, 237, 238
Partition, 366-375, 377, 378                 Pop, 490-493                                Prime factorization, 238, 240
Partitions of integers, 29, 31, 432-435,     Poset, 357-364                              Prime order (for a group), 758
   443, 444                                     antichain, 381                           Prime polynomial, 807
Pascal, Blaise, 42, 188, 244                   chain, 381                                Primitive statement, 48
Pascal’s triangle, 133, 135, 188               glb (greatest lower bound), 363           Princeton University, 706
Patashnik, Oren, 304, 305, 506, 507            greatest element, 363                     Principia Mathematica, 119, 187
Path (in a graph), 351, 516, 517, 556, 582     greatest lower bound (glb), 363           Principle of cross classification, 411
Path (staircase), 9, 36-38, 130, 132           Hasse diagram, 358-361                    Principle of duality
Pattern, 124                                   lattice, 364                                 for a Boolean algebra, 735
Pattern inventory, 783, 789-793                least element, 363                           for Boolean functions, 713
Pawlak, Zdzislaw, 623                          least upper bound (lub), 363                 for Boolean variables, 713
Peacock, George, 186                           length of a chain, 381                       for logic, 59
Peano, Giuseppe, 188, 243, 377                 lower bound, 363                             for set theory, 141
Peano’s postulates, 243                        lub (least upper bound), 363              Principle of inclusion and exclusion, 261,
Pegs, 472, 473                                 maximal chain, 381                           385, 389-397, 402, 407, 411, 412,
Peile, Robert E., 796                          maximal element, 362                         415, 659
Peirce, Charles Sanders, 119, 377              minimal element, 362                      Principle of Mathematical Induction,
Pendant vertex, 533, 549, 583, 584             order-preserving function, 366               194-196, 198, 200-206, 213-216,
Pennies, 462, 495                              topological sorting algorithm, 360,          218, 244, 315, 317, 390, 425, 441, 448,
Pentium processor, 5                               361, 363                                 468, 469, 599; see also Mathematical
Perfect, H., 668, 669                          total order, 359-361                         induction, The alternative form of the
Perfect integer, 241                           upper bound, 363                             Principle of Mathematical Induction
Perfect matching, 666                        Positive closure of a language, 315         Principle of Strong Mathematical
Perfect square, 90, 239                      Positive integers, 133, 136, 193               Induction, 206
Perl, 4                                      Positive rational numbers, 133              Private-key cryptosystem, 693, 759, 760
Permutation, 6-8, 14, 15, 41, 42, 217,       Positive real numbers, 134                  Probability, 3, 42, 123, 150-189, 247,
   220, 393, 394, 403, 408, 411, 436, 452,   Postorder (traversal), 592-595, 623, 628       262, 402, 408, 409, 411, 428, 430, 468,
   453, 490-492, 495, 506; see also          Postulates, 87, 98, 243                        506, 759-765
   Arrangement                               Power series, 417, 418, 433, 443, 484          Additive rule, 162, 168, 172
Permutation group, 749, 750, 781, 782,       Power set, 128, 476, 533                      Axioms of probability, 159, 161
   830; see also Group theory                PowerBall, 15                                 Bayes’ Theorem, 170, 173, 188
Permutation matrix, 670                      Powers of                                     Bernoulli trial, 161, 178, 179
PERT network, 357, 377                          an alphabet, 310                           Binomial probability distribution, 179
Petersen, Julius Peter Christian, 574           a function, 282                            Chebyshev’s Inequality, 183, 184, 188
Petersen graph, 543, 566, 574                   a group element, 747                       conditional probability, 166-173
Peterson, Gerald R., 742, 743                  a language, 315                             continuous sample space, 164
∅ (the null set), 127                          a real number, A-2                          discrete sample space, 164, 175
φ(n) (Euler’s phi function), 394, 395,         a relation, 345                             E(X), 177, 182, 183
   689, 692, 747, 759, 760                     a ring element, 802                         elementary event, 158
Pi notation, 239                               Σ, 310                                      event, 151, 158, 159
Π notation, 239                                strings, 312                                expectation, 177
Pigeonhole principle, 273-278, 287, 288,     Pr(B|A), 167                                  expected value, 177, 179, 180
   303-305, 327, 328, 796                    Precedence graph, 350                         experiment, 150-153
Plaintext, 690-692, 760                      Precedes, 347, 348                            independent events, 155, 158, 161, 170
Planar graph, 540-553, 573                   Precise instructions, 233                     independent outcome, 154, 166, 170,
Planar-one-terminal-pair-graph, 552          Pred (predecessor) function, 307                 174
Planarity of graphs, 352, 615; see also      Prefix, 312, 313, 315, 338                    Kolmogorov, Andrei, 159
   Planar graph                              Prefix codes, 609, 611, 613, 614, 624         Law of total probability, 169, 170, 173
Platonic solids, 547, 548, 556               Prefix notation, 591                          mean, 177, 180, 183
Pless, Vera, 796, 797                        Pregel River, 533                             μ_X, 177
Points at infinity, 828                      Preimage (of an element), 253                 Multiplicative Rule, 168, 172
Polaris submarine, 357                       Preimage (of a set), 285, 286                 mutually disjoint events, 159
Polish notation, 591, 592                    Premise, 53, 67, 70, 107, 109-111             outcome, 180, 181, 183
Polya, George, 623, 625, 745, 796, 797       Preorder (traversal), 592-597, 616, 619,      random variable, 175-184
Polya’s Method of Enumeration, 623,             620, 623                                   Rule of complement, 159, 172
   779, 789, 891                             Prescribed order, 597, 599, 620, 653, 655     sample space, 150-155
Polya’s theory in graphical enumeration,     Preservation of Boolean algebra               σ_X, 180
   574                                          operations, 739                            standard deviation, 180, 182, 183
Polyhedra, 573                               Preservation of ring operations, 698          variance, 177, 180
Polynomial equation, 794                     Prim, Robert Clay, 638, 641, 667, 669       Probability distribution, 176, 177, 179,
Polynomial evaluation algorithm, 301         Prim’s algorithm, 641-643, 653                 180, 428, 430

Probability rules and laws, 172                  Push, 490-493                               Rearrangement, 749
Problem of the 36 officers, 819, 831             Puzo, Mario, 186, 692                       Reasoning system, 86
Le Problème des rencontres, 411                                                              Rebman, Kenneth R., 304, 305
Procedure for the Euclidean algorithm,           Q, Q+, Q*, 133                              Received word, 762, 763, 777; see also
   234                                           Q_n, 532, 542, 667                             Algebraic coding theory
Procedure for integer division, 224              q is necessary for p, 48                    Record, 694
Proceedings of the Royal Geographical            Quadratic equation, 794                     Recurrence relations, 447-510
   Society, 565                                  Quadratic order, 293                          analysis of algorithms, 473
Product of Boolean functions, 712                Quadratic time complexity, 299                associated homogeneous relation,
Product of disjoint cycles, 780, 781, 786        Quantified open statement, 87                    471-473, 479, 480
Product of matrices, A-14                        Quantifiers, 91, 98, 103-105, 119, 125,       boundary conditions, 448
Product of maxterms, 717, 718                       146, 195, 291                              characteristic equation, 456
Product of sets, 248                                bound variable, 88                         characteristic roots, 456, 468
Program Evaluation and Review                      connectives, 88, 89                         constant coefficients, 448
   Technique (PERT) network, 357, 377              existential quantifier, 87, 88              Fibonacci relation, 457
Program verification, 203                          ∀x, 88                                      first-order linear relation, 448
Projection, 270-272                                free variable, 88                           general solution, 456, 468, 471
Projective plane, 827, 828                         implicit, 89, 90                            geometric progression, 447
Proof, 10, 47, 84, 103, 104, 119; see also         ∃x, 88                                      homogeneous relation, 448, 450, 456
   Rules of inference, Mathematical                universal quantifier, 87, 88                initial condition, 448, 456
   induction, Mathematical                       Quantify, 87                                  linear relation, 449
   induction—alternative form                    Quantum theory, A-11                          linearly independent solutions, 456,
Proof by contradiction, 76, 77, 80, 84,          Quartic equation, 794                            464
   99, 114, 115, 127, 137, 237, 273, 291         Quasi-path, 650                               Maple code, 477
Proper coloring of a graph, 565-568, 570         Quaternary alphabet, 474                      method of generating functions,
Proper divisors of 0, 221                        Quaternary relation, 271                          482-487
Proper divisors of zero, 675, 677, 689,          Quaternary sequence, 247                       method of undetermined coefficients,
   801                                           Queue, 598, 599                                   471
Proper prefix, 312, 315                          Quick sort, 609                               nonhomogeneous relation, 471, 472
Proper subgroup, 748                             Quine, Willard Van Orman, 742, 743            nonlinear relation, 487-493
Proper subset, 124-126                           Quine-McCluskey method, 727, 742              particular solution, 471, 475, 479, 482,
Proper substring, 313                            Quintic equation, 794, 830                       487-493
Proper suffix, 312                               Quintilianus, Marcus Fabius, 705              second-order linear relations, 456-468
Properties of a Boolean Algebra, 735,            Quotient, 221, 223, 224                       system of recurrence relations, 486,
   736                                           Quotient group, 831                              487
                                                                                               Table of particular solutions for the
Properties of a group, 747                       r(C, x), 404-406, 408                            method of undetermined
Properties of exponents, A-3                     r_k(C), 405, 406                                 coefficients, 479
Properties of the integers, 193-246              R, R+, R*, 133, 134                           variable coefficients, 452
  Division algorithm, 223                        R[x], 799                                   Recurrent event, 506
  Euclidean algorithm, 232, 233                  R^c (converse of relation R), 282           Recursion, 211
  Fundamental Theorem of Arithmetic,             Rabbits, 505                                Recursive algorithm, 453
     238                                         Radicals, 830                               Recursive algorithm for the Fibonacci
  greatest common divisor, 231, 232              Ralston, Anthony, 537, 575, 576                numbers, 477, 478
  least common multiple, 236                     Ramsey, Frank Plumpton, 305                 Recursive construction, 129, 447, 532,
  mathematical induction, 193-208                Ramsey theory, 305                             620
  primes, 193, 221, 222, 230, 237, 238           Random variable, 175-184, 209, 296,         Recursive definition, 210-218, 251, 255,
  Well-Ordering Principle, 194                     428, 430                                     282, 312, 317, 447, 594, A-1, A-26
Properties of logarithms, A-6                    Random walk, 506                            Recursive function, 259, 453
Proposition, 47; see also Statement              Randomly generated numbers, 689             Recursive method, 454
Propositional calculus, 735                      Range, 175, 253, 392, 393                   Recursive procedure, 453, 500, 606, 608
Prüfer code, 586, 587                            Rank, 559, 819                              Recursive process, 211-213, 218, 316
Prune (a tree), 596                              Rate of a code, 764, 778; see also          Recursively defined set, 218, 251, 316
Pruned tree, 611                                    Algebraic coding theory                  Reddy, M. R., 562, 575
Pseudocode procedure for                         Ratio test, 429                             Redei, L., 559
   binary search, 502                            Rational number exponent, A-2               Redfield, J. Howard, 796, 797
   bubble sort, 450                              Rational numbers, 133, 194, A-30            Reducible polynomial, 807
   Euclidean algorithm, 234                      Reachability, 338                           Reductio ad Absurdum, 76, 127
   exponentiation, 297                           Reachable state, 330                        Redundant state, 371, 373
  Fibonacci numbers, 477-479                     Reactor, 486                                Reed, M. B., 574, 576
   gcd (recursive), 455                          Read, R. C., 566, 571, 574, 576, 796, 797   Refinement (of a partition), 373
   linear search, 296                            Real numbers, 133, 139, 194, A-27,          Reflections, 750, 781, 782, 788
   modular exponentiation, 693                     A-28, A-30                                Reflexive property (of a relation),
Pseudorandom numbers, 689                        Real-valued function, 99                       337-343, 347, 348, 353, 366-369,
Public-key cryptosystem, 759, 760                Rear (of a list), 598, 599                     376, 377, 782, 808

Regiments, 819                                      Reverse order, 620                          Roman, Steven, 769, 796, 797, 831, 832
Region, 544                                         Ribet, Kenneth, 706                         Rook, 404, 407
Regular graph, 531                                  Right branch, 488                           Rook polynomial, 404-406, 408, 410,
Reinelt, G., 562, 576                               Right-cancellation property (in a group),      412, 416, 510, 659
Reingold, Edward Martin, 506, 508                      747                                      Root extraction, 794
Relation, 211, 247, 250-256, 271, 282,              Right child, 590, 594, 610, 611             Root of a binary ordered tree, 488
   303, 337-378, 513, 737, 780-783                  Right children, 594, 595                    Root of a polynomial, 802, 804-806
   antisymmetric relation, 340, 341, 347,           Right coset, 757                            Root of a tree, 587-590
      348, 353, 357, 358, 376, 377                  Right subtree, 592, 594-596                 Root of multiplicity 2, 805
  associative law of composition, 345               Rigid motions                               Rooted binary tree, 488, 594
  binary relation, 250, 337                            of a cube, 791                           Rooted Fibonacci tree, 626
  composite relation, 344                              of an equilateral triangle, 749, 750     Rooted ordered binary tree, 488, 489,
  converse of a relation, 282                         of a regular hexagon, 788                    506, 596
  definition of a relation, 250                       of a regular tetrahedron, 792, 793        Rooted tree, 587-596, 600, 601
  divides relation, 339, 737                          of a square, 750, 780                     Rorres, Chris, A-21
  equivalence relation, 337, 342, 343,              Rinaldi, G., 562, 576                       Rosen, Kenneth H., 42, 244, 704, 708
     353, 366-378                                   Ring, 673, 674                              Ross, Kenneth A., 119, 120
  equivalent states, 338, 371                       Ring homomorphism, 697-700, 706             Rota, Gian Carlo, 412, 444, 445
  first level of reachability, 338                  Ring isomorphism, 697-704                   Rotating drum, 536
  irreflexive relation, 344                         Ring of matrices, 674, 705                  Rotations, 10, 749, 781, 782, 788, 791,
  k-equivalent states, 338, 371, 372                Ring of polynomials, 799, 801                  792
  modulo n relation, 337                            Ring theory, 673-709                        Rothman, Tony, 831, 832
  1-equivalence, 338                                  Boolean ring, 709                         Rothschild, Bruce L., 305
  partial order, 337, 341-343                         cancellation laws of addition, 680        Roulette, 163, 164
  partial ordering relation, 341, 357                 cancellation law of multiplication,       Round-robin tournament, 559
  poset, 357-364                                         678, 681                               Rouvray, Dennis H., 574, 576
  powers of a relation, 345                           center of a ring, 709                     Row major implementation, 254
  reachability, 338                                   characteristic, 812                       Row matrix, A-11
   reflexive relation, 337-343, 366-369               commutative ring, 675                     Row number, 716
   relation composition, 344                          congruence modulo n, 689, 690             Row vector, A-11
   relation matrix, 346-349                           definition, 673, 674                      Royal flush, 152
   second level of reachability, 338                  field, 677, 678, 681, 682                 RSA Cryptosystem, 759-761, 795
   subset relation, 250, 340, 358, 359                group of units, 747                       Ruin problems, 506
   symmetric relation, 339-343                        homomorphism, 697-700, 706                Rule for Proof by Cases, 78
   transitive relation, 339-343                       ideal, 684, 700, 706                      Rule of Complement, 159, 172
   zero-one matrix, 344, 347                          idempotent element, 697                   Rule of Conditional Proof, 78
Relation composition, 344                             integers modulo n, 686-696                Rule of Conjunction, 75, 78
Relation matrix, 346-349                              integral domain, 677, 678, 681, 682       Rule of Conjunctive Simplification, 78,
Relational data base, 271, 272, 305                   isomorphic rings, 698, 699, 704              94, 137
Relative complement, 138                              isomorphism, 698                          Rule of the Constructive Dilemma, 78
Relative frequency, 158, 159                          kernel of a homomorphism, 704             Rule of Contradiction, 76, 78
Relatively prime integers, 232-234, 236,              matrix rings, 674, 705                    Rule of the Destructive Dilemma, 78
   240, 394, 470                                      multiplicative identity, 675              Rule of Detachment, 70, 71, 78, 108
Relatively prime polynomials, 808                     multiplicative inverse, 677               Rule of Disjunctive Amplification, 78,
Relativity theory, 707                                proper divisors of zero, 675                 137
Remainder, 223-226, 234, 274, 276, 686,               ring of matrices, 674, 705                Rule of Disjunctive Syllogism, 75, 78
   689, 693, 804, 805, 810                            ring of polynomials, 799, 801             Rule of Existential Generalization, 117
Remainder Theorem, 804, 805                           ring properties, 679-684                  Rule of Existential Specification, 117
Rencontre, 403                                        ring with unity, 675                      Rule of Product, 4-7, 11, 14-19, 28, 29,
Rényi, A., 587                                        subring, 682-684                             34, 125, 142, 197, 239, 248, 255, 256,
Repeated real roots, 467                              subtraction, 680                             261, 274, 339, 341, 342, 403, 567
Repetition, 7, 26, 27, 41, 125, 149                   unit, 677                                 Rule of Sum, 3-5, 16, 19, 34, 125, 132,
Replacement, 92, 96, 106, 114, 115, 124               unity, 675                                   148, 262, 264, 274
Replication number, 825                               Z_n, 686                                  Rule of Universal Generalization,
Representation Theorem for a finite                 Ring with unity, 675, 801                      110-114, 126
   Boolean algebra, 738-741, 743                    Ringel, Gerhard, 573, 576                   Rule of Universal Specification,
Resek, Diane, 119, 120                              Rings of Saturn, 42                            106-113, 126
Reset, 321                                          Rinnooy Kan, A. H. G., 562, 575, 576        Rules of Inference, 70-78, 83, 84, 86,
Residue arithmetic, 704                             Riordan, John, 412, 444, 445                   107-109, 112, 113, 117, 119
Resolution, 86, 119                                 Rise/fall permutation, 495, 496                Law of the syllogism, 72, 73, 78
Resolvent, 86                                       Rivest, Ronald L., 504, 507, 624, 625,         Modus ponens, 70, 73, 75, 78
Restriction of a function, 257                        638, 643, 654, 668, 759                      Modus tollens, 73, 76, 78
Retransmission, 765, 769                            Roberts, Fred S., 42, 574, 576                 Proof by (the method of)
Reversal function, 318                              Robertson, N., 575, 576                           contradiction, 76, 80, 84
Reversal of a string, 317, 319                      Robinson, J. A., 86                            Reductio ad Absurdum, 76

   Rule for proof by cases, 78               Self-complementary graph, 529, 576             Seven Bridges of Königsberg, 378, 513,
   Rule of conditional proof, 78             Self-dual, 735                                    518, 533-535, 573
   Rule of conjunction, 75, 78               Self-dual Boolean function, 744                Seyffarth, Karen, 412
   Rule of conjunctive simplification, 78    Self-orthogonal Latin square, 820              Seymour, P. D., 575, 576
   Rule of detachment, 70, 78                Semicircles, 40                                Shamir, Adi, 759
   Rule of disjunctive amplification, 78     Semipath, 650-653                              Shannon, Claude Elwood, 741-743, 761,
   Rule of disjunctive syllogism, 75, 78     Sentences, 47, 48, 310                            795, 797
   Rule of the constructive dilemma, 78      Separation, 645                                Sherbert, Donald R., 506, 508
   Rule of the destructive dilemma, 78       Separation property, 646                       Shier, Douglas R., 623, 625, 668, 669
   Rule of universal generalization,         Sequence, 255                                  Shift, 690
       110-114, 126                          Sequence of pseudorandom numbers,              Shift cipher, 691
   Rule of universal specification,             689, 690                                    Shi-kie, Chu, 188
       106-113, 126                          Sequence recognizer, 326, 327, 332             Shimura, Goro, 706
   Table of rules of inference, 78           Sequential circuit, 309; see also Finite       Shmoys, D. B., 562, 575, 576
Rules for negating quantified statements,      state machine                                Shortest-Path algorithm, 667
   96                                        Serial binary adder, 323, 324                  Shrikhande, S. S., 819, 831
Run, 33, 34, 192, 482                        Series network, 65                             Shushu jiuzhang, 707
Running time, 452                            Seshu, S., 574, 576                            Siblings, 588, 593, 612
Russell, Lord Bertrand Arthur William,       Set braces, 123, 124                           Sichuan, 707
    119, 135, 187                            Set equality, 125, 126, 252, 314               Sieve method, 411
Russell’s paradox, 135, 186, 187             Set of all possible outcomes, 150              σ_X (standard deviation), 180, 182, 183
Rydell High School, 16                       Set of indices, 145                            σ_X^2 (variance), 180
Ryser, Herbert John, 42, 412, 668, 669,      Set theory, 87, 119, 123-129, 133-155,         Σ^0, Σ, Σ^n, 309
   831, 832                                     211, 247, 303, 304, 309, 311, 313           Σ^+, Σ^*, 310
                                                cardinality, 124, 186, A-23                 Sigma notation, 17, 239
s(m, n), 267                                    complement of a set, 138                    Σ notation, 17
S3, Sq, 750                                     countable set, 164, 303, A-24-A-32          Signals, 438, 439, 761
Sn, 787, 789, 794, 830                          denumerable set, 303, A-24                  Signed numbers, 681, 705
S(m, n), 263                                    disjoint sets, 137, 148                     Silvestri, Richard, 795, 797
S(x, k), 767                                    element, 123                                Simple group, 795
Saaty, Thomas L., 668                           element argument, 126, 137, 140, 144        Simultaneous solution of a system of
Sahni, Sartaj, 641, 642, 668, 669               empty set (∅), 127                             congruences, 702
Same cardinality (for sets), A-23              equality of sets, 125, 126                   Singleton subset, 128
Same likelihood, 150, 158                      finite set, 124, 125, 186, A-23, A-24        Sink
Same size (for sets), A-23                     generalized intersection of sets, 146           in a finite state machine, 331
Sample space, 150-155, 157-159,                generalized union of sets, 146                  in a transport network, 631
   161-164, 166-172, 175-181, 183,             infinite set, 124, 186, 189, A-23-A-26,      Sink state, 331
  262, 296, 402, 409, 428                         A-28, A-30                                Size of a set, 124, A-23
Samuel, Pierre, 707, 708, 831, 832             intersection of sets, 136, 138               Sloane, Neil James Alexander, 795, 797
Sanders, D. P., 575, 576                       intuitive definition of a set, 123           Smallest element, 194
Sandler, R., 831                               laws of set theory, 139, 144                 Smith, Henry John Stephen, 243
Saturated edge, 645, 649, 650                  member, 123                                  Snowflake curve, 475
Saturated hydrocarbons, 573, 581, 584          membership, 126                              Software development, 203
Saturn, 42                                     membership table, 143, 144, 146              Soifer, Alexander, 304, 305
Scalar product, A-12-A-14                      mutually disjoint sets, 137, 148             Solow, Daniel, 119, 120
Scattering function, 694, 708                  null set (∅), 127                            Solution of polynomial equations, 830
Schedule, 815, 825                             power set, 128                               Solvable group, 830
Scholtz, Robert A., 796                        Principle of Duality, 141                    Sorting, 506, 581, 605, 606, 608
Schröder, Ernst, 119, 377                      proper subset, 124-126                       Sorting technique, 450
Schröder numbers, 495                          relative complement, 138                     Source in a network, 631
Schwenk, Allen J., 628                         set braces, 123, 124                         Source of a directed edge, 349, 514
Scientific American, 575                       set of indices, 145                          Source program, 253, 302
Searching (algorithm), 501                     singleton subset, 128                        Space (blank), 310
Searching process, 295, 501                    size of a set, 124, A-23                     Space complexity function, 290
Second level of reachability, 338              subset, 124-128, 130-132, 138, 140,          Spanning forest, 582
Second-order homogeneous recurrence               141, 149                                  Spanning subgraph, 521, 582, 640
   relations, 456-468                          superset, 138                                Spanning tree, 582, 596, 597, 599, 631,
Security, 693                                  symmetric difference, 136                      638, 640
Seed, 689                                      uncountable set, 303, A-28                   Specification statement, 369
Sefer Yetzirah (The Book of Creation), 41      union of sets, 136, 138                      Spencer, Joel H., 305
Selection, 14-16, 19-22, 26; see also          universe, 123-128                            Sphere S(x, k), 767; see also Algebraic
   Combination                                  universe of discourse, 123-128                coding theory
Selection structure, 51                         Venn diagram, 141-144, 146, 148, 155        Spine (of a caterpillar), 627, 628
Selection with repetition, 26-29, 32, 415,      well-ordered set, 194                       Split input (for a gating network), 720
   423, 485                                  Set theory of strings, 309                     Spokes (in the wheel graph), 519, 520
Square matrix, A-11                            length of a string, 18, 310-312           Sun (R) Microsystems, Inc., 5
Square of a graph, 626                          palindrome, 319                          Superimposed, 815
Stabilizer, 785                                 powers of strings, 312                   Superset, 138
Stack, 490-492, 507, 605                        prefix, 312, 313, 315                    Suppes, Patrick C., 189
Staircase paths, 9, 130, 132                    proper prefix, 312, 315                  Surjective function, 260
Stanat, Donald F., 333, 334, 378, 507,          proper substring, 313                    Switches (in a network), 64-66, 551
   508                                          proper suffix, 312                       Switches in series, 65
Standard deviation, 180, 182, 183               reversal, 317, 319                       Switching circuits, 742
Standard form of a Latin square, 816, 817       substring, 313, 315, 328, 338            Switching function, 711, 712, 719, 742
Stanley, Richard Peter, 444, 507, 508           suffix, 312, 313, 315                    Switching network, 64-66
Star of David, 475                           Strongly connected component, 352           Sylow, Ludwig, 795
Starting state, 320, 324, 329                Strongly connected directed graph, 351,     Sylvester, James Joseph, 411, A-11
State diagram, 321, 324, 327                    539                                      Symbolic logic, 118
State table, 321, 322, 324, 331, 372-375     Strongly connected machine, 331             Symbolic Logic, 188
State transition, 320, 321                   Structured programming, 203                 Symmetric Boolean function, 744
Statement, 47-121, 127, 140, 311             Subboard, 404, 405, 408, 409                Symmetric difference, 136, 313
  compound statement, 48, 49                 Subfield, 809, 811, 812                     Symmetric group (S_n), 787, 789, 794,
  contradiction, 53, 58                      Subgraph, 521, 523, 525, 582, 588              830
  contrapositive, 62, 63, 92-94              Subgraph induced by a set of vertices,      Symmetric property (of a relation),
  converse, 62, 63, 82, 99                      522                                        339-343, 347, 348, 353, 366-369,
  definitions, 52, 87, 98, 103-105, 113      Subgroup, 748, 749, 756-758                   376, 377
  dual (of a) statement, 59, 62              Subgroup generated by a group element,      Syndrome, 771, 775-777, 779; see also
  if-then decision structure, 51                754                                        Algebraic coding theory
  if-then-else decision structure, 51        Sublist, 450, 606, 607                      Syndrome decoding, 779
  inverse, 62, 63, 82, 92-94, 99             Submachine, 331, 682                        System of congruences, 702, 707, 708
  logical equivalence, 56-61, 83, 95         Subring, 682-684, 699, 702                  System of distinct representatives, 663,
  logical implication, 69-73, 75, 89, 91,    Subsections of strings, 312                   668
     95                                      Subsequence, A-26                           System of linear equations, A-18, A-19
  logically equivalent statements, 56, 58,   Subset, 124-128, 130-132, 138, 140,         System of recurrence relations, 486, 487
     61-64, 74, 91, 97, 98                      141, 149                                 Systematic form, 778; see also Algebraic
  logically implies, 69, 92                  Subset relation, 250, 358, 359, 362, 363,      coding theory
  negation, 48                                  737                                      Szekeres, George, 276
  negation of quantified statements, 92,     Subsets with no consecutive integers, 457
     96, 97                                  Substitution rules (in logic), 60-62, 69,   t_n (the nth triangular number), 198
  open statement, 86, 87, 89-92, 105,           71, 72, 76, 80                           T_0 (tautology), 53
     106, 109, 123, 126, 194, 195            Substring, 313, 315, 328, 338               Θ, 294
  primitive statement, 48                    Subtraction, 137, 224, 225, 227, 228, 356   Table for a relational data base, 271
  quantified statement, 87                   Subtraction (in a ring), 680                Table for decoding, 774-776; see also
  tautology, 53, 58, 59, 69                  Subtree, 488, 583, 588, 590, 593-596,          Algebraic coding theory
  theorem, 105, 106, 110, 112, 119              602                                      Table of Big-Oh forms, 293
  truth tables, 49                           Succ (successor) function, 307              Table of identities for generating
Statistics, 3, 33, 175, 188, 815             Success, 161, 178, 179, 182                   functions, 424
Stein, Clifford, 504, 507, 624, 625, 638,    Successor, 243, 307                         Table of particular solutions for the
   643, 654, 668                             Such that, 124                                 method of undetermined coefficients,
Steiner triple system, 829                   Sufficient condition, 48                       479
Steinhaus, Hugo Dynoizy, 506, 508            Suffix, 312, 313, 315, 338                  Table of rules for negating statements
Stern, R. G., 562, 575, 576                  Suffix function, 318                           with one quantifier, 96
Stifel, Michel, 42                           Sum of atoms, 738                           Table of rules of inference, 78
Stillwell, John, 794, 797, 831, 832          Sum of bits, 720, 721                       Table of Stirling numbers of the second
Stinson, Douglas R., 831                     Sum of Boolean functions, 712                  kind, 264
Stirling, James, 303                         Sum of a geometric series, 476              Tabular form, 71
Stirling numbers of the first kind, 267      Sum of matrices, A-12                       Tabulation algorithm, 742
Stirling numbers of the second kind, 29,     Sum of minterms, 717                        Tallahassee, 17
   260, 263-265, 303, 304, 370, 508, 587     Sum of squares, 200                         Taniyama, Yutaka, 706
Stirling’s formula, 304                      Sum of the weights of the edges, 631        Tarry, G., 819
Stoll, Robert R., 119, 120                   Summation, 17, 18                           Tartaglia, Niccolo, 188
Storage circuits, 5                            index, 17                                 Taubes, G., 795, 797
Strang, Gilbert, A-21                          lower limit, 17                           Tautology, 53, 58-61, 67, 69, 71, 76, 113
Street, Anne Penfold, 796, 797, 831, 832       upper limit, 17                           Taylor, Richard, 706
String, 12, 18, 19, 128, 129, 309-323,       Summation formulas, 32, 33, 35, 47, 196,    Telephone communication system, 320
   328, 337, 338, 609, 610, 761                197, 200, 259, 430, 441, 470              Terminal (in a switching network), 64
  concatenation, 311, 312                    Summation notation, 17, 18                  Terminal vertex, 588
  empty string, 310, 323                     Summation operator, 440, 441                Terminals, 552
  equality of strings, 311                   Summations, 292                             Terminating vertex, 349, 514
  λ (the empty string), 310, 323             Sumo wrestlers, 277                         Terminus, 349, 514
Ternary operation, 306                        backtrack, 653, 656                       complement of a subgraph, 586
Ternary strings, 469                          backward edge, 650, 651, 654              complete binary tree, 589, 595, 596,
Tetrahedron, 547, 548, 792                    capacity, 644                                600, 605
Thatcher, Margaret, 74                        capacity for a vertex, 657                complete binary tree for a set of
Theorem, 53, 67, 70, 84, 87, 98, 99, 105,     capacity of a cut, 646, 665                  weights, 612
   106, 110, 112, 113, 117, 119, 193, 222     capacity of a cut, 646, 665                  weights, 612
Théorie Analytique des Probabilités, 443      c(P, P̄), 646, 648, 652, 654               complete ternary tree, 603
Theory of equations, 411                      chain, 650                                decision tree, 602, 603
Theory of graphs, see Graph theory            conservation condition, 645, 651          definition, 581
Theory of groups, see Group theory            cut, 645-648, 652, 661, 662               depth-first search, 597, 598, 600, 617,
Theory of languages, 18, 332, 337             definition, 644                              624
Theory of matrices, 411                       Edmonds-Karp algorithm, 653-657           depth-first search algorithm, 597, 598,
Theory of numbers, see Number theory          f-augmenting path, 650-654, 656, 663         617
Theory of rings, see Ring theory              flow in a network, 644-654                depth-first spanning tree, 615-620
Theory of sets, see Set theory                Ford-Fulkerson algorithm, 654-657         descendants, 588, 616-619
Theory of types, 187                          forward edge, 650, 651, 654, 655          dfi(v), 616, 619-621
∃x, 88                                        Max-Flow Min-Cut Theorem, 649,            dfi(v), 616, 619-621
There exists an x such that, 88                   652                                   dictionary order, 589
Therefore (∴), 71                             maximal flow, 645, 647                    directed tree, 587
Third Reich, 333                              network, 644                              forest, 581, 639, 641, 642
Third-order linear homogeneous                quasi-path, 650                           full binary tree, 611
   recurrence relations with constant          saturated edge, 645, 649, 650            full m-ary tree, 614
   coefficients, 463, 464                      semipath, 650-653                        graceful tree, 627
Thomas, R., 575, 576                           sink, 644-646, 648, 653                  grandparent, 593
Thompson, John, 795                            source, 644-646, 648, 653                height, 601-603, 611
Tile, 470                                      tolerance, 650                           Huffman tree, 613, 614
Tiling, 464                                   unsaturated edge, 645                     inorder (traversal), 594, 595
Time complexity function, 290, 297-299,        usable edge, 653, 655, 656               internal vertices, 588, 591, 593, 601,
  450, 452, 496, 498, 500, 501,                val( f ), 645-648                           612
  605-609, 624; see also Computational         value of a flow, 645-649, 651-653        Kruskal’s algorithm, 639-641
  complexity                                Transpose of a matrix, 348                  labeled complete binary tree, 610
Time complexity function for the bubble     Transposition of a Ferrers graph, 435       labeled tree, 586
   sort, 450-452                            Trappe, Wade, 693, 708, 795, 797            leaf, 588
Time complexity function for the merge      Traveling Salesman Problem, 562, 574        left child, 590, 594, 610, 611
   sort, 607-609                            Treatise on Algebra, 186                    left subtree, 590, 592, 594-596
Top (of a stack), 490                       Tree, 250, 488, 489, 573, 581-629, 641,     level, 588, 589, 593, 597, 607, 611
Top-down approach, 41                          642, 653, 655, 656, 796; see also        level number, 588, 601, 602, 612
Topological sorting, 359, 377                  Graph theory                             lexicographic order, 589
Topological sorting algorithm, 360, 361,      algorithm for articulation points, 619,   m-ary tree, 600
   363                                           620                                    merge sort algorithm, 496, 608
Total order, 359-361, 377                     algorithm for constructing a Huffman      minimal spanning tree, 639, 667, 668
Totally ordered poset, 359                       tree, 613                              null child, 594, 595
Tournament, 559, 602                          algorithm for counting labeled trees,     optimal spanning tree, 638, 639, 642
Towers of Hanoi, 472, 505                        586, 587                               optimal tree, 612, 613, 640-642
Tolerance, 650                                algorithm for the universal address       order for the vertices of a tree, 588,
Trail, 516, 517, 528                             system, 589                               589, 592-595
Transfer sequence, 331                        ancestors, 588, 616-619                   ordered binary tree, 488
Transfinite cardinal number, 303              articulation point, 615-621, 624          parent, 588, 593, 597, 613, 619-621
Transform, 253                                back edge, 616-619, 621                   parent, 588, 593, 597, 613, 619-621
Transformation, 36, 37                        backtrack(ing), 593, 596-598, 600,        pendant vertex, 583, 584
Transient state, 330                             616                                    postorder (traversal), 592-595
Transition, 320                               balanced tree, 601, 602                   prefix code, 609, 611, 613, 614, 624
Transition sequence, 331                      biconnected component, 615,               preorder (traversal), 592-596
Transition state, 321                            619-621, 624                           Prim’s algorithm, 641-643, 653
Transition table, 321                         binary rooted tree, 589, 590, 594, 595    quick sort, 609
Transitive property (of a relation),          binary tree, 488, 595, 600                right child, 590, 594, 610, 611
   339-343, 347, 348, 353, 357, 358,          branch nodes, 588                         right subtree, 592, 594-596, 614
   366-368, 376, 377                          branches, 488, 614                        root, 587-590
Transmission errors, 762, 767                 breadth-first search, 598-600             rooted Fibonacci tree, 626
Transmission of digital signals, 188          breadth-first search algorithm, 598,      rooted tree, 587-596, 600, 601
Transmitter, 767, 769                            599                                    sibling, 588, 593, 612
Transport network, 324, 644-658,              breadth-first spanning tree, 599          sorting, 581, 605, 606, 608
   660-663, 665, 667, 668                     caterpillar, 627, 628                     spanning forest, 582
   a-z cut, 645                               characteristic sequence, 625              spanning tree, 582, 596, 597, 599, 631,
   associated undirected graph, 645, 650      child, 588, 590, 594, 598, 617-620          638, 640
  spine (of a caterpillar), 627, 628         Uniqueness of inverses                       Vertices of a graph, 349, 514
  subtrees, 583, 588, 590, 593-596, 602          for a group, 747                              adjacent vertices, 349, 514
  terminal vertex, 588                           for a ring, 680, 681                         isolated vertex, 349, 514
  universal address system, 589               Unit circle, A-28                                origin, 349, 514
  W(T), 612                                   Unit delay machine, see One-unit delay           source, 349, 514
  weight of a tree, 612                          machine                                       terminating vertex, 349, 514
  weights for an optimal tree, 612            Unit in a ring, 677, 681, 689, 700               terminus, 349, 514
Tree diagram, 154, 157, 248-250, 331,         Unit-interval graph, 520                      Video-display terminal, 155
   488                                        Unity in a ring, 675, 681, 700                Von Dyck, Walther Franz Anton, 794
Tree traversal, 594                           Unity of a Boolean algebra, 733, 739          Von Ettinghausen, Andreas, 42
Tremblay, Jean-Paul, 704, 708                 Universal address system, 589                 Von Koch, Helge, 475
Trend, 33                                     Universal generalization, 110, 111            Von Neumann, John, 689
Trial, 179                                    Universal quantifier, 87, 88, 90, 96, 98,     Von Staudt, Karl, 622
Triangle inequality, 767                         124                                        Vorlesungen über die Algebra der Logik,
Triangular number, 193, 198, 482, 572         Universally quantified statement, 107,           119
Triangulation (of a convex polygon), 494         108, 110-112
Trigonometric series, 303                     Universal set, 523                            W_n, 520, 572
Triple, 248                                   Universal specification, 106, 111             W(T), 612
Triple repetition code, 765, 768, 769; see    Universe, 87, 90-92, 106, 123-128, 138,       Wakerly, John F., 742, 743
   also Algebraic coding theory                  139, 149, 161                              Walk, 515, 516
Triple system, 829                            Universe of discourse, 87, 124                Walker, Elbert A., 707, 708
Trivial subgroup, 748                         Unsaturated edge, 645                         Wallis, W. D., 796, 797, 831, 832
Trivial walk, 515                             Unspecified outputs, 731                      Walser, Hans, 506, 508
Trotter, H. F., 506, 508                      Upper bound, 363                              Wand, Mitchell, 244
Trunc (truncation) function, 254              Upper limit                                   Washington, Lawrence C., 693, 708, 795,
Truth tables, 49, 52, 55-59, 62, 70, 143         in product notation, 239                      797
Truth value, 48, 49, 69, 82                      of a summation, 17                         Weaver, W., 797
                                              Uranium, 486                                  Weight of an edge, 631, 638, 644
T-shaped figure, 121                          U.S. Navy, 357, 377                           Weight of a string, 18
Tucker, Alan, 42, 43, 412, 444, 445, 796,     U.S.S. Constitution, 623                      Weight of a tree, 612
   797                                        Usable edge, 653, 655, 656                    Weight of x (in coding theory), 766; see
Turing, Alan Mathison, 333                    User-interface, 155                              also Algebraic coding theory
Turing machine, 333                           Utility graph, 542                            Weighted directed graph, 644
Tutte, W. T., 573                                                                           Weighted graph, 631-634, 636, 637,
Two-byte address, 5                                                                            640-642, 667
Two-dimensional array, 101                     (v, b, r, k, λ)-design, 825, 826, 831        Weights (for an optimal tree), 611, 612
Two-dimensional motions, 749                   Vajda, S., 506, 508                          Well-defined binary operation, 687
2-isomorphic graphs, 555                      Val( f), 645-649, 651-653, 656                Well-defined (in set theory), 123
2-methyl propane, 584                          Valid argument, 47, 53, 67-71, 111; see      Well-formed formulae, 220
Two-state device, 711                             also Proof                                Well-ordered set, 194
Two-unit delay machine, 329                   Validity of an argument, 70, 71, 73, 76,      Well-Ordering Principle, 193, 194, 222,
Two-valued logic, 711                            77, 79-83, 99, 103, 109, 112                  223, 231, 236
Two’s complement method, 227, 228,            Value of a flow, 645-649                      West, Douglas B., 543, 573, 574, 576
  230                                         Van Gelder, Allen, 305, 624, 625, 641,        Weston, J. Harley, 412
Tymoczko, Thomas, 575, 576                       642, 667, 668                              What the Tortoise Said to Achilles, 119
                                              Van Slyke, R., 642, 668, 669                  Wheel graph, 519, 520, 572
Ullman, Jeffrey David, 333, 334, 378,          Var(X), 180-184                              Wheel of fortune, 196
   506, 507, 574, 575, 623, 624, 642,          Variable, 86-88                              Wheel with n spokes, 520, 572
   667, 668, 708                                 bound, 88                                  Whitehead, Alfred North, 119, 187
UltraSPARC processor, 5                          free, 88                                   Whitney, Hassler, 573
Unary operation, 138, 267, 268, 733           Variable coefficient, 452, 487                Whitworth, William Allen, 42, 43, 411,
Uncountable set, 164, 303, A-28, A-29,        Variance, 177, 180                               412
  A-32                                        Varieties, 825-827                            Wilder, Raymond L., 119, 120, 304, 305
Undirected edge, 351                          VDT, 155                                      Wiles, Andrew John, 705, 706
Undirected graph, 350-352, 396, 480,          Veblen, O., 831                               Wilf, Herbert S., 444, 445
  488, 514, 515, 615-619, 639-642,            Vector space, 624                             Wilson, John, 752
   699, 730                                    Vectors, 694                                 Wilson, Robin J., 574, 575
Uniform discrete random variable, 185,        Veitch, E. W., 742, 743                       Wilson’s Theorem, 752, 798
   209                                        Velleman, Daniel J., 304, 305                 Wimbledon, 249, 601
Union Construct (in C++), 369                 Venn, John, 141, 188                          Without replacement, 15
Union of graphs, 570                          Venn diagram, 141-144, 146, 148, 155,         Woltz, Jack, 186
Union of sets, 136, 138, 213, 248, A-29,         161, 168, 169, 188, 385, 386, 393,         Wood, Derick, 333, 334
  A-31                                           398, 411                                   Word, 310
Uniqueness of complements (inverses)          Vertex degree, 530                            World War II, 333
  for a Boolean algebra, 736                  Vertex set, 349, 514                          Worst-case complexity, 295, 296
Worst-case time-complexity function,       wt (x) (for a string x), 18          Zero element of a Boolean algebra, 733,
   503, 605-609, 636, 637, 640-642,        Wyman, M., 493, 507                     738, 739
   654, 668                                                                     Zero element of a ring, 674, 679, 699,
Wrapped around, 536, 724                   Xenocrates (of Chalcedon), 41           701
Wright, Charles R. B., 119, 120                                                 Zero-one matrix, 247, 344, 345, 347,
Wright, Edward Maitland, 244, 412          Youse, Bevan K., 244                    348, 352
wt(a, b), 631                                                                   (0, 1)-matrix, 345, 347, 348, 352, 378;
wt(e) (for an edge e), 631, 632, 638,      Z, Z*, 133, 134                         see also Zero-one matrix
  639, 641                                 Z_n, 134, 686                        Zero polynomial, 802
wt (x) (in coding theory), 766; see also   Zariski, Oscar, 707, 708, 831, 832   Zuckerman, Herbert Samuel, 243, 244,
   Algebraic coding theory                 Zero element, A-13                      444, 445, 708
FORMULAS
           $n!$                                             n factorial: $0! = 1$; $n! = n(n-1)\cdots(3)(2)(1)$, $n \in \mathbf{Z}^+$
           $P(n, r)$                                        the number of permutations of n objects taken r at a time,
                                                               $0 \le r \le n$. [$P(n, r) = n!/(n-r)!$]
           $C(n, r) = \binom{n}{r}$                         the number of combinations or selections of n objects taken
                                                               r at a time, $0 \le r \le n$. [$C(n, r) = n!/[r!(n-r)!]$]
           $\binom{n+r-1}{r}$                               the number of combinations or selections of n objects taken
                                                               r at a time, with repetitions allowed ($r \ge 0$)
           The Binomial Theorem:                            $(x+y)^n = \binom{n}{0}x^0y^n + \binom{n}{1}x^1y^{n-1} + \cdots + \binom{n}{n}x^ny^0$
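
The counting formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are hypothetical and chosen only for illustration, and the repetition count assumes n >= 1.

```python
from math import comb, factorial


def permutations_count(n: int, r: int) -> int:
    """P(n, r) = n! / (n - r)!, for 0 <= r <= n."""
    return factorial(n) // factorial(n - r)


def combinations_count(n: int, r: int) -> int:
    """C(n, r) = n! / [r! (n - r)!], for 0 <= r <= n."""
    return factorial(n) // (factorial(r) * factorial(n - r))


def combinations_with_repetition(n: int, r: int) -> int:
    """C(n + r - 1, r): selections of size r from n objects, repetition allowed.

    Assumes n >= 1 and r >= 0 (an assumption of this sketch).
    """
    return comb(n + r - 1, r)


def binomial_expansion(x: int, y: int, n: int) -> int:
    """Right-hand side of the binomial theorem: sum of C(n, k) x^k y^(n-k)."""
    return sum(comb(n, k) * x**k * y**(n - k) for k in range(n + 1))
```

For instance, `binomial_expansion(2, 3, 4)` should agree with `(2 + 3) ** 4`, since both sides of the theorem describe the same number.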

           $\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$, $n \ge r \ge 1$
           $S(m, n) = \frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{n-k} (n-k)^m$, a Stirling number of the second kind.
           S(m, n) is the number of ways to distribute m distinct objects among n identical
           containers with no container left empty.
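
As an illustration of the inclusion-exclusion sum for S(m, n), here is a short Python sketch. The name `stirling2` is an assumption made for this example and does not come from the text.

```python
from math import comb, factorial


def stirling2(m: int, n: int) -> int:
    """Stirling number of the second kind via
    S(m, n) = (1/n!) * sum_{k=0}^{n} (-1)^k C(n, n-k) (n-k)^m.

    Counts distributions of m distinct objects among n identical
    containers with no container left empty.
    """
    total = sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))
    # The alternating sum counts onto functions, so it is divisible by n!.
    return total // factorial(n)
```

For example, `stirling2(4, 2)` evaluates the sum 16 - 2 + 0 = 14 and divides by 2!, giving 7.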

Gs),                                            rent
            $f(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots$: f(x) is the (ordinary) generating function
           for the sequence $a_0, a_1, a_2, a_3, \ldots$

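For $a \in \mathbf{R}$, $m, n \in \mathbf{Z}^+$:

           $(1+x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$
           $(1+ax)^n = \binom{n}{0} + \binom{n}{1}ax + \binom{n}{2}a^2x^2 + \cdots + \binom{n}{n}a^nx^n$
           $(1+x^m)^n = \binom{n}{0} + \binom{n}{1}x^m + \binom{n}{2}x^{2m} + \cdots + \binom{n}{n}x^{nm}$
           $(1-x^{n+1})/(1-x) = 1 + x + x^2 + \cdots + x^n$
           $1/(1-x) = 1 + x + x^2 + x^3 + \cdots = \sum_{i=0}^{\infty} x^i$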

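           $1/(1-x)^n = \binom{-n}{0} + \binom{-n}{1}(-x) + \binom{-n}{2}(-x)^2 + \cdots$
                      $= \sum_{i=0}^{\infty} \binom{-n}{i}(-x)^i = \sum_{i=0}^{\infty} \binom{n+i-1}{i}x^i$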
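
The coefficient of x^i in the series for 1/(1-x)^n is C(n+i-1, i). A rough Python sketch, with a hypothetical function name, builds the first few coefficients by multiplying by 1/(1-x) a total of n times so they can be compared with that closed form.

```python
from math import comb


def geometric_power_coeffs(n: int, terms: int) -> list[int]:
    """Coefficients of x^0 .. x^(terms-1) in the series for 1/(1-x)^n.

    Multiplying a power series by 1/(1-x) = 1 + x + x^2 + ... replaces
    each coefficient by the running sum of the coefficients up to it,
    so we start from the series for 1 and take prefix sums n times.
    """
    coeffs = [1] + [0] * (terms - 1)
    for _ in range(n):
        running = 0
        for i in range(terms):
            running += coeffs[i]
            coeffs[i] = running
    return coeffs


def closed_form_coeffs(n: int, terms: int) -> list[int]:
    """The same coefficients from the closed form C(n+i-1, i)."""
    return [comb(n + i - 1, i) for i in range(terms)]
```

With n = 3 and five terms, the prefix sums run [1, 1, 1, 1, 1], then [1, 2, 3, 4, 5], then [1, 3, 6, 10, 15], which matches the closed form.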

           $g(x) = a_0 + a_1(x/1!) + a_2(x^2/2!) + a_3(x^3/3!) + \cdots$: g(x) is the exponential
           generating function for the sequence $a_0, a_1, a_2, a_3, \ldots$
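           $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
           $\frac{1}{2}(e^x + e^{-x}) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots$          $\frac{1}{2}(e^x - e^{-x}) = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots$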

           $F_n$, $n \ge 0$                                 the n-th Fibonacci number: $F_0 = 0$, $F_1 = 1$; and
                                                               $F_n = F_{n-1} + F_{n-2}$, $n \ge 2$
           $b_n$, $n \ge 0$                                 the n-th Catalan number: $b_n = \frac{1}{n+1}\binom{2n}{n}$, $n \ge 0$
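
The two sequences above are easy to tabulate. This is a small Python sketch under the definitions just given; the function names are illustrative and not taken from the text.

```python
from math import comb


def fibonacci(n: int) -> int:
    """F_0 = 0, F_1 = 1, and F_n = F_(n-1) + F_(n-2) for n >= 2."""
    a, b = 0, 1  # (F_0, F_1)
    for _ in range(n):
        a, b = b, a + b  # slide the window one index forward
    return a


def catalan(n: int) -> int:
    """b_n = C(2n, n) / (n + 1) for n >= 0; the division is always exact."""
    return comb(2 * n, n) // (n + 1)
```

Calling `[catalan(n) for n in range(6)]` gives the first six Catalan numbers, and `fibonacci(10)` gives the tenth Fibonacci number under the indexing F_0 = 0.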
NOTATION
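SPECIAL SETS OF   $\mathbf{Z}$                      the set of integers: $\{0, 1, -1, 2, -2, 3, -3, \ldots\}$
       NUMBERS    $\mathbf{N}$                      the set of nonnegative integers or natural numbers:
                                                       $\{0, 1, 2, 3, \ldots\}$
                  $\mathbf{Z}^+$                    the set of positive integers: $\{1, 2, 3, \ldots\} = \{x \in \mathbf{Z} \mid x > 0\}$
                  $\mathbf{Q}$                      the set of rational numbers: $\{a/b \mid a, b \in \mathbf{Z}, b \ne 0\}$
                  $\mathbf{Q}^+$                    the set of positive rational numbers
                  $\mathbf{Q}^*$                    the set of nonzero rational numbers
                  $\mathbf{R}$                      the set of real numbers
                  $\mathbf{R}^+$                    the set of positive real numbers
                  $\mathbf{R}^*$                    the set of nonzero real numbers
                  $\mathbf{C}$                      the set of complex numbers: $\{x + yi \mid x, y \in \mathbf{R}, i^2 = -1\}$
                  $\mathbf{C}^*$                    the set of nonzero complex numbers
                  $\mathbf{Z}_n$                    $\{0, 1, 2, \ldots, n-1\}$, for $n \in \mathbf{Z}^+$
                  $[a, b]$                          the closed interval from a to b: $\{x \in \mathbf{R} \mid a \le x \le b\}$
                  $(a, b)$                          the open interval from a to b: $\{x \in \mathbf{R} \mid a < x < b\}$
                  $[a, b)$                          a half-open interval from a to b: $\{x \in \mathbf{R} \mid a \le x < b\}$
                  $(a, b]$                          a half-open interval from a to b: $\{x \in \mathbf{R} \mid a < x \le b\}$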

ALGEBRAIC         $(R, +, \cdot)$                   R is a ring with binary operations $+$ and $\cdot$
   STRUCTURES     $R[x]$                            the ring of polynomials over ring R
                  $\deg f(x)$                       the degree of the polynomial f(x)
                  $(G, \circ)$                      G is a group under the binary operation $\circ$
                  $S_n$                             the symmetric group on n symbols
                  $aH$                              a left coset of subgroup H (in group G): $\{ah \mid h \in H\}$
                  $(\mathcal{B}, +, \cdot, \bar{\ }, 0, 1)$   the Boolean algebra $\mathcal{B}$ with binary operations $+$ and $\cdot$, the unary
                                                       operation $\bar{\ }$ (complement), and identity elements 0 (for $+$) and 1 (for $\cdot$)

GRAPH THEORY      G = (V, E)                        G is a graph with vertex set V and edge set E
                  $K_n$                             the complete graph on n vertices
                  $\overline{G}$                    the complement of graph G
                  deg(v)                            the degree of vertex v (in an undirected graph G)
                  od(v)                             the out degree of vertex v (in a directed graph G)
                  id(v)                             the in degree of vertex v (in a directed graph G)
                  $\kappa(G)$                       the number of connected components of graph G
                  $Q_n$                             the n-dimensional hypercube: the n-cube
                  $K_{m,n}$                         the complete bipartite graph on $V = V_1 \cup V_2$ where
                                                       $V_1 \cap V_2 = \emptyset$, $|V_1| = m$, $|V_2| = n$
                  $\beta(G)$                        the independence number of G
                  $\chi(G)$                         the chromatic number of G
                  $P(G, \lambda)$                   the chromatic polynomial of G
                  $\gamma(G)$                       the domination number of G
                  L(G)                              the line graph of G
                  T =(V, E)                         T is a tree with vertex set V and edge set E
                  N =(V, E)                         N is a (transport) network with vertex set V and edge set E
